RE: AMBER: REMD and mpiexec

From: Ross Walker <ross.rosswalker.co.uk>
Date: Wed, 6 Jun 2007 09:06:06 -0700

Hi Steve,

With the REMD runs, if you specify 8 groups then you need to specify 8
threads, or 16 threads, or 24 threads, etc. The number of MPI threads you
run must be a multiple of the number of groups you are requesting. This
doesn't necessarily mean that you need that many physical processors. For
example, you could run 8 threads on a single dual-core machine; on x86-type
chips this doesn't hurt you too much, i.e. you don't pay too high a price
for the time slicing and each thread still gets about 25% of full
performance. However, on some architectures, namely IA64 and Power4/5, this
really hurts, since the overhead for swapping threads is huge.
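
For example, with 8 replicas something along the following lines should
satisfy the requirement, since 16 is a multiple of 8 (the file names, the
path and the exact launcher invocation here are just placeholders for your
own setup, plus whatever REMD flags you are already using):

    mpirun -np 16 $AMBERHOME/exe/sander.MPI -ng 8 -groupfile remd.groupfile

whereas asking for, say, 12 processes would be rejected because 12 is not
divisible by 8.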

So, from the error message below, it would appear that the number of
threads you requested was not a multiple of the number of groups. If it
was, then please post a message back and we can try to debug it further.
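
If you want to catch this before sander complains, a quick sanity check in
the job script does the trick. Something like the following (just a rough
sketch; it assumes torque provides $PBS_NODEFILE in the usual way and that
you are running 8 groups):

    NPROCS=`wc -l < $PBS_NODEFILE`
    NGROUPS=8
    if [ `expr $NPROCS % $NGROUPS` -ne 0 ]; then
        echo "ERROR: $NPROCS processes is not a multiple of $NGROUPS groups"
        exit 1
    fi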

All the best
Ross

/\
\/
|\oss Walker

| HPC Consultant and Staff Scientist |
| San Diego Supercomputer Center |
| Tel: +1 858 822 0854 | EMail:- ross.rosswalker.co.uk |
| http://www.rosswalker.co.uk | PGP Key available on request |

Note: Electronic Mail is not secure, has no guarantee of delivery, may not
be read every day, and should not be used for urgent or sensitive issues.

> -----Original Message-----
> From: owner-amber.scripps.edu
> [mailto:owner-amber.scripps.edu] On Behalf Of Steve Young
> Sent: Tuesday, June 05, 2007 17:53
> To: amber.scripps.edu
> Subject: AMBER: REMD and mpiexec
>
> Hello,
> We have a Beowulf cluster that is running torque-2.0.0p7 (PBS),
> Red Hat Enterprise 4, Amber9 and mpich2-1.0.5. We've had a heck of a
> time with using mpiexec vs. mpirun when trying to run different
> aspects of sander.MPI.
>
> Some history: if we use mpirun (with or without the PBS queueing
> system), sander.MPI runs as expected. We get good output with
> no errors.
>
> However, there is one major issue: the nodes that PBS allocates are
> not always the nodes the job actually runs on. This is a problem,
> since mpich seems to handle node allocation through mpirun itself. In
> posting to the mpich-discuss list I found out I needed to use the
> mpiexec program shipped with the mpich distro; it also turned out I
> needed the version from OSC that works with torque. After installing
> OSC mpiexec, we ran some normal sander.MPI jobs and got the expected
> output. So now we are starting to test some Replica Exchange jobs
> that we've run on other clusters.
>
> Here is some of my post from the mpich-discuss listserv:
>
> <.... snip ...>
> Ok, so I got the OSC version of mpiexec. This appears to work very
> well running normal sander.MPI. Requesting 16 CPUs, we verify good
> output and near 100% utilization of all 16 processes. The next thing
> we want to use is another part of Amber called Replica Exchange,
> which is basically the same sander.MPI program run with different
> arguments. When I run this part of the program I end up with the
> following results:
>
> Error: specified more groups ( 8 ) than the number of processors ( 1 ) !
> [unset]: aborting job:
> application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0
> Error: specified more groups ( 8 ) than the number of processors ( 1 ) !
> [unset]: aborting job:
> application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0
> Error: specified more groups ( 8 ) than the number of processors ( 1 ) !
>
>
> Now I realize I should be posting to the Amber list, as this appears
> to be an Amber-related problem, and I would tend to believe that. But
> what I can't explain is why, when I switch back to the original
> mpirun, the program runs fine using the exact same files.
>
> <.... snip....>
>
>
> So does this mean that Amber9 isn't working properly with the OSC
> version of mpiexec? Which combinations of Amber and MPI work best on
> a torque Beowulf cluster? Thanks in advance for any advice.
>
> -Steve
>
>


-----------------------------------------------------------------------
The AMBER Mail Reflector
To post, send mail to amber.scripps.edu
To unsubscribe, send "unsubscribe amber" to majordomo.scripps.edu
Received on Sun Jun 10 2007 - 06:07:12 PDT