Re: [AMBER] sander.MPI running only on one node in parallel mode ?

From: Jason Swails <jason.swails.gmail.com>
Date: Sat, 30 Jan 2010 11:55:02 -0500

Hello,

On Sat, Jan 30, 2010 at 10:20 AM, MUHAMMAD IMTIAZ SHAFIQ
<imtiazshafiq.gmail.com> wrote:
> Dear All,
>
> I am submitting my sander.MPI job with qsub using the following script
>
> #!/bin/sh
> #$ -S /bin/bash
> #  This is file Run__amber
> #PBS -l nodes=4:ppn=4
> #PBS -V
> #PBS -N Imtiaz
> #PBS -l walltime=03:59:00
> cd $PBS_O_WORKDIR
>
> echo "simulation started at" `date`
>
> /opt/lam/7.1.3/bin/lamboot
>
> /opt/amber10/bin/sander.MPI -O -i heat.in -o heat.out -p SAMP.prmtop -c min.rst -r heat.rst -x heat.mdcrd -ref min.rst
>
> echo "simulation end at" `date`
>
>
> When I submit the job with qsub I get a job ID, and showq indicates that the job is running on 16 processors and 4 nodes.
>
> 414013                mis9    Running    16  03:55:33  Sat Jan 30 00:58:54
> 1 Active Job       16 of   36 Processors Active (44.44%)
>                               4 of    9 Nodes Active      (44.44%)
>
> When I noticed that this job was taking much longer than an average estimate, I used ssh to each node and ran the top command. I found sander.MPI running on only one node.
>
> What could be the potential reasons that, although the job is submitted to 4 nodes and showq shows it running on four nodes, it is actually taking an unnecessarily long time, and top on each individual node shows that sander.MPI is running on only one node? Do I need to change or set some environment variables in the job submission script?

This is actually an expected result. MPI implementations typically
require a launcher (such as mpirun or mpiexec) that starts a given
number of processes. Therefore, while your submission script knows
that it has 16 processors available to it, you've given sander.MPI no
way of knowing how many processes it should start, so it simply
starts one. You should check with your system admin exactly how to
launch a multi-process MPI program, but typical ways are:

mpirun -np 16 sander.MPI ... , or
mpiexec -n 16 sander.MPI ...

(In some cases mpiexec automatically uses the PBS machinefile, so it
already knows how many processes to start and on which nodes, and the
command is then simply "mpiexec sander.MPI ...".)

However, how exactly this is done is installation-dependent.
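
For example, since your script boots LAM/MPI, the run section might
look something like the sketch below. This is only a sketch: it
assumes LAM's mpirun sits alongside lamboot in /opt/lam/7.1.3/bin and
that your scheduler sets $PBS_NODEFILE to the list of assigned nodes,
so confirm both with your admin.

  # Boot LAM on the nodes PBS assigned to this job
  /opt/lam/7.1.3/bin/lamboot $PBS_NODEFILE

  # Start 16 MPI processes (4 nodes x 4 ppn) instead of just one
  /opt/lam/7.1.3/bin/mpirun -np 16 /opt/amber10/bin/sander.MPI -O \
      -i heat.in -o heat.out -p SAMP.prmtop -c min.rst -r heat.rst \
      -x heat.mdcrd -ref min.rst

  # Shut LAM down cleanly when the run finishes
  /opt/lam/7.1.3/bin/lamhalt

If that works, top on each node should then show sander.MPI processes
everywhere, not just on the first node.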

Good luck!
Jason

-- 
---------------------------------------
Jason M. Swails
Quantum Theory Project,
University of Florida
Ph.D. Graduate Student
352-392-4032
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Sat Jan 30 2010 - 09:00:04 PST