Re: [AMBER] sander.MPI

From: Fabian Glaser <fglaser.technion.ac.il>
Date: Tue, 18 Dec 2012 11:20:57 +0200

Hi Jason,

The system people asked me to ask you about the error produced when using more than one node. Here is the error:

HYDU_create_process (./utils/launch/launch.c:94): execvp error on file 3SO6_clean_prod_3.rst (No such file or directory)

So does that mean one of the nodes cannot find the output file?

What am I doing wrong?
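
(Note: HYDU_create_process is the MPI launcher (MPICH's Hydra) reporting that it tried to execute 3SO6_clean_prod_3.rst as the program itself, and a restart file is not something the launcher should ever exec. That usually points at a mangled mpirun command line, e.g. the pmemd.MPI executable accidentally dropped while editing the script, or a multi-line command wrapped without trailing backslashes so that the restart file ended up in the executable position. A minimal sketch of a two-node PBS script with explicit line continuations; the prod_2/prod_3 file names are assumed from the pattern in this thread:

    #!/bin/bash
    #PBS -l select=2:ncpus=12:mpiprocs=12

    cd $PBS_O_WORKDIR

    # every wrapped line must end in a trailing backslash; if one is
    # missing, everything after the break is parsed as a new command
    mpirun -hostfile $PBS_NODEFILE pmemd.MPI -O \
        -i prod.in \
        -p 3SO6_clean.prmtop \
        -c 3SO6_clean_prod_2.rst \
        -o 3SO6_clean_prod_3.out \
        -x 3SO6_clean_prod_3.mdcrd \
        -r 3SO6_clean_prod_3.rst
)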

Thanks!

Fabian


_______________________________
Fabian Glaser, PhD
Bioinformatics Knowledge Unit,
The Lorry I. Lokey Interdisciplinary
Center for Life Sciences and Engineering

Technion - Israel Institute of Technology
Haifa 32000, ISRAEL
fglaser.technion.ac.il
Tel: +972 4 8293701
Fax: +972 4 8225153

On Dec 16, 2012, at 4:48 PM, Jason Swails wrote:

> On Sun, Dec 16, 2012 at 2:21 AM, Fabian Glaser <fglaser.technion.ac.il> wrote:
>
>> Hi,
>>
>> I am using the following PBS file to run sander
>>
>> #PBS -l select=1:ncpus=12:mpiprocs=12
>> ...
>> mpirun -hostfile $PBS_NODEFILE pmemd.MPI -O -i prod.in -p
>> 3SO6_clean.prmtop -c 3SO6_clean_prod_1.rst -o 3SO6_clean_prod_2.out -x
>> 3SO6_clean_prod_2.mdcrd -r 3SO6_clean_prod_2.rst
>>
>> This runs perfectly, at a rate of about 3.67 ns/day.
>>
>> But if I try to use more than one node, for example:
>> #PBS -l select=2:ncpus=12:mpiprocs=12
>>
>> The job does not seem to start, or at least the output files are not written.
>>
>> Is there a way to use more than one node? Or any way to accelerate the
>> process?
>>
>
> I have never had problems running on multiple nodes. Given the
> communication required on each step, though, unless your nodes have a fast
> interconnect (e.g., some type of InfiniBand), you will be better off just
> using 1 node if each node has 12 cores available, IMO.
>
> If you're really having a problem running on multiple nodes, the issue is
> probably somewhere in your system configuration or your MPI installation.
> Some systems may require you to set up password-less login between nodes
> using an ssh-key, since multi-node jobs need to send information between
> nodes. Since we just use the MPI API, the problem is highly unlikely to be
> in Amber itself.
>
> I would suggest contacting your system administrator for this cluster with
> the problems you're having. If you want to test inter-node MPI with a very
> simple program, try running something like this:
>
> mpiexec -hostfile $PBS_NODEFILE $AMBERHOME/test/numprocs
>
> This should just print the total number of processors you asked for to the
> PBS output file (#PBS -o <pbs_output>).
>
> Good luck,
> Jason
>
> --
> Jason M. Swails
> Quantum Theory Project,
> University of Florida
> Ph.D. Candidate
> 352-392-4032
> _______________________________________________
> AMBER mailing list
> AMBER.ambermd.org
> http://lists.ambermd.org/mailman/listinfo/amber
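
(On the password-less ssh login mentioned in the quoted message above: a minimal sketch, assuming the cluster shares your home directory across nodes; "node02" is a hypothetical compute node name:

    ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa          # key with an empty passphrase
    cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys   # authorize the key on all nodes
    chmod 600 ~/.ssh/authorized_keys
    ssh node02 hostname   # should print the node name without prompting for a password
)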


_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Tue Dec 18 2012 - 01:30:03 PST