Re: Problem of running sander on Linux cluster

From: David Konerding <dek_at_cgl.ucsf.edu>
Date: Mon 19 Mar 2001 08:22:31 -0800

"chomsri upkaew" writes:
>Dear Ross Walker and netters,
>
>I can compile sander with parallel on Linux cluster as Ross's suggestion.
>However, I have some problem.
>
>My Linux cluster contains 12 boxes and 2 processors in each box (total 24
>processors). When I submitted my job on box2 (or other boxes which are not
>box1) with command
>
>% mpirun -np 2 $AMBERHOME/exe/sander -O -i input -o output -c ...
>
>I want to run 2 processes on the same box (box2). I found that 1 process was
>running on box2 and another process was running on box1. I don't know why.
>Could you please give me a suggestion?

This is really beyond the ken of the AMBER mailing list; I suggest that
you read the MPICH manual carefully to learn why your jobs run in
particular locations.

The short answer is that by default mpirun will use the machines listed
in the MPICH machines file. When you compiled MPICH you compiled it on
box1, and the machines file (usually located in
$MPICH_HOME/share/machines.LINUX) lists box1 several times. This is so
that when you compile and install you can quickly test on your local
machine whether things are working (and if you have multiple processors,
it will actually run "faster"). But when you started the job on box2,
the first process was started locally (i.e., on box2), and the second
was placed on box1 from the machines file. This behavior can be
overridden using the "-nolocal" flag to mpirun. If you had done
that, all the processes would be running on box1.

To achieve what you want (specifying which nodes the job should run on,
on a per-job basis), you want to create a machines file in your
running directory, and point mpirun at that instead of the default
machines file. Create a file called machines and fill it with:

box2
box2

Then run mpirun as:

% mpirun -nolocal -machinefile machines -np 2 $AMBERHOME/exe/sander -O -i input -o output -c ...

(The -nolocal flag is in there in case you're on box1 and you want the job processes to run on box2.)
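
If you end up doing this for different boxes, the machines file can be generated rather than typed by hand. The following is only a rough sketch: make_machinefile is a hypothetical helper name (not part of MPICH or AMBER), and it assumes the machines file format is simply one hostname per line, one line per process, as in the example above. The mpirun line is left commented out so the sketch runs even where MPICH is not installed.

```shell
#!/bin/sh
# Sketch: write a machines file that lists one host once per process.
# make_machinefile is a hypothetical helper, not an MPICH or AMBER command.
make_machinefile() {
    host=$1      # node the processes should run on, e.g. box2
    nprocs=$2    # number of processes (one line is written per process)
    file=$3      # path of the machines file to write
    : > "$file"  # create the file, or truncate it if it already exists
    i=0
    while [ "$i" -lt "$nprocs" ]; do
        echo "$host" >> "$file"
        i=$((i + 1))
    done
}

# Two processes on box2, written to ./machines
make_machinefile box2 2 machines

# Then launch as shown above:
# mpirun -nolocal -machinefile machines -np 2 $AMBERHOME/exe/sander -O -i input -o output -c ...
```

Keep the process count passed to the helper the same as the value given to -np, so that every process has its own line in the file.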

Dave
Received on Mon Mar 19 2001 - 08:22:31 PST