spookie wrote:
> well, thanx lubos..the link you gave was
> informative..but in the third thread as you mentioned,
> you said that the modification of code pertains to
> running jobs using PBS and mpiexec..but, i'm trying to
> run my parallel jobs using mpirun and without pbs..i'm
> actually trying to run the default benchmark given by
> the amber8 distro, using mpirun...my DO_PARALLEL
> variable looks like this..
>
> DO_PARALLEL="mpirun -np 2 -machinefile
> $MPICH_HOME/share/machines.LINUX"
you can use the same approach even without mpiexec and PBS... only the
switches will probably differ:
mpirun automatically adds some extra switches to the command line (this
is hidden from the user). in the MPI initialization routine (mpi_init(),
i'm not sure about the exact name), which is called at the very beginning
of the parallel simulation, all these added switches (and also -np,
-machinefile, ...) should be processed and removed from the argument
list => only those really intended for sander itself should remain and
be passed on for sander to process. in my case the problem was that not
all the mpi-related switches were removed, and the leftover parameters
caused sander to quit. i think you are in the same situation - so you
should just modify your sander so that it automatically skips all of
them...
you should be able to find out what these additional parameters mean and
whether they are bare switches or require a value after them (have a
look at the documentation of your mpi implementation). then you can
just modify the sander/mdfil.f file in the regions that are active when
MPI is defined...
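to make the idea concrete, here is a minimal sketch of the skipping logic, written in python rather than fortran and not taken from the amber sources. the function name strip_mpi_args and the switch "-hypothetical-mpi-flag" are made up for illustration; only -np and -machinefile come from this thread, and the real list of switches has to be taken from the documentation of your mpi implementation.

```python
# Sketch only: illustrates skipping launcher switches before a program
# parses its own options. This is NOT code from sander/mdfil.f.

# Switches that are followed by a value. "-np" and "-machinefile" are the
# ones mentioned in this thread; extend from your MPI documentation.
MPI_SWITCHES_WITH_VALUE = {"-np", "-machinefile"}

# Switches that stand alone. The name below is a hypothetical placeholder.
MPI_SWITCHES_NO_VALUE = {"-hypothetical-mpi-flag"}


def strip_mpi_args(args):
    """Return a copy of args without launcher switches and their values."""
    kept = []
    i = 0
    while i < len(args):
        arg = args[i]
        if arg in MPI_SWITCHES_WITH_VALUE:
            i += 2  # skip the switch and the value that follows it
        elif arg in MPI_SWITCHES_NO_VALUE:
            i += 1  # skip the bare switch only
        else:
            kept.append(arg)  # a genuine program option, keep it
            i += 1
    return kept


if __name__ == "__main__":
    example = ["-O", "-i", "mdin", "-np", "2",
               "-hypothetical-mpi-flag", "-o", "mdout"]
    print(strip_mpi_args(example))
```

the same two cases (switch with a value, bare switch) are what the fortran argument loop would have to distinguish; getting that distinction wrong for even one switch leaves a stray value behind for sander to choke on.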
> i observed another strange thing today..as we know, a
> parallel job creates a temporary file with the process
> id of sander as its name and lists all the nodes on
> which the job is being run...till amber7, sander used
> to accept nodes in powers of two..but now, my
> DO_PARALLEL variable started accepting any number of
> nodes irrespective of it being a power of two or
> not....i'm wondering if it is a modification in the
> source code that has been brought about in amber8 or
> it is a compilation error on my cluster !!
from my experience (and i recall also reading it somewhere here), it's
no longer necessary (pmemd, amber8) to use a number of processors that
is a power of 2... so this is ok.
regards,
--
Lubos
-----------------------------------------------------------------------
The AMBER Mail Reflector
To post, send mail to amber.scripps.edu
To unsubscribe, send "unsubscribe amber" to majordomo.scripps.edu
Received on Sat Jun 26 2004 - 15:53:01 PDT