Re: [AMBER] Problem in parallel version of pmemd.cuda

From: Sanjib Paul <sanjib88paul.gmail.com>
Date: Tue, 20 May 2014 18:56:59 +0530

Hi,
      I compiled Amber under the username 'software' and am trying to run
pmemd.cuda.MPI under 'myaccount'. When I run 'which mpirun' as 'software',
it returns '/usr/mpi/gcc/openmpi-1.6.4/bin/mpirun', but when I run the same
command as 'myaccount', it returns '/usr/local/bin/mpirun'. So it is clear
that the two usernames pick up two different mpirun binaries. I then tried
to run pmemd.cuda.MPI from 'myaccount' with the following command:

/usr/mpi/gcc/openmpi-1.6.4/bin/mpirun -np 2 $AMBERHOME/bin/pmemd.cuda.MPI
-O -i mdin -o mdout -p prmtop -c inpcrd -r rst -x mdcrd
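To make sure the run really picks up the OpenMPI that Amber was built with,
I could do something like the sketch below (the lib path under
/usr/mpi/gcc/openmpi-1.6.4 is my guess for this install; adjust if yours
differs):

# Check which libmpi pmemd.cuda.MPI was actually linked against
ldd $AMBERHOME/bin/pmemd.cuda.MPI | grep -i libmpi

# Put the matching OpenMPI first on PATH and LD_LIBRARY_PATH for this shell
export PATH=/usr/mpi/gcc/openmpi-1.6.4/bin:$PATH
export LD_LIBRARY_PATH=/usr/mpi/gcc/openmpi-1.6.4/lib:$LD_LIBRARY_PATH

# 'which mpirun' should now return the openmpi-1.6.4 mpirun
which mpirun
mpirun -np 2 $AMBERHOME/bin/pmemd.cuda.MPI -O -i mdin -o mdout -p prmtop -c inpcrd -r rst -x mdcrd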

With this command pmemd.cuda.MPI runs in some cases, but stops with error
messages in others. Starting from the same prmtop and rst files, I tried to
run pmemd.cuda.MPI under the NVT and NPT ensembles using two different
input files.

For the NVT ensemble:

production md
 &cntrl
  imin = 0,
  irest =1,
  iwrap =1,
  ntx = 5,
  ntb = 1,
  cut = 15.0,
  ntr = 0,
  ntc = 2,
  ntf = 2,
  tempi = 300.0,
  temp0 = 300.0,
  ntt = 3,
  gamma_ln = 5.0,
  nstlim = 500000,
  dt = 0.002,
  ntpr = 200,
  ntwr = 10000,
  ntwx = 200
 /
For the NPT ensemble:
equilibration
 &cntrl
  imin = 0,
  irest = 1,
  ntx = 5,
  iwrap = 1,
  ntb = 2,
  pres0 = 1.0,
  ntp = 1,
  taup = 2.0,
  cut = 15,
  ntr = 0,
  ntc = 2,
  ntf = 2,
  tempi = 300.0,
  temp0 = 300.0,
  ntt = 3,
  gamma_ln = 5.0,
  nstlim = 500000,
  dt = 0.002,
  ntpr = 200,
  ntwx = 200,
  ntwr = 10000
 /

Under NVT, pmemd.cuda.MPI runs, although I get a warning:

Warning: Conflicting CPU frequencies detected, using: 2600.000000.
Warning: Conflicting CPU frequencies detected, using: 2600.000000.

Under NPT, it stops after printing the following messages:

Warning: Conflicting CPU frequencies detected, using: 2600.000000.
Warning: Conflicting CPU frequencies detected, using: 2600.000000.

MPI_ABORT was invoked on rank 1 in communicator MPI_COMM_WORLD
with errorcode 1.

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------
| ERROR: PMEMD does not support intermolecular PRFs!
--------------------------------------------------------------------------
mpirun has exited due to process rank 1 with PID 51030 on
node gpunode2 exiting improperly. There are two reasons this could occur:

1. this process did not call "init" before exiting, but others in
the job did. This can cause a job to hang indefinitely while it waits
for all processes to call "init". By rule, if one process calls "init",
then ALL processes must call "init" prior to termination.

2. this process called "init", but exited without calling "finalize".
By rule, all processes that call "init" MUST call "finalize" prior to
exiting or it will be considered an "abnormal termination"

This may have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).
--------------------------------------------------------------------------

[gpunode2:51028] 1 more process has sent help message help-mpi-api.txt / mpi-abort
[gpunode2:51028] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages

And in the output file I am getting the error:

| ERROR: PMEMD does not support intermolecular PRFs!

In another case, I have two systems that I want to heat. I can heat one of
them, but not the other, even though I used the same input file for heating
in both cases. For the system that fails, the run stops with the following
messages:

Warning: Conflicting CPU frequencies detected, using: 2600.000000.
Warning: Conflicting CPU frequencies detected, using: 2600.000000.

gpu_download_partial_forces: download failed unspecified launch failure
--------------------------------------------------------------------------
mpirun has exited due to process rank 0 with PID 51268 on
node gpunode2 exiting improperly. There are two reasons this could occur:

1. this process did not call "init" before exiting, but others in
the job did. This can cause a job to hang indefinitely while it waits
for all processes to call "init". By rule, if one process calls "init",
then ALL processes must call "init" prior to termination.

2. this process called "init", but exited without calling "finalize".
By rule, all processes that call "init" MUST call "finalize" prior to
exiting or it will be considered an "abnormal termination"

This may have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).

Here I did not get the error message:

| ERROR: PMEMD does not support intermolecular PRFs!

I don't know what I am doing wrong. How can I run pmemd.cuda.MPI properly
in all ensembles without any errors or warnings?

Thanking you,

Sanjib

On Tue, May 20, 2014 at 2:28 AM, David A Case <case.biomaps.rutgers.edu> wrote:

> On Mon, May 19, 2014, Sanjib Paul wrote:
> >
> > $AMBERHOME/bin/pmemd.cuda.MPI -O -i mdin -o mdout -p prmtop -c inpcrd -r
> > rst -x mdcrd
>
> You can't just directly run an MPI-enabled code. It's always something
> like:
>
> /path/to/mpirun -np n $AMBERHOME/bin/pmemd.cuda.MPI .....
>
> Since your test cases work, you should look at how they run and do the
> same thing; if you do what the test cases are doing, you should be in
> good shape.
>
> ....dac
>
>
> _______________________________________________
> AMBER mailing list
> AMBER.ambermd.org
> http://lists.ambermd.org/mailman/listinfo/amber
>
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Tue May 20 2014 - 06:30:05 PDT