Re: [AMBER] MPI_ABORT in pH-REMD

From: Jason Swails <jason.swails.gmail.com>
Date: Fri, 18 Mar 2016 11:46:33 -0700

On Fri, Mar 18, 2016 at 6:05 AM, Elisa Pieri <elisa.pieri90.gmail.com>
wrote:

> Dear all,
>
> I was able to run pH-REMD in explicit solvent without problems, but it
> fails in implicit solvent. When I execute the command:
>
> mpirun -n 16 pmemd.MPI -ng 8 -groupfile dio.grpfile
>
> I get this error:
>
> Running multipmemd version of pmemd Amber12
> Total processors = 16
> Number of groups = 8
>
> --------------------------------------------------------------------------
> MPI_ABORT was invoked on rank 3 in communicator MPI_COMM_WORLD
> with errorcode 1.
>
> NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
> You may or may not see output from other processes, depending on
> exactly when Open MPI kills them.
> --------------------------------------------------------------------------
> --------------------------------------------------------------------------
> mpirun has exited due to process rank 3 with PID 6576 on
> node agachon exiting improperly. There are two reasons this could occur:
>
> 1. this process did not call "init" before exiting, but others in
> the job did. This can cause a job to hang indefinitely while it waits
> for all processes to call "init". By rule, if one process calls "init",
> then ALL processes must call "init" prior to termination.
>
> 2. this process called "init", but exited without calling "finalize".
> By rule, all processes that call "init" MUST call "finalize" prior to
> exiting or it will be considered an "abnormal termination"
>
> This may have caused other processes in the application to be
> terminated by signals sent by mpirun (as reported here).
>
> The groupfile has 8 items, each following this pattern:
>
> # pH 08
> -O -i ph08.mdin -p 3lzt.parm7 -c 3lzt.equil.rst7 -cpin 3lzt.equil.cpin -o
> 3lzt.ph08.mdout -cpout 3lzt.ph08.cpout -cprestrt 3lzt.ph08.cpin -r
> 3lzt.ph08.rst7 -inf 3lzt.ph08.mdinfo -rem 4 -remlog rem.ph.log -x
> 3lzt.ph08.nc
>
> (of course, the pH changes from item to item). This is one of the input
> files:
>
> REM for CpH
> &cntrl
> icnstph=1, dt=0.002, ioutfm=1, ntxo=2,
> nstlim=100, ig=-1, ntb=0, numexchg=10000,
> ntwr=10000, ntwx=1000, irest=1,
> cut=30, ntcnstph=5, ntpr=1000,
> ntx=5, solvph=8, saltcon=0.1, ntt=3,
> ntc=2, ntf=2, gamma_ln=5.0, igb=2,
> tempi=300, temp0=300, nrespa=1,
> tol=0.000001,
> /
>
> I don't understand where this error comes from. Can you help me?
>

The error message you sent is another one of those generic "something went
wrong" messages: it says that a process aborted, but gives no details about
what went wrong. Look for error messages in the mdout files; those will be
more informative.
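
If checking eight mdout files by hand is tedious, something like the
following rough sketch could help. It assumes the mdout files are named as in
the groupfile entry above (3lzt.ph08.mdout and so on) and that the diagnostic
lines contain words such as "error" or "fatal"; both the file pattern and the
keyword list are guesses for illustration, not anything pmemd guarantees.

```python
import glob
import re

# Assumption: mdout files follow the naming used in the quoted groupfile
# entry (3lzt.ph08.mdout, ...). Adjust the pattern to match your files.
PATTERN = "3lzt.ph*.mdout"

# Assumption: diagnostic lines contain one of these words. This is a
# heuristic, not an exhaustive list of what pmemd may print.
KEYWORDS = re.compile(r"error|fatal|not found", re.IGNORECASE)


def find_error_lines(path):
    """Return (line_number, text) pairs for lines that look like diagnostics."""
    hits = []
    with open(path, errors="replace") as handle:
        for number, line in enumerate(handle, start=1):
            if KEYWORDS.search(line):
                hits.append((number, line.rstrip()))
    return hits


def scan(pattern=PATTERN):
    """Map each matching mdout file to its suspicious lines."""
    return {path: find_error_lines(path) for path in sorted(glob.glob(pattern))}


if __name__ == "__main__":
    for path, hits in scan().items():
        for number, text in hits:
            print(f"{path}:{number}: {text}")
```

Run it from the directory holding the mdout files. A file with no hits may
still have failed, so it is worth also looking at the last few lines of any
mdout file that ends earlier than the others.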

HTH,
Jason

-- 
Jason M. Swails
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Fri Mar 18 2016 - 12:00:03 PDT