Re: [AMBER] MPI_ABORT in pH-REMD

From: Jason Swails <jason.swails.gmail.com>
Date: Mon, 21 Mar 2016 10:59:21 -0400

Try sander.MPI. Does it give the same error?
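
When rerunning (for example with the same arguments as before, i.e. mpirun -n 16 sander.MPI -ng 8 -groupfile dio.grpfile), it can help to see at a glance which replicas wrote anything at all. Below is a minimal Python sketch of such a check, not an Amber tool. The function names mdout_paths and classify_outputs are hypothetical. It assumes each replica occupies one physical line in the groupfile, that lines starting with "#" are comments, and that the output file is given with -o.

```python
import shlex
from pathlib import Path


def mdout_paths(groupfile_text):
    """Return the -o argument of every replica line in a groupfile.

    Assumption: one replica per line, '#' starts a comment line.
    """
    paths = []
    for line in groupfile_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments such as "# pH 08"
        args = shlex.split(line)
        if "-o" in args:
            pos = args.index("-o")
            if pos + 1 < len(args):
                paths.append(args[pos + 1])
    return paths


def classify_outputs(paths, base_dir="."):
    """Map each output file name to 'missing', 'empty', or 'has output'."""
    status = {}
    for name in paths:
        path = Path(base_dir) / name
        if not path.exists():
            status[name] = "missing"
        elif path.stat().st_size == 0:
            status[name] = "empty"
        else:
            status[name] = "has output"
    return status
```

Calling classify_outputs(mdout_paths(Path("dio.grpfile").read_text())) from the run directory returns a dictionary with one entry per replica. Any file reported as "has output" is the first place to look for an error message.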

--
Jason M. Swails
BioMaPS,
Rutgers University
Postdoctoral Researcher
> On Mar 21, 2016, at 5:46 AM, Elisa Pieri <elisa.pieri90.gmail.com> wrote:
> 
> No, because replicas must be run in multisander/multipmemd mode!
> 
>> On Mon, Mar 21, 2016 at 10:36 AM, Bin ZOU <zoubin1025.gmail.com> wrote:
>> 
>> Hi Elisa,
>> 
>> You could try running plain pmemd instead of the MPI version; that may
>> show you what is wrong.
>> 
>> On Mon, Mar 21, 2016 at 5:08 PM, Elisa Pieri <elisa.pieri90.gmail.com>
>> wrote:
>> 
>>> Ah yes, I forgot to mention it: the mdout files are empty, just like the
>>> cpouts and the logfiles.
>>> 
>>> Any idea?
>>> 
>>> Thanks, Elisa
>>> 
>>> On Fri, Mar 18, 2016 at 7:46 PM, Jason Swails <jason.swails.gmail.com>
>>> wrote:
>>> 
>>>> On Fri, Mar 18, 2016 at 6:05 AM, Elisa Pieri <elisa.pieri90.gmail.com>
>>>> wrote:
>>>> 
>>>>> Dear all,
>>>>> 
>>>>> I was able to run pH-REMD in explicit solvent without any trouble, but I
>>>>> have problems in implicit solvent. When I execute the command:
>>>>> 
>>>>> mpirun -n 16 pmemd.MPI -ng 8 -groupfile dio.grpfile
>>>>> 
>>>>> I get this error:
>>>>> 
>>>>> Running multipmemd version of pmemd Amber12
>>>>>    Total processors =    16
>>>>>    Number of groups =     8
>>>>> --------------------------------------------------------------------------
>>>>> MPI_ABORT was invoked on rank 3 in communicator MPI_COMM_WORLD
>>>>> with errorcode 1.
>>>>> 
>>>>> NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
>>>>> You may or may not see output from other processes, depending on
>>>>> exactly when Open MPI kills them.
>>>>> --------------------------------------------------------------------------
>>>>> --------------------------------------------------------------------------
>>>>> mpirun has exited due to process rank 3 with PID 6576 on
>>>>> node agachon exiting improperly. There are two reasons this could occur:
>>>>> 
>>>>> 1. this process did not call "init" before exiting, but others in
>>>>> the job did. This can cause a job to hang indefinitely while it waits
>>>>> for all processes to call "init". By rule, if one process calls "init",
>>>>> then ALL processes must call "init" prior to termination.
>>>>> 
>>>>> 2. this process called "init", but exited without calling "finalize".
>>>>> By rule, all processes that call "init" MUST call "finalize" prior to
>>>>> exiting or it will be considered an "abnormal termination"
>>>>> 
>>>>> This may have caused other processes in the application to be
>>>>> terminated by signals sent by mpirun (as reported here).
>>>>> 
>>>>> The groupfile has 8 entries, each repeating the following unit:
>>>>> 
>>>>> # pH 08
>>>>> -O -i ph08.mdin -p 3lzt.parm7 -c 3lzt.equil.rst7 -cpin 3lzt.equil.cpin -o
>>>>> 3lzt.ph08.mdout -cpout 3lzt.ph08.cpout -cprestrt 3lzt.ph08.cpin -r
>>>>> 3lzt.ph08.rst7 -inf 3lzt.ph08.mdinfo -rem 4 -remlog rem.ph.log -x
>>>>> 3lzt.ph08.nc
>>>>> 
>>>>> (of course, the pH changes from entry to entry). This is one of the
>>>>> input files:
>>>>> 
>>>>> REM for CpH
>>>>> &cntrl
>>>>> icnstph=1, dt=0.002, ioutfm=1, ntxo=2,
>>>>> nstlim=100, ig=-1, ntb=0, numexchg=10000,
>>>>> ntwr=10000, ntwx=1000, irest=1,
>>>>> cut=30, ntcnstph=5, ntpr=1000,
>>>>> ntx=5, solvph=8, saltcon=0.1, ntt=3,
>>>>> ntc=2, ntf=2, gamma_ln=5.0, igb=2,
>>>>> tempi=300, temp0=300, nrespa=1,
>>>>> tol=0.000001,
>>>>> /
>>>>> 
>>>>> I don't understand where this error comes from. Can you help me?
>>>> 
>>>> The error message you sent is another one of those "something went wrong"
>>>> error messages -- it gives no details about what went wrong.
>>>> Look for error messages in some of the mdout files -- those will be more
>>>> informative.
>>>> 
>>>> HTH,
>>>> Jason
>>>> 
>>>> --
>>>> Jason M. Swails
>>>> _______________________________________________
>>>> AMBER mailing list
>>>> AMBER.ambermd.org
>>>> http://lists.ambermd.org/mailman/listinfo/amber
>> 
>> 
>> 
>> --
>> Regards,
>> ZOU, Bin
Received on Mon Mar 21 2016 - 08:00:04 PDT