Re: [AMBER] MPI_ABORT in pH-REMD

From: Daniel Roe <daniel.r.roe.gmail.com>
Date: Mon, 21 Mar 2016 09:23:52 -0600

On Mon, Mar 21, 2016 at 9:13 AM, Elisa Pieri <elisa.pieri90.gmail.com> wrote:
> In this case the error is:
>
> Running multisander version of sander Amber14
> Total processors = 16
> Number of groups = 8
>
>
> Coordinate resetting (SHAKE) cannot be accomplished,
> deviation is too large
> NITER, NIT, LL, I and J are : 0 0 749 1497 1498
>
> So there is a SHAKE problem?

Yes. SHAKE failures are usually caused by inadequate
minimization/equilibration of your system (there are many past
posts to the mailing list on this topic). You may need to run
additional minimization/equilibration on your starting structure(s).
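
To see where the problem is, it can help to pull the atom numbers out of the
SHAKE message. Below is a minimal Python sketch, assuming the last two
numbers on the "NITER, NIT, LL, I and J are" line are the indices of the two
atoms in the constraint that failed (my reading of the message, worth
checking against your topology). The function name parse_shake_failure is
just something made up for this example; it is not part of Amber.

```python
import re

# Matches the diagnostic line printed with the SHAKE failure, e.g.
#   NITER, NIT, LL, I and J are : 0 0 749 1497 1498
SHAKE_RE = re.compile(
    r"NITER,\s*NIT,\s*LL,\s*I\s+and\s+J\s+are\s*:\s*"
    r"(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)"
)


def parse_shake_failure(text):
    """Return the five integers from a SHAKE failure message, or None.

    Assumption: the last two values (I and J) are the atom indices of the
    constrained pair that could not be reset.
    """
    match = SHAKE_RE.search(text)
    if match is None:
        return None
    niter, nit, ll, atom_i, atom_j = (int(g) for g in match.groups())
    return {"niter": niter, "nit": nit, "ll": ll,
            "atom_i": atom_i, "atom_j": atom_j}


# The message quoted earlier in this thread.
message = """Coordinate resetting (SHAKE) cannot be accomplished,
deviation is too large
NITER, NIT, LL, I and J are : 0 0 749 1497 1498"""

result = parse_shake_failure(message)
print("Atoms to inspect:", result["atom_i"], result["atom_j"])
```

If that reading is right, atoms 1497 and 1498 in your topology are the place
to start looking (bad contacts or a strained geometry in the starting
structure would be the usual suspects).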

-Dan

>
> Elisa
>
> On Mon, Mar 21, 2016 at 3:59 PM, Jason Swails <jason.swails.gmail.com>
> wrote:
>
>> Try sander.MPI. Does it give the same error?
>>
>> --
>> Jason M. Swails
>> BioMaPS,
>> Rutgers University
>> Postdoctoral Researcher
>>
>> > On Mar 21, 2016, at 5:46 AM, Elisa Pieri <elisa.pieri90.gmail.com>
>> > wrote:
>> >
>> > No, because replicas must be run in multisander/multipmemd mode!
>> >
>> >> On Mon, Mar 21, 2016 at 10:36 AM, Bin ZOU <zoubin1025.gmail.com> wrote:
>> >>
>> >> Hi Elisa,
>> >>
>> >> You could try running plain pmemd instead of the MPI version; maybe then
>> >> you can see what is wrong.
>> >>
>> >> On Mon, Mar 21, 2016 at 5:08 PM, Elisa Pieri <elisa.pieri90.gmail.com>
>> >> wrote:
>> >>
>> >>> Ah yes, I forgot to mention it.. the mdout files are empty. Just like the
>> >>> cpouts and the logfiles.
>> >>>
>> >>> Any idea?
>> >>>
>> >>> Thanks, Elisa
>> >>>
>> >>> On Fri, Mar 18, 2016 at 7:46 PM, Jason Swails <jason.swails.gmail.com>
>> >>> wrote:
>> >>>
>> >>>> On Fri, Mar 18, 2016 at 6:05 AM, Elisa Pieri <elisa.pieri90.gmail.com>
>> >>>> wrote:
>> >>>>
>> >>>>> Dear all,
>> >>>>>
>> >>>>> I was perfectly able to run pH-REMD in explicit solvent, while I have
>> >>>>> problems in Implicit solvent. When I execute the command:
>> >>>>>
>> >>>>> mpirun -n 16 pmemd.MPI -ng 8 -groupfile dio.grpfile
>> >>>>>
>> >>>>> I get this error:
>> >>>>>
>> >>>>> Running multipmemd version of pmemd Amber12
>> >>>>> Total processors = 16
>> >>>>> Number of groups = 8
>> >>>>> --------------------------------------------------------------------------
>> >>>>> MPI_ABORT was invoked on rank 3 in communicator MPI_COMM_WORLD
>> >>>>> with errorcode 1.
>> >>>>>
>> >>>>> NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
>> >>>>> You may or may not see output from other processes, depending on
>> >>>>> exactly when Open MPI kills them.
>> >>>>> --------------------------------------------------------------------------
>> >>>>> --------------------------------------------------------------------------
>> >>>>> mpirun has exited due to process rank 3 with PID 6576 on
>> >>>>> node agachon exiting improperly. There are two reasons this could occur:
>> >>>>>
>> >>>>> 1. this process did not call "init" before exiting, but others in
>> >>>>> the job did. This can cause a job to hang indefinitely while it waits
>> >>>>> for all processes to call "init". By rule, if one process calls "init",
>> >>>>> then ALL processes must call "init" prior to termination.
>> >>>>>
>> >>>>> 2. this process called "init", but exited without calling "finalize".
>> >>>>> By rule, all processes that call "init" MUST call "finalize" prior to
>> >>>>> exiting or it will be considered an "abnormal termination"
>> >>>>>
>> >>>>> This may have caused other processes in the application to be
>> >>>>> terminated by signals sent by mpirun (as reported here).
>> >>>>>
>> >>>>> The groupfile has 8 entries, each following this pattern:
>> >>>>>
>> >>>>> # pH 08
>> >>>>> -O -i ph08.mdin -p 3lzt.parm7 -c 3lzt.equil.rst7 -cpin 3lzt.equil.cpin -o
>> >>>>> 3lzt.ph08.mdout -cpout 3lzt.ph08.cpout -cprestrt 3lzt.ph08.cpin -r
>> >>>>> 3lzt.ph08.rst7 -inf 3lzt.ph08.mdinfo -rem 4 -remlog rem.ph.log -x
>> >>>>> 3lzt.ph08.nc
>> >>>>>
>> >>>>> (of course, the pH changes from item to item). This is (one of) the
>> >>>>> input files:
>> >>>>>
>> >>>>> REM for CpH
>> >>>>> &cntrl
>> >>>>> icnstph=1, dt=0.002, ioutfm=1, ntxo=2,
>> >>>>> nstlim=100, ig=-1, ntb=0, numexchg=10000,
>> >>>>> ntwr=10000, ntwx=1000, irest=1,
>> >>>>> cut=30, ntcnstph=5, ntpr=1000,
>> >>>>> ntx=5, solvph=8, saltcon=0.1, ntt=3,
>> >>>>> ntc=2, ntf=2, gamma_ln=5.0, igb=2,
>> >>>>> tempi=300, temp0=300, nrespa=1,
>> >>>>> tol=0.000001,
>> >>>>> /
>> >>>>>
>> >>>>> I don't understand where this error comes from. Can you help me?
>> >>>>
>> >>>> The error message you sent is another one of those "something went wrong"
>> >>>> error messages -- but it gives no details about the "what" went wrong.
>> >>>> Look for error messages in some of the mdout files -- those will be more
>> >>>> informative.
>> >>>>
>> >>>> HTH,
>> >>>> Jason
>> >>>>
>> >>>> --
>> >>>> Jason M. Swails
>> >>>> _______________________________________________
>> >>>> AMBER mailing list
>> >>>> AMBER.ambermd.org
>> >>>> http://lists.ambermd.org/mailman/listinfo/amber
>> >>
>> >>
>> >>
>> >> --
>> >> Regards,
>> >> ZOU, Bin
>>



-- 
-------------------------
Daniel R. Roe, PhD
Department of Medicinal Chemistry
University of Utah
30 South 2000 East, Room 307
Salt Lake City, UT 84112-5820
http://home.chpc.utah.edu/~cheatham/
(801) 587-9652
(801) 585-6208 (Fax)
Received on Mon Mar 21 2016 - 08:30:06 PDT