Re: [AMBER] MPI_ABORT in pH-REMD

From: Bin ZOU <zoubin1025.gmail.com>
Date: Mon, 21 Mar 2016 17:36:25 +0800

Hi Elisa,

Try running one of the inputs with serial pmemd instead of the MPI version;
that may show you what is wrong.
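
For example, a serial test of the pH 8 replica could look something like the
sketch below. The flags are copied from the groupfile entry quoted further
down, except that I dropped -rem and -remlog on the assumption that they only
make sense for a replica-exchange run. The "serial_test" output prefix is a
made-up placeholder, and the mdin may also need numexchg removed for a
non-REMD run (I have not checked that).

```shell
# Sketch only: flags taken from the quoted groupfile line, minus -rem/-remlog.
# "serial_test" is a hypothetical output prefix, not a name from the thread.
prefix=serial_test

# Build the argument list once so it can be printed and then reused.
set -- -O -i ph08.mdin -p 3lzt.parm7 -c 3lzt.equil.rst7 -cpin 3lzt.equil.cpin \
    -o "$prefix.mdout" -cpout "$prefix.cpout" -cprestrt "$prefix.cpin" \
    -r "$prefix.rst7" -inf "$prefix.mdinfo" -x "$prefix.nc"

cmd_preview="pmemd $*"
echo "$cmd_preview"

# Only run if pmemd is on PATH and the input file is present.
if command -v pmemd >/dev/null 2>&1 && [ -f ph08.mdin ]; then
    pmemd "$@" || echo "pmemd exited with status $?"
    # A non-empty mdout usually ends with the error text, if there is one.
    if [ -s "$prefix.mdout" ]; then
        tail -n 20 "$prefix.mdout"
    else
        echo "$prefix.mdout is empty or missing"
    fi
fi
```

If the serial run writes anything to its mdout, the last lines are the first
place to look.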

On Mon, Mar 21, 2016 at 5:08 PM, Elisa Pieri <elisa.pieri90.gmail.com>
wrote:

> Ah yes, I forgot to mention it: the mdout files are empty, just like the
> cpout files and the log files.
>
> Any idea?
>
> Thanks, Elisa
>
> On Fri, Mar 18, 2016 at 7:46 PM, Jason Swails <jason.swails.gmail.com>
> wrote:
>
> > On Fri, Mar 18, 2016 at 6:05 AM, Elisa Pieri <elisa.pieri90.gmail.com>
> > wrote:
> >
> > > Dear all,
> > >
> > > I was able to run pH-REMD in explicit solvent without trouble, but I
> > > have problems in implicit solvent. When I execute the command:
> > >
> > > mpirun -n 16 pmemd.MPI -ng 8 -groupfile dio.grpfile
> > >
> > > I get this error:
> > >
> > > Running multipmemd version of pmemd Amber12
> > > Total processors = 16
> > > Number of groups = 8
> > >
> > >
> > > --------------------------------------------------------------------------
> > > MPI_ABORT was invoked on rank 3 in communicator MPI_COMM_WORLD
> > > with errorcode 1.
> > >
> > > NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
> > > You may or may not see output from other processes, depending on
> > > exactly when Open MPI kills them.
> > >
> > > --------------------------------------------------------------------------
> > >
> > > --------------------------------------------------------------------------
> > > mpirun has exited due to process rank 3 with PID 6576 on
> > > node agachon exiting improperly. There are two reasons this could occur:
> > >
> > > 1. this process did not call "init" before exiting, but others in
> > > the job did. This can cause a job to hang indefinitely while it waits
> > > for all processes to call "init". By rule, if one process calls "init",
> > > then ALL processes must call "init" prior to termination.
> > >
> > > 2. this process called "init", but exited without calling "finalize".
> > > By rule, all processes that call "init" MUST call "finalize" prior to
> > > exiting or it will be considered an "abnormal termination"
> > >
> > > This may have caused other processes in the application to be
> > > terminated by signals sent by mpirun (as reported here).
> > >
> > > The groupfile has 8 entries that all follow the same pattern:
> > >
> > > # pH 08
> > > -O -i ph08.mdin -p 3lzt.parm7 -c 3lzt.equil.rst7 -cpin 3lzt.equil.cpin
> > > -o 3lzt.ph08.mdout -cpout 3lzt.ph08.cpout -cprestrt 3lzt.ph08.cpin
> > > -r 3lzt.ph08.rst7 -inf 3lzt.ph08.mdinfo -rem 4 -remlog rem.ph.log
> > > -x 3lzt.ph08.nc
> > >
> > > (Of course, the pH changes from entry to entry.) This is one of the
> > > input files:
> > >
> > > REM for CpH
> > > &cntrl
> > > icnstph=1, dt=0.002, ioutfm=1, ntxo=2,
> > > nstlim=100, ig=-1, ntb=0, numexchg=10000,
> > > ntwr=10000, ntwx=1000, irest=1,
> > > cut=30, ntcnstph=5, ntpr=1000,
> > > ntx=5, solvph=8, saltcon=0.1, ntt=3,
> > > ntc=2, ntf=2, gamma_ln=5.0, igb=2,
> > > tempi=300, temp0=300, nrespa=1,
> > > tol=0.000001,
> > > /
> > >
> > > I don't understand where this error comes from. Can you help me?
> > >
> >
> > The error message you sent is another one of those "something went wrong"
> > error messages: it gives no details about what went wrong. Look for error
> > messages in some of the mdout files; those will be more informative.
> >
> > HTH,
> > Jason
> >
> > --
> > Jason M. Swails
> > _______________________________________________
> > AMBER mailing list
> > AMBER.ambermd.org
> > http://lists.ambermd.org/mailman/listinfo/amber
> >



-- 
Regards,
ZOU, Bin
Received on Mon Mar 21 2016 - 03:00:03 PDT