Re: [AMBER] MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD with errorcode 1

From: Carlos Simmerling via AMBER <amber.ambermd.org>
Date: Mon, 24 Jun 2024 11:21:03 -0400

This seems different from what you sent before.
As I mentioned, please look carefully at the manual. For example, your
sander.MPI command is missing the -ng flag.
From the manual:

  Below is an example of an 8-replica REMD run on 16 processors (note that
  launching an MPI program varies from computer to computer):

    mpirun -np 16 sander.MPI -ng 8 -groupfile groupfile -rem 1
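
Applied to the command quoted below, the manual's pattern could look like the following sketch. The values NP=80, NG=8 and the name remd.groupfile are taken from this thread; the variable names themselves are only illustrative. The total process count has to divide evenly among the replicas, so the script checks that first and then prints the launch line rather than running it. The trailing -rem 1 mirrors the manual's example and may be redundant here, since each groupfile line already carries it.

```shell
#!/bin/sh
# Sketch: build the sander.MPI launch line for a multi-replica (REMD) run.
# NP, NG and GROUPFILE are illustrative names; the values come from this thread.
NP=80                      # total MPI processes requested
NG=8                       # number of replicas (one groupfile line each)
GROUPFILE=remd.groupfile

# The process count must split evenly across the replicas.
if [ $((NP % NG)) -ne 0 ]; then
    echo "error: NP ($NP) is not a multiple of NG ($NG)" >&2
    exit 1
fi

PROCS_PER_REPLICA=$((NP / NG))
CMD="mpirun -np $NP sander.MPI -ng $NG -groupfile $GROUPFILE -rem 1"

echo "$PROCS_PER_REPLICA processes per replica"
echo "$CMD"                # inspect the line, then paste it into the job script
```

Whether 10 processes per replica is a good choice for this system is a separate question; how mpirun is launched also varies from machine to machine, as the manual notes.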


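A groupfile is an input file read by sander.MPI, not a script to execute: each line holds the command-line arguments for one replica. A minimal sketch for generating one, assuming the file-name pattern used in the groupfile quoted below (NREP and OUT are illustrative names):

```shell
#!/bin/sh
# Sketch: write one groupfile line per replica.
# File names follow the pattern in the quoted groupfile; NREP and OUT are
# illustrative variable names.
NREP=8
OUT=remd.groupfile

: > "$OUT"                       # start from an empty file
i=1
while [ "$i" -le "$NREP" ]; do
    n=$(printf '%03d' "$i")      # zero-padded replica index: 001, 002, ...
    printf '%s\n' "-O -rem 1 -remlog rem.log -i remd.mdin.$n -o remd.mdout.$n -c equilibrate.rst.$n -r remd.rst.$n -x remd.mdcrd.$n -inf remd.mdinfo.$n -p tleap.prmtop" >> "$OUT"
    i=$((i + 1))
done

wc -l < "$OUT"                   # line count should equal the value passed to -ng
```

The number of lines written is the number to pass to -ng on the sander.MPI command line.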

On Mon, Jun 24, 2024 at 11:16 AM MIRA JHAWAR <j.mira.iitg.ac.in> wrote:

> This is the groupfile :
> -O -rem 1 -remlog rem.log -i remd.mdin.001 -o remd.mdout.001 -c equilibrate.rst.001 -r remd.rst.001 -x remd.mdcrd.001 -inf remd.mdinfo.001 -p tleap.prmtop
> -O -rem 1 -remlog rem.log -i remd.mdin.002 -o remd.mdout.002 -c equilibrate.rst.002 -r remd.rst.002 -x remd.mdcrd.002 -inf remd.mdinfo.002 -p tleap.prmtop
> -O -rem 1 -remlog rem.log -i remd.mdin.003 -o remd.mdout.003 -c equilibrate.rst.003 -r remd.rst.003 -x remd.mdcrd.003 -inf remd.mdinfo.003 -p tleap.prmtop
> -O -rem 1 -remlog rem.log -i remd.mdin.004 -o remd.mdout.004 -c equilibrate.rst.004 -r remd.rst.004 -x remd.mdcrd.004 -inf remd.mdinfo.004 -p tleap.prmtop
> -O -rem 1 -remlog rem.log -i remd.mdin.005 -o remd.mdout.005 -c equilibrate.rst.005 -r remd.rst.005 -x remd.mdcrd.005 -inf remd.mdinfo.005 -p tleap.prmtop
> -O -rem 1 -remlog rem.log -i remd.mdin.006 -o remd.mdout.006 -c equilibrate.rst.006 -r remd.rst.006 -x remd.mdcrd.006 -inf remd.mdinfo.006 -p tleap.prmtop
> -O -rem 1 -remlog rem.log -i remd.mdin.007 -o remd.mdout.007 -c equilibrate.rst.007 -r remd.rst.007 -x remd.mdcrd.007 -inf remd.mdinfo.007 -p tleap.prmtop
> -O -rem 1 -remlog rem.log -i remd.mdin.008 -o remd.mdout.008 -c equilibrate.rst.008 -r remd.rst.008 -x remd.mdcrd.008 -inf remd.mdinfo.008 -p tleap.prmtop
>
> I used the following command:
> mpirun -np 80 sander.MPI -groupfile remd.groupfile
>
> Regards,
> Mira Jhawar
> Research Scholar
> Roll no-206122118
> Department of Chemistry
>
> ------------------------------
> *From:* Carlos Simmerling <carlos.simmerling.stonybrook.edu>
> *Sent:* Monday, June 24, 2024 8:39 PM
> *To:* MIRA JHAWAR <j.mira.iitg.ac.in>
> *Cc:* Carlos Simmerling <carlos.simmerling.stonybrook.edu>; AMBER Mailing
> List <amber.ambermd.org>
> *Subject:* Re: [AMBER] MPI_ABORT was invoked on rank 0 in communicator
> MPI_COMM_WORLD with errorcode 1
>
> Please follow the directions from that section of the manual. I don't know
> what you mean by REMD.sh.
> If there are still problems, include the command or script that points to
> the groupfile, as well as the groupfile itself.
>
> On Mon, Jun 24, 2024 at 11:07 AM MIRA JHAWAR <j.mira.iitg.ac.in> wrote:
>
> I used Amber20. Yes, a groupfile was generated, but I ran that
> groupfile as a script, like REMD.sh.
>
> ------------------------------
> *From:* Carlos Simmerling <carlos.simmerling.stonybrook.edu>
> *Sent:* Monday, June 24, 2024 8:33:57 PM
> *To:* MIRA JHAWAR <j.mira.iitg.ac.in>; AMBER Mailing List <amber.ambermd.org>
> *Subject:* Re: [AMBER] MPI_ABORT was invoked on rank 0 in communicator
> MPI_COMM_WORLD with errorcode 1
>
> You don't say which Amber version you used, but see, for example,
> section 25.3.4 of the Amber 24 manual. A groupfile is required for REMD.
>
>
> On Mon, Jun 24, 2024 at 10:58 AM MIRA JHAWAR via AMBER <amber.ambermd.org>
> wrote:
>
> No, there was no error message in the mdout file.
>
> ________________________________
> From: Daniel Roe <daniel.r.roe.gmail.com>
> Sent: Monday, June 24, 2024 7:33:24 PM
> To: MIRA JHAWAR <j.mira.iitg.ac.in>; AMBER Mailing List <amber.ambermd.org>
> Subject: Re: [AMBER] MPI_ABORT was invoked on rank 0 in communicator
> MPI_COMM_WORLD with errorcode 1
>
> Hi,
>
> Were there any other error messages? Was anything written to your MDOUT
> file?
>
> -Dan
>
> On Mon, Jun 24, 2024 at 10:00 AM MIRA JHAWAR via AMBER
> <amber.ambermd.org> wrote:
> >
> > Dear all,
> > I am trying to run REMD in NVT with the following input file using
> > sander.MPI:
> >
> >
> > Equilibration
> > &cntrl
> > imin=0,
> > irest=1, ntx=5,
> > nstlim=100000, dt=0.002,
> > ntt=3, gamma_ln=1.0,
> > tempi=XXXXX,
> > temp0=XXXXX, ig=RANDOM_NUMBER,
> > ntc=2, ntf=2, nscm=1000,
> > ntb=1,
> > cut=10.0, rgbmax=999.0,
> > ntpr=1000, ntwx=1000, ntwr=100000,
> > nmropt=1,
> > numexchg=1000,
> > /
> > &wt TYPE='END'
> > /
> > DISANG=work_chir.dat
> >
> > But the program is terminating with the following error:
> >
> >
> > MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
> > with errorcode 1.
> >
> > NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
> > You may or may not see output from other processes, depending on
> > exactly when Open MPI kills them.
> >
> > If anyone knows how to solve this, please reply.
> >
> > Regards,
> > Mira Jhawar
> > Research Scholar
> > Department of Chemistry
> > IIT Guwahati
> >
> > _______________________________________________
> > AMBER mailing list
> > AMBER.ambermd.org
> > http://lists.ambermd.org/mailman/listinfo/amber
Received on Mon Jun 24 2024 - 08:30:02 PDT