Re: [AMBER] Output problem on Amber 9

From: Jason Swails <jason.swails.gmail.com>
Date: Fri, 29 Oct 2010 12:25:08 -0500

On Fri, Oct 29, 2010 at 11:33 AM, Seren Soner <seren.soner.gmail.com> wrote:

> Thanks for the quick replies :)
>
> But in fact, I have that row in the config.h for pmemd;
> MPI_DEFINES = -DMPI
>
> I had used
> ./configure -mpich2 ifort_x86_64;
> make parallel
>
> in order to compile Amber.
>
> And I have never compiled Amber in serial, and besides, the compilation times
> for pmemd and pmemd.MPI are exactly equal.
>
> Any other ideas ?
>

The last idea I have: the mpirun/mpiexec you're using is from a different
MPI implementation than the one that you used to build pmemd. Paste the
following two things here: the entire config.h you used to build pmemd (if
you had to make modifications, send the modified version), and the results
of running the command "which mpi****", where mpi**** is the mpirun/mpiexec
program you used to actually run pmemd in parallel. If you specified a
specific mpirun using a path, then provide the path of that mpirun.
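
Something along these lines would collect both pieces of information in one
go. It is only a rough sketch: the helper name show_mpi_launchers is made up
for illustration, and the config.h location is the $AMBERHOME/src/pmemd/ path
mentioned earlier in this thread, so adjust it if your tree differs.

```shell
# Print where each MPI launcher on the PATH resolves to.
# (show_mpi_launchers is a hypothetical helper, not part of Amber or any MPI.)
show_mpi_launchers() {
    for prog in mpirun mpiexec; do
        if path=$(command -v "$prog" 2>/dev/null); then
            echo "$prog -> $path"
        else
            echo "$prog -> not found on PATH"
        fi
    done
}

show_mpi_launchers

# Show the MPI-related lines of the pmemd config.h, e.g. MPI_DEFINES.
config="${AMBERHOME:-.}/src/pmemd/config.h"
if [ -f "$config" ]; then
    grep -n 'MPI' "$config" || echo "no MPI lines found in $config"
else
    echo "no config.h at $config"
fi
```

The grep only shows the MPI lines as a quick check; the full config.h is still
the more useful thing to paste.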

What I'm guessing here is that your computer has a "system" MPI build (e.g.
Mac OS X comes pre-installed with a Fortran-disabled OpenMPI version), and
you used your custom MPI build to compile Amber 9, but when you run "mpirun"
without specifying a path, it's using the system-MPI version of mpirun,
which isn't playing nice with your custom MPI implementation. I've seen
this kind of behaviour when I've done this exact thing.
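
One rough way to spot this kind of mismatch is to compare the directory the
launcher lives in with the directory of the MPI installation used for the
build. The sketch below assumes that installation's Fortran wrapper is called
mpif90 and is on your PATH; that is an assumption about your setup, not
something shown in this thread. If you built against a specific MPI directory
instead, compare against its bin directory by hand. The helper name
same_install_dir is made up for illustration.

```shell
# Succeeds when two executables live in the same directory, which is a
# rough hint (not proof) that they belong to the same MPI installation.
same_install_dir() {
    [ "$(dirname "$1")" = "$(dirname "$2")" ]
}

launcher=$(command -v mpirun || true)
wrapper=$(command -v mpif90 || true)   # assumption: wrapper of the MPI used for the build

if [ -z "$launcher" ] || [ -z "$wrapper" ]; then
    echo "mpirun or mpif90 not found on PATH"
elif same_install_dir "$launcher" "$wrapper"; then
    echo "mpirun and mpif90 both live in $(dirname "$launcher")"
else
    echo "mismatch: mpirun is $launcher but mpif90 is $wrapper"
    echo "try launching with the mpirun under $(dirname "$wrapper") instead"
fi
```

If the two directories differ, launching pmemd with the full path to the
custom MPI's mpirun is the simplest thing to try next.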

Good luck!
Jason


>
> On Fri, Oct 29, 2010 at 7:28 PM, Jason Swails <jason.swails.gmail.com> wrote:
>
> > Ross's observation is exactly what I would guess, as well. However, your
> > email suggests that you're trying to use pmemd, not sander. In that case,
> > it would appear as though pmemd was compiled without parallel support. What
> > does the config.h file in $AMBERHOME/src/pmemd/ look like? Do you see a
> > "-DMPI" in the MPI_DEFINES variable?
> >
> > Better yet, do you know the configure command used to configure pmemd?
> > Also, remember that if you first build pmemd in parallel followed by a
> > serial build, both executables are named pmemd, so the serial version will
> > overwrite the parallel version. This has been changed in amber11, where the
> > default build installs both pmemd and pmemd.MPI, similar to what is done
> > with sander.
> >
> > All the best,
> > Jason
> >
> > On Fri, Oct 29, 2010 at 12:21 PM, Ross Walker <ross.rosswalker.co.uk>
> > wrote:
> >
> > > Hi Seren,
> > >
> > > It definitely looks to me like you are running the serial version of sander
> > > but with mpirun. Make sure you are using a parallel version of the code.
> > > Make sure the executable given to mpirun is sander.MPI and NOT just sander.
> > >
> > > All the best
> > > Ross
> > >
> > > > -----Original Message-----
> > > > From: Seren Soner [mailto:seren.soner.gmail.com]
> > > > Sent: Friday, October 29, 2010 9:10 AM
> > > > To: AMBER Mailing List
> > > > Subject: Re: [AMBER] Output problem on Amber 9
> > > >
> > > > Thanks for the reply Adrian.
> > > >
> > > > But I'm afraid that isn't the case. In fact, I just reinstalled AMBER on a
> > > > clean and fresh server using MPICH2.
> > > >
> > > > The tests all seem just fine, and the run keeps going, but the coordinate
> > > > file is corrupt (according to ptraj), and the .out file from pmemd is as
> > > > follows:
> > > >
> > > > NSTEP = 5005 TIME(PS) = 10.010 TEMP(K) = 293
> > > > NSTEP = 5020 TIME(PS) = 10.040 TEMP(K) = 292.63 PRESS = 274.9
> > > > Etot = -206080.1422 EKtot = 50018.8412 EPtot = -256098.9834
> > > > BOND = 1158.7574 ANGLE = 3226.4692 DIHED = 4959.5549
> > > > 1-4 NB = 1696.6991 1-4 EEL = 21230.7542 VDWAALS = 32958.6284
> > > > EELEC = -321329.8466 EHBOND = 0.0000 RESTRAINT = 0.0000
> > > > EKCMT = 22722.9567 VIRIAL = 17729.8009 VOLUME = 841143.0522
> > > > Density = 1.0138
> > > > Ewald error estimate: 0.5244E-04
> > > > --------------------
> > > > NSTEP = 5010 TIME(PS) = 10.020 TEMP(K) = 293
> > > > NSTEP = 5025 TIME(PS) = 10.050 TEMP(K) = 293.83 PRESS = 220.5
> > > > Etot = -206067.1820 EKtot = 50223.4622 EPtot = -256290.6442
> > > > BOND = 1143.5592 ANGLE = 3270.5782 DIHED = 4942.4778
> > > > 1-4 NB = 1674.3206 1-4 EEL = 21252.6748 VDWAALS = 32893.7176
> > > > EELEC = -321467.9724 EHBOND = 0.0000 RESTRAINT = 0.0000
> > > > EKCMT = 22820.5212 VIRIAL = 18815.7936 VOLUME = 841238.7879
> > > > Density = 1.0137
> > > > Ewald error estimate: 0.4236E-04
> > > > --------------------
> > > > NSTEP = 5015 TIME(PS) = 10.030 TEMP(K) = 292
> > > > NSTEP = 5030 TIME(PS) = 10.060 TEMP(K) = 293.63 PRESS = 162.4
> > > > Etot = -206054.1128 EKtot = 50189.3052 EPtot = -256243.4180
> > > > BOND = 1095.1010 ANGLE = 3234.5279 DIHED = 4948.3120
> > > > 1-4 NB = 1664.1808 1-4 EEL = 21271.6600 VDWAALS = 32890.3861
> > > > EELEC = -321347.5860 EHBOND = 0.0000 RESTRAINT = 0.0000
> > > > EKCMT = 22884.4044 VIRIAL = 19934.3426 VOLUME = 841311.4130
> > > > Density = 1.0136
> > > > Ewald error estimate: 0.6402E-04
> > > > --------------------
> > > > NSTEP = 5020 TIME(PS) = 10.040 TEMP(K) = 292
> > > > NSTEP = 5035 TIME(PS) = 10.070 TEMP(K) = 292.97 PRESS = 108.3
> > > > Etot = -206041.0229 EKtot = 50075.8387 EPtot = -256116.8616
> > > > BOND = 1188.6496 ANGLE = 3253.7780 DIHED = 4958.3474
> > > > 1-4 NB = 1657.4676 1-4 EEL = 21231.7085 VDWAALS = 32913.2619
> > > > EELEC = -321320.0746 EHBOND = 0.0000 RESTRAINT = 0.0000
> > > > EKCMT = 22908.5928 VIRIAL = 20940.4972 VOLUME = 841363.6741
> > > > Density = 1.0136
> > > > Ewald error estimate: 0.9973E-04
> > > > --------------------
> > > >
> > > >
> > > > There seems to be a problem of synchronization, but I don't have any idea
> > > > what causes it, or what can solve it.
> > > >
> > > > Thanks,
> > > > Seren Soner
> > > >
> > > > On Mon, Oct 18, 2010 at 1:40 PM, Adrian Roitberg
> > > > <roitberg.qtp.ufl.edu> wrote:
> > > >
> > > > > I have seen this before, and it is usually due to more than one instance
> > > > > of amber trying to write to the same files. Check that there is only one
> > > > > set of amber runs active in that directory.
> > > > >
> > > > > Adrian
> > > > >
> > > > >
> > > > > On 10/18/10 11:16 AM, Seren Soner wrote:
> > > > > > Hello there,
> > > > > >
> > > > > > I've been using AMBER on our server for some time, and after the
> > > > > > server crash a few weeks back, my output files have been acting weird.
> > > > > > My computer administrator tells me nothing has been changed after the
> > > > > > server crash, and the output file now looks as follows:
> > > > > >
> > > > > >
> > > > > > NSTEP = 219965 TIME(PS) = 12939.930 TEMP(K) = 299
> > > > > > NSTEP = 220080 TIME(PS) = 12940.160 TEMP(K) = 299.63 PRESS = 79.6
> > > > > > Etot = -203606.8511 EKtot = 51214.7469 EPtot = -254821.5980
> > > > > > BOND = 1362.7033 ANGLE = 3547.3148 DIHED = 4862.2996
> > > > > > 1-4 NB = 1683.6634 1-4 EEL = 21484.6416 VDWAALS = 32232.8437
> > > > > > EELEC = -319995.0644 EHBOND = 0.0000 RESTRAINT = 0.0000
> > > > > > EKCMT = 23099.5614 VIRIAL = 21648.8530 VOLUME = 844163.5532
> > > > > >
> > > > > > Or, sometimes I may see NSTEP = 219000 information on one line, and
> > > > > > then NSTEP = 218990, NSTEP = 218995, and NSTEP = 219000 again.
> > > > > >
> > > > > > Does anyone have any idea what may cause this problem?
> > > > > >
> > > > > > _______________________________________________
> > > > > > AMBER mailing list
> > > > > > AMBER.ambermd.org
> > > > > > http://lists.ambermd.org/mailman/listinfo/amber
> > > > > >
> > > > >
> > > > > --
> > > > > Dr. Adrian E. Roitberg
> > > > > Associate Professor
> > > > > Quantum Theory Project, Department of Chemistry
> > > > > University of Florida
> > > > >
> > > > > Senior Editor. Journal of Physical Chemistry.
> > > > >
> > > > > on Sabbatical in Barcelona until August 2011.
> > > > > Email roitberg.ufl.edu
> > > > >
> > >
> > >
> >
> >
> >
> > --
> > Jason M. Swails
> > Quantum Theory Project,
> > University of Florida
> > Ph.D. Graduate Student
> > 352-392-4032



-- 
Jason M. Swails
Quantum Theory Project,
University of Florida
Ph.D. Graduate Student
352-392-4032
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Fri Oct 29 2010 - 10:30:06 PDT