Re: [AMBER] Output problem on Amber 9

From: Seren Soner <seren.soner.gmail.com>
Date: Sat, 30 Oct 2010 23:23:41 +0300

That clearly is the reason, but I have now decided to switch to OpenMPI
entirely, and I can't get it to build.
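
For anyone following along: the mismatch Jason describes in the quoted
message below (an mpirun from one MPI launching a binary built with another)
can be checked by comparing where the launcher on PATH lives with the prefix
of the MPI used for the build. A minimal sketch, assuming a POSIX shell; the
function name launcher_matches_prefix and the /opt/openmpi prefix in the
example are illustrative, not taken from Amber or OpenMPI:

```shell
#!/bin/sh
# Hypothetical helper: succeed if the launcher found first on PATH lives in
# the bin directory of the given MPI install prefix.
launcher_matches_prefix() {
    launcher=$1
    prefix=$2
    # Resolve the launcher through PATH; return 2 if it is not found at all.
    found=$(command -v "$launcher") || return 2
    # Compare the launcher's directory with <prefix>/bin (trailing / ignored).
    [ "$(dirname "$found")" = "${prefix%/}/bin" ]
}

# Example (the prefix is an assumption; use the MPI that pmemd was built with):
# launcher_matches_prefix mpirun /opt/openmpi || echo "mpirun is from another MPI"
```

The comparison is literal and does not resolve symlinks, so a symlinked
mpirun can be reported as a mismatch even when it is the right one.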

I compiled everything with "make serial", then compiled sander with ifort and
OpenMPI by modifying the following line:

FC= ifort -I/opt/openmpi/include

The only thing remaining is pmemd, but I can't compile it. I generate its
config.h with "./configure linux_em64t ifort openmpi".
Running "make" then results in:

parallel_dat.f90(92): error #5102: Cannot open include file 'mpif-common.h'
      include 'mpif-common.h'
--------------^
parallel_dat.f90(223): error #6404: This name does not have a type, and must
have an explicit type. [MPI_COMM_WORLD]
    call mpi_abort(mpi_comm_world, i, err_ret_code)
-------------------^
compilation aborted for parallel_dat.f90 (code 1)
make[1]: *** [parallel_dat.o] Error 1
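
Error #5102 only says that 'mpif-common.h' is not in any directory on
ifort's include search path. A rough sketch for finding the directory to pass
with -I, assuming a POSIX shell; the helper name find_mpi_include_dir is made
up for illustration, and /opt/openmpi is the prefix used earlier in this
message:

```shell
#!/bin/sh
# Hypothetical helper (not part of Amber or OpenMPI): print the directory
# under an MPI install prefix that contains a given Fortran include file.
find_mpi_include_dir() {
    prefix=$1
    header=${2:-mpif-common.h}
    # Take the first regular file with that name found under the prefix.
    match=$(find "$prefix" -name "$header" -type f 2>/dev/null | head -n 1)
    # Fail if nothing was found, otherwise print the containing directory.
    [ -n "$match" ] || return 1
    dirname "$match"
}

# Example (adjust the prefix to the actual install):
# find_mpi_include_dir /opt/openmpi
```

If the helper prints a directory, that is the path to add as -I<directory>
on the Fortran compiler line; if it fails, the header is not under that
prefix at all.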

So I modified the F90 line as well to include the OpenMPI headers, which
results in the following link error:

ifort -o pmemd gbl_constants.o gbl_datatypes.o state_info.o file_io_dat.o
parallel_dat.o mdin_ctrl_dat.o mdin_ewald_dat.o prmtop_dat.o inpcrd_dat.o
dynamics_dat.o img.o parallel.o pme_direct.o pme_recip.o pme_fft.o fft1d.o
bspline.o pme_force.o pbc.o nb_pairlist.o cit.o dynamics.o bonds.o angles.o
dihedrals.o runmd.o loadbal.o shake.o runmin.o constraints.o axis_optimize.o
gb_ene.o veclib.o gb_force.o timers.o pmemd_lib.o runfiles.o file_io.o
bintraj.o pmemd_clib.o pmemd.o random.o degcnt.o erfcfun.o nmr_calls.o
nmr_lib.o get_cmdline.o master_setup.o alltasks_setup.o pme_setup.o
ene_frc_splines.o nextprmtop_section.o -L/opt/openmpi//lib -pthread
-L/opt/openmpi/lib -lmpi -lopen-rte -lopen-pal -ldl -Wl,--export-dynamic
-lnsl -lutil -lm -ldl -limf -lsvml
-Wl,-rpath=/export/apps/intel/lib/intel64:/opt/gridengine/lib/lx26-amd64
/export/apps/intel/lib/intel64/libimf.so: warning: warning: feupdateenv is
not implemented and will always fail
parallel_dat.o: In function `parallel_dat_mod_mp_mexit_':
parallel_dat.f90:(.text+0x50): undefined reference to `mpi_group_free_'

After that, "undefined reference" errors for other mpi_* symbols follow for
a very large number of lines.
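
One possible reading of this link line: it names only -lmpi, -lopen-rte and
-lopen-pal, while symbols such as mpi_group_free_ (lowercase, trailing
underscore) are Fortran bindings, which OpenMPI keeps in separate libraries
whose names vary by release (libmpi_f77 and libmpi_f90 in older ones). A
small sketch that scans a link command for such a library; the function name
and the list of library names are assumptions for illustration:

```shell
#!/bin/sh
# Hypothetical helper: succeed if a link command names one of the libraries
# that OpenMPI has used for its Fortran bindings (names vary by release).
links_fortran_mpi() {
    # $1 is deliberately unquoted so the command splits into separate flags.
    for flag in $1; do
        case $flag in
            -lmpi_f77|-lmpi_f90|-lmpi_mpifh|-lmpi_usempi*) return 0 ;;
        esac
    done
    return 1
}

# MPI-related flags copied from the link line in this message.
posted="-L/opt/openmpi/lib -lmpi -lopen-rte -lopen-pal -ldl"
if links_fortran_mpi "$posted"; then
    echo "Fortran MPI library present"
else
    echo "no Fortran MPI library on the link line"
fi
```

With an OpenMPI install, "mpif90 --showme:link" prints the flags its Fortran
wrapper would add, which can be compared against the line above.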

Any idea how to solve this problem?

Thanks,
Seren

On Fri, Oct 29, 2010 at 8:25 PM, Jason Swails <jason.swails.gmail.com> wrote:

> On Fri, Oct 29, 2010 at 11:33 AM, Seren Soner <seren.soner.gmail.com>
> wrote:
>
> > Thanks for the quick replies :)
> >
> > But in fact, I have that row in the config.h for pmemd;
> > MPI_DEFINES = -DMPI
> >
> > I had used
> > ./configure -mpich2 ifort_x86_64;
> > make parallel
> >
> > in order to compile Amber.
> >
> > And I have never compiled Amber in serial, and besides, the compilation
> > times for pmemd and pmemd.MPI are exactly equal.
> >
> > Any other ideas ?
> >
>
> The last idea I have: the mpirun/mpiexec you're using is from a different
> MPI implementation than the one that you used to build pmemd. Paste the
> following two things here: the entire config.h you used to build pmemd (if
> you had to make modifications, send the modified version), and the results
> of running the command "which mpi****", where mpi**** is the mpirun/mpiexec
> program you used to actually run pmemd in parallel. If you specified a
> specific mpirun using a path, then provide the path of that mpirun.
>
> What I'm guessing here is that your computer has a "system" MPI build (e.g.
> Mac OS X comes pre-installed with a Fortran-disabled OpenMPI version), and
> you used your custom MPI build to compile Amber 9, but when you run "mpirun"
> without specifying a path, it's using the system-MPI version of mpirun,
> which isn't playing nice with your custom MPI implementation. I've seen
> this kind of behaviour when I've done this exact thing.
>
> Good luck!
> Jason
>
>
> >
> > On Fri, Oct 29, 2010 at 7:28 PM, Jason Swails <jason.swails.gmail.com> wrote:
> >
> > > Ross's observation is exactly what I would guess, as well. However, your
> > > email suggests that you're trying to use pmemd, not sander. In that case,
> > > it would appear as though pmemd was compiled without parallel support.
> > > What does the config.h file in $AMBERHOME/src/pmemd/ look like? Do you
> > > see a "-DMPI" in the MPI_DEFINES variable?
> > >
> > > Better yet, do you know the configure command used to configure pmemd?
> > > Also, remember that if you first build pmemd in parallel followed by a
> > > serial build, both executables are named pmemd, so the serial version will
> > > overwrite the parallel version. This has been changed in amber11, where
> > > the default build installs both pmemd and pmemd.MPI, similar to what is
> > > done with sander.
> > >
> > > All the best,
> > > Jason
> > >
> > > On Fri, Oct 29, 2010 at 12:21 PM, Ross Walker <ross.rosswalker.co.uk>
> > > wrote:
> > >
> > > > Hi Seren,
> > > >
> > > > It definitely looks to me like you are running the serial version of
> > > > sander but with mpirun. Make sure you are using a parallel version of
> > > > the code. Make sure the executable given to mpirun is sander.MPI and
> > > > NOT just sander.
> > > >
> > > > All the best
> > > > Ross
> > > >
> > > > > -----Original Message-----
> > > > > From: Seren Soner [mailto:seren.soner.gmail.com]
> > > > > Sent: Friday, October 29, 2010 9:10 AM
> > > > > To: AMBER Mailing List
> > > > > Subject: Re: [AMBER] Output problem on Amber 9
> > > > >
> > > > > Thanks for the reply Adrian.
> > > > >
> > > > > But I'm afraid that isn't the case. In fact, I just reinstalled AMBER
> > > > > on a clean and fresh server using MPICH2.
> > > > >
> > > > > The tests all seem just fine, and the run keeps going, but the
> > > > > coordinate file is corrupt (according to ptraj), and the .out file
> > > > > from pmemd is as follows:
> > > > >
> > > > > NSTEP = 5005 TIME(PS) = 10.010 TEMP(K) = 293
> > > > > NSTEP = 5020 TIME(PS) = 10.040 TEMP(K) = 292.63 PRESS = 274.9
> > > > > Etot = -206080.1422 EKtot = 50018.8412 EPtot = -256098.9834
> > > > > BOND = 1158.7574 ANGLE = 3226.4692 DIHED = 4959.5549
> > > > > 1-4 NB = 1696.6991 1-4 EEL = 21230.7542 VDWAALS = 32958.6284
> > > > > EELEC = -321329.8466 EHBOND = 0.0000 RESTRAINT = 0.0000
> > > > > EKCMT = 22722.9567 VIRIAL = 17729.8009 VOLUME = 841143.0522
> > > > > Density = 1.0138
> > > > > Ewald error estimate: 0.5244E-04
> > > > > --------------------
> > > > > NSTEP = 5010 TIME(PS) = 10.020 TEMP(K) = 293
> > > > > NSTEP = 5025 TIME(PS) = 10.050 TEMP(K) = 293.83 PRESS = 220.5
> > > > > Etot = -206067.1820 EKtot = 50223.4622 EPtot = -256290.6442
> > > > > BOND = 1143.5592 ANGLE = 3270.5782 DIHED = 4942.4778
> > > > > 1-4 NB = 1674.3206 1-4 EEL = 21252.6748 VDWAALS = 32893.7176
> > > > > EELEC = -321467.9724 EHBOND = 0.0000 RESTRAINT = 0.0000
> > > > > EKCMT = 22820.5212 VIRIAL = 18815.7936 VOLUME = 841238.7879
> > > > > Density = 1.0137
> > > > > Ewald error estimate: 0.4236E-04
> > > > > --------------------
> > > > > NSTEP = 5015 TIME(PS) = 10.030 TEMP(K) = 292
> > > > > NSTEP = 5030 TIME(PS) = 10.060 TEMP(K) = 293.63 PRESS = 162.4
> > > > > Etot = -206054.1128 EKtot = 50189.3052 EPtot = -256243.4180
> > > > > BOND = 1095.1010 ANGLE = 3234.5279 DIHED = 4948.3120
> > > > > 1-4 NB = 1664.1808 1-4 EEL = 21271.6600 VDWAALS = 32890.3861
> > > > > EELEC = -321347.5860 EHBOND = 0.0000 RESTRAINT = 0.0000
> > > > > EKCMT = 22884.4044 VIRIAL = 19934.3426 VOLUME = 841311.4130
> > > > > Density = 1.0136
> > > > > Ewald error estimate: 0.6402E-04
> > > > > --------------------
> > > > > NSTEP = 5020 TIME(PS) = 10.040 TEMP(K) = 292
> > > > > NSTEP = 5035 TIME(PS) = 10.070 TEMP(K) = 292.97 PRESS = 108.3
> > > > > Etot = -206041.0229 EKtot = 50075.8387 EPtot = -256116.8616
> > > > > BOND = 1188.6496 ANGLE = 3253.7780 DIHED = 4958.3474
> > > > > 1-4 NB = 1657.4676 1-4 EEL = 21231.7085 VDWAALS = 32913.2619
> > > > > EELEC = -321320.0746 EHBOND = 0.0000 RESTRAINT = 0.0000
> > > > > EKCMT = 22908.5928 VIRIAL = 20940.4972 VOLUME = 841363.6741
> > > > > Density = 1.0136
> > > > > Ewald error estimate: 0.9973E-04
> > > > > --------------------
> > > > >
> > > > >
> > > > > There seems to be a synchronization problem, but I don't have any
> > > > > idea what causes it, or what can solve it.
> > > > >
> > > > > Thanks,
> > > > > Seren Soner
> > > > >
> > > > > On Mon, Oct 18, 2010 at 1:40 PM, Adrian Roitberg
> > > > > <roitberg.qtp.ufl.edu> wrote:
> > > > >
> > > > > > I have seen this before, and it is usually due to more than one
> > > > > > instance of amber trying to write to the same files. Check that
> > > > > > there is only one set of amber runs active in that directory.
> > > > > >
> > > > > > Adrian
> > > > > >
> > > > > >
> > > > > > On 10/18/10 11:16 AM, Seren Soner wrote:
> > > > > > > Hello there,
> > > > > > >
> > > > > > > I've been using AMBER on our server for some time, and after the
> > > > > > > server crash a few weeks back, my output files have been acting
> > > > > > > weird. My computer administrator tells me nothing has been changed
> > > > > > > after the server crash, and the output file now looks as follows:
> > > > > > >
> > > > > > >
> > > > > > > NSTEP = 219965 TIME(PS) = 12939.930 TEMP(K) = 299
> > > > > > > NSTEP = 220080 TIME(PS) = 12940.160 TEMP(K) = 299.63 PRESS = 79.6
> > > > > > > Etot = -203606.8511 EKtot = 51214.7469 EPtot = -254821.5980
> > > > > > > BOND = 1362.7033 ANGLE = 3547.3148 DIHED = 4862.2996
> > > > > > > 1-4 NB = 1683.6634 1-4 EEL = 21484.6416 VDWAALS = 32232.8437
> > > > > > > EELEC = -319995.0644 EHBOND = 0.0000 RESTRAINT = 0.0000
> > > > > > > EKCMT = 23099.5614 VIRIAL = 21648.8530 VOLUME = 844163.5532
> > > > > > >
> > > > > > > Or, sometimes I may see NSTEP = 219000 information on one line,
> > > > > > > and then NSTEP = 218990, NSTEP = 218995, and NSTEP = 219000 again.
> > > > > > >
> > > > > > > Does anyone have any idea what may cause this problem ?
> > > > > > >
> > > > > > > _______________________________________________
> > > > > > > AMBER mailing list
> > > > > > > AMBER.ambermd.org
> > > > > > > http://lists.ambermd.org/mailman/listinfo/amber
> > > > > > >
> > > > > >
> > > > > > --
> > > > > > Dr. Adrian E. Roitberg
> > > > > > Associate Professor
> > > > > > Quantum Theory Project, Department of Chemistry
> > > > > > University of Florida
> > > > > >
> > > > > > Senior Editor. Journal of Physical Chemistry.
> > > > > >
> > > > > > on Sabbatical in Barcelona until August 2011.
> > > > > > Email roitberg.ufl.edu
> > > > > >
> > > > > >
> > > >
> > > >
> > > >
> > >
> > >
> > >
> > > --
> > > Jason M. Swails
> > > Quantum Theory Project,
> > > University of Florida
> > > Ph.D. Graduate Student
> > > 352-392-4032
> > >
> >
>
>
>
> --
> Jason M. Swails
> Quantum Theory Project,
> University of Florida
> Ph.D. Graduate Student
> 352-392-4032
>
Received on Sat Oct 30 2010 - 13:30:05 PDT