Re: [AMBER] amber10 pmemd fail

From: Robert Duke <rduke.email.unc.edu>
Date: Fri, 11 Dec 2009 16:40:31 -0500

This could be a matter of the software or hardware installation for mpi; pmemd
itself is not the issue, other than that you may be linking against the wrong
things when building it. You need to clearly understand your mpi
implementation/installation, make sure you have the proper libraries built to
link pmemd against, get the config file right, and then move on to testing.
You probably want to run the mpi test suites first - they ship with your mpi
distribution. It is really hard for someone without complete access to your
machines to guess everything correctly.

There is an extensive README in the pmemd directory covering a lot of the
configuration issues, and the ambermd.org website has links to a lot of useful
discussion of the whole problem of getting mpi up and running. If I am
underestimating where you are in this process, I apologize.

The messages you are seeing I tend to associate with loose cables, dead
hardware, or mpi misconfiguration (more the first two than the last, if memory
serves); errno 110 is ETIMEDOUT, a network-level timeout, which fits that
picture. But I don't routinely administer anything other than the pile of
clustered workstations I use for dev and test. If you post your exact h/w and
s/w configuration, there may well be others on this mailing list who can help
you.
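
For what it is worth, before rebuilding anything I would first check that a
trivial mpi program survives on the same nodes over the same interconnect.
Something along the lines of the ring test sketched below is what I have in
mind - this is just my own minimal example, not anything shipped with amber or
mvapich, and the file name and compiler path are only guesses at your setup.
Build it with the mpicc wrapper from the same mvapich-1.1 tree you link pmemd
against, and launch it with the same mpirun command line and hostfile you use
for pmemd.

/* ring.c - minimal mpi sanity check: pass one integer around all ranks.
 * Sketch only; compile with the mvapich compiler wrapper, e.g.
 *   /wsu/arch/amd64/mvapich-1.1/bin/mpicc -o ring ring.c
 * (path assumed from your MPI_HOME), then run with your usual mpirun line.
 */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, size, token;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (size < 2) {
        printf("run this with at least 2 ranks\n");
    } else if (rank == 0) {
        token = 42;  /* arbitrary payload */
        MPI_Send(&token, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
        MPI_Recv(&token, 1, MPI_INT, size - 1, 0, MPI_COMM_WORLD, &status);
        printf("token made it around %d ranks\n", size);
    } else {
        MPI_Recv(&token, 1, MPI_INT, rank - 1, 0, MPI_COMM_WORLD, &status);
        MPI_Send(&token, 1, MPI_INT, (rank + 1) % size, 0, MPI_COMM_WORLD);
    }

    MPI_Finalize();
    return 0;
}

If that runs clean across all the nodes but pmemd still dies, then it is worth
looking harder at the config file and at exactly which mpi and mkl libraries
the pmemd binary ended up linked against.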
Regards - Bob Duke
----- Original Message -----
From: "Aragorn Steiger" <aragorn.wayne.edu>
To: <amber.ambermd.org>
Sent: Friday, December 11, 2009 3:35 PM
Subject: [AMBER] amber10 pmemd fail


>
> Hello everyone,
>
> I would like a bit of advice. I have compiled pmemd for a researcher here
> at Wayne State. When we run it, it dies at varying points, and we get
> various errors from mpirun such as:
>
> p8_15722: (1054.988281) net_recv failed for fd = 9
> p8_15722: p4_error: net_recv read, errno = : 110
> p14_21538: p4_error: Found a dead connection while looking for messages: 9
>
> I am using various libraries from MKL, ifort, and mvapich. The config file
> that I am using is below:
>
> IFORT_RPATH = /wsu/arch/amd64/compilers/intel/ifort-11.1.056/lib/intel64:/wsu/arch/amd64/lib/l_mkl_p_10.2.2.025/lib/em64t:/wsu/arch/amd64/mvapich-1.1/lib:/wsu/arch/amd64/lib/l_mkl_p_10.2.2.025:/usr/local/lib:/usr/local/lib:/wsu/usr/local/lib:/usr/local/pgsql/lib
> MATH_DEFINES = -DMKL
> MATH_LIBS = /wsu/arch/amd64/intel/mkl/10.0.1.014/lib/em64t/libmkl_em64t.a -L/wsu/arch/amd64/lib/l_mkl_p_10.2.2.025/lib/em64t -lguide -lpthread
> FFT_DEFINES = -DPUBFFT
> FFT_INCLUDE =
> FFT_LIBS =
> NETCDF_HOME =
> NETCDF_DEFINES =
> NETCDF_MOD =
> NETCDF_LIBS =
> MPI_HOME = /wsu/arch/amd64/mvapich-1.1
> MPI_LIBDIR2 = /wsu/lib64
> MPI_DEFINES = -DMPI
> MPI_INCLUDE = -I$(MPI_HOME)/include
> MPI_LIBDIR = $(MPI_HOME)/lib
> MPI_LIBS = -L$(MPI_LIBDIR) -lmpich /usr/lib64/libibverbs.so.1 -lpthread
> DIRFRC_DEFINES = -DDIRFRC_EFS -DDIRFRC_COMTRANS -DDIRFRC_NOVEC
> CPP = /lib/cpp
> CPPFLAGS = -traditional -P
> F90_DEFINES = -DFFTLOADBAL_2PROC
>
> F90 = ifort
> MODULE_SUFFIX = mod
> F90FLAGS = -c -auto
> F90_OPT_DBG = -g -traceback
> F90_OPT_LO = -O0
> F90_OPT_MED = -O2
> F90_OPT_HI = -xP -ip -O3
> F90_OPT_DFLT = $(F90_OPT_HI)
>
> CC = gcc
> CFLAGS =
>
> LOAD = ifort
> LOADFLAGS =
> LOADLIBS = -limf -lsvml -Wl,-rpath=$(IFORT_RPATH)
>
> Could someone suggest a course of action?
>
> Thank You,
>
> Aragorn Steiger
> Senior Systems Software Engineer
> Wayne State University
> Computing & Information Technology
> Scientific Computing
> (313) 577-9601
>



_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Fri Dec 11 2009 - 14:00:02 PST