Re: [AMBER] Problems with pmemd.cuda.mpi (again!)

From: Jason Swails <jason.swails.gmail.com>
Date: Sun, 25 Mar 2012 11:24:26 -0400

OK, I think I know what's happening. I'm guessing that you're using
Ubuntu, correct? If so, the default /bin/sh on Ubuntu is actually dash,
not bash (most other OSes use bash).
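You can check quickly with:

  ls -l /bin/sh

On a stock Ubuntu install that shows a symlink to dash.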

The problem with test_amber_cuda_parallel.sh (and, I think, test_amber_cuda.sh
as well) is that lines 50 and 57 use bash-isms that dash does not recognize.
As a result, the script never sets the precision model, which is why it is
looking for pmemd.cuda_.MPI (which doesn't exist) instead of
pmemd.cuda_SPDP.MPI (which does).
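
Roughly speaking, the offending construct is something of this form (a sketch
only, not the actual lines from the Amber test script):

  #!/bin/sh
  prec="$1"
  # bash accepts '==' inside [ ]; POSIX sh (and therefore dash) only knows
  # '='. Under dash the test prints "[: unexpected operator", evaluates as
  # false, and PREC_MODEL is never set.
  if [ "$prec" == "SPDP" ]; then
      PREC_MODEL="SPDP"
  fi
  echo "PREC_MODEL=$PREC_MODEL"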

The easiest thing to do is to just change the top line of
test_amber_cuda_parallel.sh to read:

#!/bin/bash

instead of

#!/bin/sh

and see if that works.

However, I still expect most of the tests to fail unless you really did
recompile with the MPICH2 compiler wrappers. That doesn't mean turning
mpif90 and mpicc into mpif90.mpich2 and mpicc.mpich2; it means changing
"mpif90" to "/path/to/mpich2/bin/mpif90" and "mpicc" to
"/path/to/mpich2/bin/mpicc", where /path/to/mpich2/bin is the directory
that contains the MPICH2 mpif90 and mpicc wrappers.
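
For example (a sketch only -- /path/to/mpich2/bin is a placeholder for
wherever MPICH2 is actually installed on your machine), the relevant lines
of config.h would end up looking something like:

  CC=/path/to/mpich2/bin/mpicc
  FC=/path/to/mpich2/bin/mpif90
  PMEMD_CC=/path/to/mpich2/bin/mpicc
  PMEMD_F90=/path/to/mpich2/bin/mpif90
  PMEMD_LD=/path/to/mpich2/bin/mpif90

A quick sanity check is "which mpif90" and "which mpicc"; if those don't
resolve to the MPICH2 directory, a bare "mpif90" in config.h will keep
picking up whatever other MPI happens to be first in your PATH. After
changing the paths you need to rebuild pmemd.cuda.MPI so the new wrappers
are actually used.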

HTH,
Jason

On Sun, Mar 25, 2012 at 10:34 AM, Adam Jion <adamjion.yahoo.com> wrote:

> Hi,
>
> Thanks for the reply. I did as you suggested but still got the same error.
> I'm still unable to test pmemd.cuda.mpi.
>
> Regards,
> Adam
>
>
> Error Log:
> adam.adam-MS-7750:~/amber11/test$ export DO_PARALLEL='mpirun -np 2'
> adam.adam-MS-7750:~/amber11/test$ make test.cuda.parallel
> (find . -name '*.dif' -o -name 'profile_mpi' | \
> while read dif ;\
> do \
> rm -f $dif ;\
> done ;\
> )
> rm -f TEST_FAILURES.diff
> ./test_amber_cuda_parallel.sh
> [: 50: unexpected operator
> [: 57: unexpected operator
> make[1]: Entering directory `/home/adam/amber11/test'
> cd cuda && make -k test.pmemd.cuda.MPI GPU_ID= PREC_MODEL=
> make[2]: Entering directory `/home/adam/amber11/test/cuda'
> ------------------------------------
> Running CUDA Implicit solvent tests.
> Precision Model =
> GPU_ID =
> ------------------------------------
> cd trpcage/ && ./Run_md_trpcage netcdf.mod
> [proxy:0:0.adam-MS-7750] HYDU_create_process
> (./utils/launch/launch.c:69): execvp error on file
> ../../../bin/pmemd.cuda_.MPI (No such file or directory)
> [proxy:0:0.adam-MS-7750] HYDU_create_process
> (./utils/launch/launch.c:69): execvp error on file
> ../../../bin/pmemd.cuda_.MPI (No such file or directory)
> ./Run_md_trpcage: Program error
> make[2]: *** [test.pmemd.cuda.gb] Error 1
> ------------------------------------
> Running CUDA Explicit solvent tests.
> Precision Model =
> GPU_ID =
> ------------------------------------
> cd 4096wat/ && ./Run.pure_wat netcdf.mod
> [proxy:0:0.adam-MS-7750] HYDU_create_process
> (./utils/launch/launch.c:69): execvp error on file
> ../../../bin/pmemd.cuda_.MPI (No such file or directory)
> [proxy:0:0.adam-MS-7750] HYDU_create_process
> (./utils/launch/launch.c:69): execvp error on file
> ../../../bin/pmemd.cuda_.MPI (No such file or directory)
> ./Run.pure_wat: Program error
> make[2]: *** [test.pmemd.cuda.pme] Error 1
> make[2]: Target `test.pmemd.cuda.MPI' not remade because of errors.
> make[2]: Leaving directory `/home/adam/amber11/test/cuda'
> make[1]: *** [test.pmemd.cuda.MPI] Error 2
> make[1]: Target `test.parallel.cuda' not remade because of errors.
> make[1]: Leaving directory `/home/adam/amber11/test'
> 0 file comparisons passed
> 0 file comparisons failed
> 11 tests experienced errors
> Test log file saved as
> logs/test_amber_cuda_parallel/2012-03-25_22-31-20.log
> No test diffs to save!
>
>
>
>
> ________________________________
> From: Ross Walker <rosscwalker.gmail.com>
> To: Adam Jion <adamjion.yahoo.com>; AMBER Mailing List <amber.ambermd.org>
> Sent: Sunday, March 25, 2012 10:03 PM
> Subject: Re: [AMBER] Problems with pmemd.cuda.mpi (again!)
>
> Hi Adam,
>
> The scripts are not designed to be used directly.
>
> Do
>
> export DO_PARALLEL='mpirun -np 2'
>
> make test.cuda.parallel
>
> All the best
> Ross
>
>
>
> On Mar 25, 2012, at 4:51, Adam Jion <adamjion.yahoo.com> wrote:
>
> > Hi all,
> >
> > I'm having a nightmare making Amber11 fully functional.
> > Is anyone able to help?
> >
> >
> > I managed to compile, install, and test the parallel version of Amber 11.
> > I'm also able to compile, install, and test the serial version of
> > Amber11-GPU (i.e. pmemd.cuda).
> >
> > After installing mpich2, I am able to compile and install Amber 11-MultiGPU
> > (i.e. pmemd.cuda.mpi). All this was done without tweaking the config.h file.
> >
> > However, for some reason, I cannot run tests on pmemd.cuda.mpi.
> > Here's the error log:
> >
> > adam.adam-MS-7750:~/amber11/test$ ./test_amber_cuda_parallel.sh
> > [: 50: unexpected operator
> > [: 57: unexpected operator
> > cd cuda && make -k test.pmemd.cuda.MPI GPU_ID= PREC_MODEL=
> > make[1]: Entering directory `/home/adam/amber11/test/cuda'
> > ------------------------------------
> > Running CUDA Implicit solvent tests.
> > Precision Model =
> > GPU_ID =
> > ------------------------------------
> > cd trpcage/ && ./Run_md_trpcage netcdf.mod
> > [proxy:0:0.adam-MS-7750] HYDU_create_process
> (./utils/launch/launch.c:69): execvp error on file
> ../../../bin/pmemd.cuda_.MPI (No such file or directory)
> > [proxy:0:0.adam-MS-7750] HYDU_create_process
> (./utils/launch/launch.c:69): execvp error on file
> ../../../bin/pmemd.cuda_.MPI (No such file or directory)
> > ./Run_md_trpcage: Program error
> > make[1]: *** [test.pmemd.cuda.gb] Error 1
> > ------------------------------------
> > Running CUDA Explicit solvent tests.
> > Precision Model =
> > GPU_ID =
> > ------------------------------------
> > cd 4096wat/ && ./Run.pure_wat netcdf.mod
> > [proxy:0:0.adam-MS-7750] HYDU_create_process
> (./utils/launch/launch.c:69): execvp error on file
> ../../../bin/pmemd.cuda_.MPI (No such file or directory)
> > [proxy:0:0.adam-MS-7750] HYDU_create_process
> (./utils/launch/launch.c:69): execvp error on file
> ../../../bin/pmemd.cuda_.MPI (No such file or directory)
> > ./Run.pure_wat: Program error
> > make[1]: *** [test.pmemd.cuda.pme] Error 1
> > make[1]: Target `test.pmemd.cuda.MPI' not remade because of errors.
> > make[1]: Leaving directory `/home/adam/amber11/test/cuda'
> > make: *** [test.pmemd.cuda.MPI] Error 2
> > make: Target `test.parallel.cuda' not remade because of errors.
> > 0 file comparisons passed
> > 0 file comparisons failed
> > 11 tests experienced errors
> > Test log file saved as
> logs/test_amber_cuda_parallel/2012-03-25_19-35-37.log
> > No test diffs to save!
> >
> > Appreciate any help,
> > Adam
> >
> > ps. Using compilers gcc-4.4.6, gfortran-4.4.6, mpicc, mpif90
> > pps. The config.h file is given below:
> >
> > #MODIFIED FOR AMBERTOOLS 1.5
> > # Amber configuration file, created with: ./configure -cuda -mpi gnu
> >
> >
> ###############################################################################
> >
> > # (1) Location of the installation
> >
> > BINDIR=/home/adam/amber11/bin
> > LIBDIR=/home/adam/amber11/lib
> > INCDIR=/home/adam/amber11/include
> > DATDIR=/home/adam/amber11/dat
> >
> >
> ###############################################################################
> >
> >
> > # (2) If you want to search additional libraries by default, add them
> > #     to the FLIBS variable here. (External libraries can also be linked into
> > #     NAB programs simply by including them on the command line; libraries
> > #     included in FLIBS are always searched.)
> >
> > FLIBS= -L$(LIBDIR) -lsff_mpi -lpbsa $(LIBDIR)/arpack.a $(LIBDIR)/lapack.a $(LIBDIR)/blas.a $(LIBDIR)/libnetcdf.a -lgfortran
> > FLIBS_PTRAJ= $(LIBDIR)/arpack.a $(LIBDIR)/lapack.a $(LIBDIR)/blas.a -lgfortran
> > FLIBSF= $(LIBDIR)/arpack.a $(LIBDIR)/lapack.a $(LIBDIR)/blas.a
> > FLIBS_FFTW2=-L$(LIBDIR)
> >
> > ###############################################################################
> >
> > # (3) Modify any of the following if you need to change, e.g. to use gcc
> > #     rather than cc, etc.
> >
> > SHELL=/bin/sh
> > INSTALLTYPE=cuda_parallel
> >
> > # Set the C compiler, etc.
> >
> > # For GNU: CC-->gcc; LEX-->flex; YACC-->bison -y -t;
> > # Note: If your lexer is "really" flex, you need to set
> > # LEX=flex below. For example, on many linux distributions,
> > # /usr/bin/lex is really just a pointer to /usr/bin/flex,
> > # so LEX=flex is necessary. In general, gcc seems to need flex.
> >
> > # The compiler flags CFLAGS and CXXFLAGS should always be used.
> > # By contrast, *OPTFLAGS and *NOOPTFLAGS will only be used with
> > # certain files, and usually at compile-time but not link-time.
> > # Where *OPTFLAGS and *NOOPTFLAGS are requested (in Makefiles,
> > # makedepend and depend), they should come before CFLAGS or
> > # CXXFLAGS; this allows the user to override *OPTFLAGS and
> > # *NOOPTFLAGS using the BUILDFLAGS variable.
> > CC=mpicc
> > CFLAGS= -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -DBINTRAJ -DMPI $(CUSTOMBUILDFLAGS) $(AMBERCFLAGS)
> > OCFLAGS= $(COPTFLAGS) $(AMBERCFLAGS)
> > CNOOPTFLAGS=
> > COPTFLAGS=-O3 -mtune=generic -DBINTRAJ -DHASGZ -DHASBZ2
> > AMBERCFLAGS= $(AMBERBUILDFLAGS)
> >
> > CXX=g++
> > CPLUSPLUS=g++
> > CXXFLAGS= -DMPI $(CUSTOMBUILDFLAGS)
> > CXXNOOPTFLAGS=
> > CXXOPTFLAGS=-O3
> > AMBERCXXFLAGS= $(AMBERBUILDFLAGS)
> >
> > NABFLAGS=
> >
> > LDFLAGS= $(CUSTOMBUILDFLAGS) $(AMBERLDFLAGS)
> > AMBERLDFLAGS=$(AMBERBUILDFLAGS)
> >
> > LEX= flex
> > YACC= $(BINDIR)/yacc
> > AR= ar rv
> > M4= m4
> > RANLIB=ranlib
> >
> > # Set the C-preprocessor. Code for a small preprocessor is in
> > # ucpp-1.3; it gets installed as $(BINDIR)/ucpp;
> > # this can generally be used (maybe not on 64-bit machines like altix).
> >
> > CPP= $(BINDIR)/ucpp -l
> >
> > # These variables control whether we will use compiled versions of BLAS
> > # and LAPACK (which are generally slower), or whether those libraries are
> > # already available (presumably in an optimized form).
> >
> > LAPACK=install
> > BLAS=install
> > F2C=skip
> >
> > # These variables determine whether builtin versions of certain components
> > # can be used, or whether we need to compile our own versions.
> >
> > UCPP=install
> > C9XCOMPLEX=skip
> >
> > # For Windows/cygwin, set SFX to ".exe"; for Unix/Linux leave it empty:
> > # Set OBJSFX to ".obj" instead of ".o" on Windows:
> >
> > SFX=
> > OSFX=.o
> > MV=mv
> > RM=rm
> > CP=cp
> >
> > # Information about Fortran compilation:
> >
> > FC=mpif90
> > FFLAGS= $(LOCALFLAGS) $(CUSTOMBUILDFLAGS) $(FNOOPTFLAGS)
> > FNOOPTFLAGS= -O0
> > FOPTFLAGS= -O3 -mtune=generic $(LOCALFLAGS) $(CUSTOMBUILDFLAGS)
> > AMBERFFLAGS=$(AMBERBUILDFLAGS)
> > FREEFORMAT_FLAG= -ffree-form
> > LM=-lm
> > FPP=cpp -traditional $(FPPFLAGS) $(AMBERFPPFLAGS)
> > FPPFLAGS=-P -DBINTRAJ -DMPI $(CUSTOMBUILDFLAGS)
> > AMBERFPPFLAGS=$(AMBERBUILDFLAGS)
> >
> >
> > BUILD_SLEAP=install_sleap
> > XHOME=
> > XLIBS= -L/lib64 -L/lib
> > MAKE_XLEAP=skip_xleap
> >
> > NETCDF=netcdf.mod
> > NETCDFLIB=$(LIBDIR)/libnetcdf.a
> > PNETCDF=yes
> > PNETCDFLIB=$(LIBDIR)/libpnetcdf.a
> >
> > ZLIB=-lz
> > BZLIB=-lbz2
> >
> > HASFC=yes
> > MDGX=yes
> > CPPTRAJ=yes
> > MTKPP=
> >
> > COMPILER=gnu
> > MKL=
> > MKL_PROCESSOR=
> >
> > #CUDA Specific build flags
> > NVCC=$(CUDA_HOME)/bin/nvcc -use_fast_math -O3 -gencode arch=compute_13,code=sm_13 -gencode arch=compute_20,code=sm_20
> > PMEMD_CU_INCLUDES=-I$(CUDA_HOME)/include -IB40C -IB40C/KernelCommon -I/usr/include
> > PMEMD_CU_LIBS=-L$(CUDA_HOME)/lib64 -L$(CUDA_HOME)/lib -lcurand -lcufft -lcudart ./cuda/cuda.a
> > PMEMD_CU_DEFINES=-DCUDA -DMPI -DMPICH_IGNORE_CXX_SEEK
> >
> > #PMEMD Specific build flags
> > PMEMD_FPP=cpp -traditional -DMPI -P -DBINTRAJ -DDIRFRC_EFS -DDIRFRC_COMTRANS -DDIRFRC_NOVEC -DFFTLOADBAL_2PROC -DPUBFFT
> > PMEMD_NETCDFLIB= $(NETCDFLIB)
> > PMEMD_F90=mpif90
> > PMEMD_FOPTFLAGS=-O3 -mtune=generic
> > PMEMD_CC=mpicc
> > PMEMD_COPTFLAGS=-O3 -mtune=generic -DMPICH_IGNORE_CXX_SEEK -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -DBINTRAJ -DMPI
> > PMEMD_FLIBSF=
> > PMEMD_LD= mpif90
> > LDOUT= -o
> >
> > #3D-RISM MPI
> > RISMSFF=
> > SANDER_RISM_MPI=sander.RISM.MPI$(SFX)
> > TESTRISM=
> >
> > #PUPIL
> > PUPILLIBS=-lrt -lm -lc -L${PUPIL_PATH}/lib -lPUPIL -lPUPILBlind
> >
> > #Python
> > PYINSTALL=
> >
>



-- 
Jason M. Swails
Quantum Theory Project,
University of Florida
Ph.D. Candidate
352-392-4032
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber