Re: [AMBER] Error Compiling pmemd.cuda.MPI (i.e. Multiple GPUs)

From: Jason Swails <jason.swails.gmail.com>
Date: Sat, 24 Mar 2012 22:33:37 -0400

This depends on your shell -- the instructions are in the Amber and
AmberTools manuals where DO_PARALLEL is described. For sh/bash, it's:

export DO_PARALLEL='mpirun -np 2'

For tcsh/csh, it's:

setenv DO_PARALLEL "mpirun -np 2"
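
For example, in bash the whole thing might look like this (just a sketch --
it assumes 2 GPUs, 2 MPI tasks, and that AMBERHOME is set; adjust -np to
match your hardware):

export DO_PARALLEL='mpirun -np 2'
cd $AMBERHOME/test
./test_amber_cuda_parallel.sh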

However, if you only changed the -I include path in config.h without
changing the mpif90/mpicc compilers used to build the rest of pmemd, I
would be surprised if it passes any tests. To fix this, make sure that
mpif90 and mpicc in config.h point to the MPICH2 versions, then do a
"make clean" and a fresh "make cuda_parallel".
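
For example (a rough sketch; /usr/local/mpich2 is only a placeholder -- point
MPI_HOME at wherever your MPICH2 is actually installed), one way to do this
is to put MPICH2 first on your PATH and rebuild:

export MPI_HOME=/usr/local/mpich2      # adjust to your MPICH2 location
export PATH=$MPI_HOME/bin:$PATH        # so mpif90/mpicc resolve to MPICH2
cd $AMBERHOME/src
make clean
make cuda_parallel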

HTH,
Jason

On Sat, Mar 24, 2012 at 8:45 PM, Adam Jion <adamjion.yahoo.com> wrote:

> Thanks Jason, you're right again!
> When I used the mpich2 directory, I was able to install pmemd.cuda.MPI.
> However, now I have problems testing pmemd.cuda.MPI.
> Here's my command line and error message:
>
> adam.adam-MS-7750:~/amber11/test$ ./test_amber_cuda_parallel.sh
> Error: DO_PARALLEL is not set! Set DO_PARALLEL and re-run the tests.
>
> How do I set DO_PARALLEL?
>
> As always, appreciative of your help :-)
> Adam
>
>
> ------------------------------
> *From:* Jason Swails <jason.swails.gmail.com>
> *To:* AMBER Mailing List <amber.ambermd.org>
> *Sent:* Sunday, March 25, 2012 8:15 AM
>
> *Subject:* Re: [AMBER] Error Compiling pmemd.cuda.MPI (i.e. Multiple GPUs)
>
> I'm not sure why my original post didn't go to the amber list.
>
> Which MPI implementation (and version) are you using? I'm guessing it's
> OpenMPI (something like 1.4.x). In that case, pmemd.cuda.MPI requires an
> MPI with full MPI-2.0 support (e.g. mpich2 or the latest versions of OpenMPI).
>
> So the answer is you'll have to switch to an MPI implementation with the
> MPI-2.0 support that pmemd.cuda.MPI needs.
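>
> If you're not sure which MPI your build is picking up, something like the
> following usually shows it (exact output differs between OpenMPI and MPICH2):
>
> which mpif90 mpirun
> mpirun --version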
>
> HTH,
> Jason
>
> On Sat, Mar 24, 2012 at 8:12 PM, Adam Jion <adamjion.yahoo.com> wrote:
>
> > Ok, I managed to point to the correct mpi.h file in config.h.
> > But a new problem has arisen.
> > Now I get the following error (related to undefined references):
> >
> > make[3]: Leaving directory `/home/adam/amber11/src/pmemd/src/cuda'
> > mpif90 -O3 -mtune=generic -DCUDA -DMPI -DMPICH_IGNORE_CXX_SEEK -o
> > pmemd.cuda.MPI gbl_constants.o gbl_datatypes.o state_info.o file_io_dat.o
> > mdin_ctrl_dat.o mdin_ewald_dat.o mdin_debugf_dat.o prmtop_dat.o
> > inpcrd_dat.o dynamics_dat.o img.o parallel_dat.o parallel.o gb_parallel.o
> > pme_direct.o pme_recip_dat.o pme_slab_recip.o pme_blk_recip.o
> > pme_slab_fft.o pme_blk_fft.o pme_fft_dat.o fft1d.o bspline.o pme_force.o
> > pbc.o nb_pairlist.o nb_exclusions.o cit.o dynamics.o bonds.o angles.o
> > dihedrals.o extra_pnts_nb14.o runmd.o loadbal.o shake.o prfs.o mol_list.o
> > runmin.o constraints.o axis_optimize.o gb_ene.o veclib.o gb_force.o
> > timers.o pmemd_lib.o runfiles.o file_io.o bintraj.o pmemd_clib.o pmemd.o
> > random.o degcnt.o erfcfun.o nmr_calls.o nmr_lib.o get_cmdline.o
> > master_setup.o pme_alltasks_setup.o pme_setup.o ene_frc_splines.o
> > gb_alltasks_setup.o nextprmtop_section.o angles_ub.o dihedrals_imp.o cmap.o
> > charmm.o charmm_gold.o -L/usr/local/cuda/lib64 -L/usr/local/cuda/lib
> > -lcurand -lcufft -lcudart ./cuda/cuda.a
> > /home/adam/amber11/lib/libnetcdf.a
> > ./cuda/cuda.a(gpu.o): In function `MPI::Intracomm::Clone() const':
> > gpu.cpp:(.text._ZNK3MPI9Intracomm5CloneEv[MPI::Intracomm::Clone()
> > const]+0x27): undefined reference to `MPI::Comm::Comm()'
> > ./cuda/cuda.a(gpu.o): In function `MPI::Cartcomm::Sub(bool const*)':
> > gpu.cpp:(.text._ZN3MPI8Cartcomm3SubEPKb[MPI::Cartcomm::Sub(bool
> > const*)]+0x83): undefined reference to `MPI::Comm::Comm()'
> > ./cuda/cuda.a(gpu.o): In function `MPI::Intracomm::Create_graph(int, int
> > const*, int const*, bool) const':
> > gpu.cpp:(.text._ZNK3MPI9Intracomm12Create_graphEiPKiS2_b[MPI::Intracomm::Create_graph(int,
> > int const*, int const*, bool) const]+0x27): undefined reference to
> > `MPI::Comm::Comm()'
> > ./cuda/cuda.a(gpu.o): In function `MPI::Op::Init(void (*)(void const*,
> > void*, int, MPI::Datatype const&), bool)':
> > gpu.cpp:(.text._ZN3MPI2Op4InitEPFvPKvPviRKNS_8DatatypeEEb[MPI::Op::Init(void
> > (*)(void const*, void*, int, MPI::Datatype const&), bool)]+0x1f): undefined
> > reference to `ompi_mpi_cxx_op_intercept'
> > ./cuda/cuda.a(gpu.o): In function `MPI::Graphcomm::Clone() const':
> > gpu.cpp:(.text._ZNK3MPI9Graphcomm5CloneEv[MPI::Graphcomm::Clone()
> > const]+0x24): undefined reference to `MPI::Comm::Comm()'
> > ./cuda/cuda.a(gpu.o): In function `MPI::Cartcomm::Clone() const':
> > gpu.cpp:(.text._ZNK3MPI8Cartcomm5CloneEv[MPI::Cartcomm::Clone()
> > const]+0x24): undefined reference to `MPI::Comm::Comm()'
> > ./cuda/cuda.a(gpu.o): In function `MPI::Intracomm::Create_cart(int, int
> > const*, bool const*, bool) const':
> > gpu.cpp:(.text._ZNK3MPI9Intracomm11Create_cartEiPKiPKbb[MPI::Intracomm::Create_cart(int,
> > int const*, bool const*, bool) const]+0x124): undefined reference to
> > `MPI::Comm::Comm()'
> > ./cuda/cuda.a(gpu.o): In function `MPI::Intercomm::Merge(bool)':
> > gpu.cpp:(.text._ZN3MPI9Intercomm5MergeEb[MPI::Intercomm::Merge(bool)]+0x26):
> > undefined reference to `MPI::Comm::Comm()'
> > ./cuda/cuda.a(gpu.o): In function `MPI::Intracomm::Split(int, int) const':
> > gpu.cpp:(.text._ZNK3MPI9Intracomm5SplitEii[MPI::Intracomm::Split(int, int)
> > const]+0x24): undefined reference to `MPI::Comm::Comm()'
> > ./cuda/cuda.a(gpu.o):gpu.cpp:(.text._ZNK3MPI9Intracomm6CreateERKNS_5GroupE[MPI::Intracomm::Create(MPI::Group
> > const&) const]+0x27): more undefined references to `MPI::Comm::Comm()'
> > follow
> > ./cuda/cuda.a(gpu.o):(.rodata._ZTVN3MPI3WinE[vtable for MPI::Win]+0x48):
> > undefined reference to `MPI::Win::Free()'
> > ./cuda/cuda.a(gpu.o):(.rodata._ZTVN3MPI8DatatypeE[vtable for
> > MPI::Datatype]+0x78): undefined reference to `MPI::Datatype::Free()'
> > collect2: ld returned 1 exit status
> > make[2]: *** [pmemd.cuda.MPI] Error 1
> > make[2]: Leaving directory `/home/adam/amber11/src/pmemd/src'
> > make[1]: *** [cuda_parallel] Error 2
> > make[1]: Leaving directory `/home/adam/amber11/src/pmemd'
> > make: *** [cuda_parallel] Error 2
> > adam.adam-MS-7750:~/amber11/src$
> >
> > Hope you can help,
> > Adam
> >
> > ------------------------------
> > *From:* Jason Swails <jason.swails.gmail.com>
> > *To:* Adam Jion <adamjion.yahoo.com>; AMBER Mailing List <
> > amber.ambermd.org>
> > *Sent:* Saturday, March 24, 2012 11:17 PM
> > *Subject:* Re: [AMBER] Error Compiling pmemd.cuda.MPI (i.e. Multiple GPUs)
> >
> > It would help to see more of your error message (i.e. the compile line
> > that failed, so we know what directories were searched for in the include
> > path).
> >
> > Another option is to set MPI_HOME (such that mpif90 is in
> > $MPI_HOME/bin/mpif90), and re-run configure
> >
> > (./configure -mpi -cuda [gnu|intel])
> >
> > This should have been grabbed by default inside configure, but it's
> > possible you have a funky configuration.
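> >
> > As a concrete example (hypothetical paths -- substitute your own MPICH2
> > install location, and run configure from wherever you originally ran it):
> >
> > export MPI_HOME=/usr/local/mpich2
> > ./configure -mpi -cuda gnu
> > make clean && make cuda_parallel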
> >
> > HTH,
> > Jason
> >
> > P.S. Alternatively, if you know where this include file lives, just add
> > that directory to the NVCC compiler flags in config.h and re-run "make".
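> >
> > For instance (illustration only -- check your config.h for the exact
> > variable the cuda rules pass to nvcc, since the name may differ), you would
> > append something like this to those -I flags:
> >
> > -I/usr/local/mpich2/include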
> >
> > On Sat, Mar 24, 2012 at 9:03 AM, Adam Jion <adamjion.yahoo.com> wrote:
> >
> > Hi!
> >
> > I have problems compiling pmemd.cuda.MPI. (However, the single gpu version
> > - pmemd.cuda - works)
> > The error log is:
> >
> > In file included from gpu.h:15,
> > from kForcesUpdate.cu:14:
> > gputypes.h:30: fatal error: mpi.h: No such file or directory
> > compilation terminated.
> > make[3]: *** [kForcesUpdate.o] Error 1
> > make[3]: Leaving directory `/home/adam/amber11/src/pmemd/src/cuda'
> > make[2]: *** [-L/usr/local/cuda/lib64] Error 2
> > make[2]: Leaving directory `/home/adam/amber11/src/pmemd/src'
> > make[1]: *** [cuda_parallel] Error 2
> > make[1]: Leaving directory `/home/adam/amber11/src/pmemd'
> > make: *** [cuda_parallel] Error 2
> >
> >
> > Any help will be much appreciated,
> > Adam
> >
> > ps. All the bugfixes have been applied. I have managed to combine both
> > serial and parallel versions of Amber11 and AmberTools 1.5 without
> > problems. My compilers are gcc-4.4, g++-4.4, gfortran-4.4.
> > _______________________________________________
> > AMBER mailing list
> > AMBER.ambermd.org
> > http://lists.ambermd.org/mailman/listinfo/amber
> >
> >
> >
> >
> > --
> > Jason M. Swails
> > Quantum Theory Project,
> > University of Florida
> > Ph.D. Candidate
> > 352-392-4032
> >
> >
> >
>
>
> --
> Jason M. Swails
> Quantum Theory Project,
> University of Florida
> Ph.D. Candidate
> 352-392-4032
> _______________________________________________
> AMBER mailing list
> AMBER.ambermd.org
> http://lists.ambermd.org/mailman/listinfo/amber
>
>
>


-- 
Jason M. Swails
Quantum Theory Project,
University of Florida
Ph.D. Candidate
352-392-4032
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Sat Mar 24 2012 - 20:00:03 PDT