[AMBER] Problems compiling pmemd.cuda.MPI for InfiniBand

From: Sasha Buzko <obuzko.ucla.edu>
Date: Tue, 14 Dec 2010 15:02:02 -0800

Dear Amber developers,
I'm running into problems compiling the CUDA source for InfiniBand. The
last 50 lines of the build output are included below the message.

The OS is CentOS 5.5. Here's the full output of the uname -a command:
Linux master 2.6.18-194.26.1.el5 #1 SMP Tue Nov 9 12:54:20 EST 2010
x86_64 x86_64 x86_64 GNU/Linux

I'm using the GNU compilers, version 4.1.2, and OpenMPI libraries built
with the same compilers. All environment variables are set correctly,
and I run make clean before every build.
The parallel pmemd.cuda.MPI built for local execution compiled cleanly
and the code works as advertised (using mpich2). Likewise, the
conventional CPU pmemd.MPI built for InfiniBand with the same OpenMPI
libraries compiled without problems; I tested it across multiple nodes
over QDR IB, and everything looks normal.

So the problem seems to be restricted to the parallel CUDA version. Are
there any limitations in the source code that would show up only in the
IB build? Is there anything I could do to diagnose the problem?
Thanks in advance for any suggestions.
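
In case it's useful, here's a minimal stand-alone test I was planning to
try, built the same way pmemd.cuda.MPI is linked (C++ object compiled
with mpicxx, final link done with mpif90). The file name and the extra
flags are just my guesses, not anything from the Amber build files; the
idea is only to check whether the undefined MPI:: references shown below
come from the link step rather than from the Amber code itself.

/* mpicxx_link_test.cpp -- minimal use of the OpenMPI C++ bindings,
   roughly what gpu.cpp pulls in through mpi.h */
#include <mpi.h>
#include <cstdio>

int main(int argc, char* argv[])
{
    MPI::Init(argc, argv);                  /* C++ bindings, like gpu.cpp */
    int rank = MPI::COMM_WORLD.Get_rank();
    std::printf("rank %d ok\n", rank);
    MPI::Finalize();
    return 0;
}

I would compile and link it mirroring the pmemd link command:

  mpicxx -c mpicxx_link_test.cpp
  mpif90 -o link_test mpicxx_link_test.o

If that reproduces the same undefined references, my guess would be that
the final link isn't pulling in OpenMPI's C++ support (something like
-lmpi_cxx and -lstdc++) rather than a limitation in the CUDA source
itself - but I may well be off.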

Best,

Sasha




ar: creating cuda.a
a - cuda_info.o
a - gpu.o
a - gputypes.o
a - kForcesUpdate.o
a - kCalculateLocalForces.o
a - kCalculateGBBornRadii.o
a - kCalculatePMENonbondEnergy.o
a - radixsort.o
a - radixsort_c.o
a - kCalculateGBNonbondEnergy1.o
a - kCalculateGBNonbondEnergy2.o
a - kShake.o
a - kNeighborList.o
a - kPMEInterpolation.o
a - cudpp_scan.o
a - cudpp_scan_c.o
make[3]: Leaving directory `/usr/local/amber11/src/pmemd/src/cuda'
make -C ./cuda
make[3]: Entering directory `/usr/local/amber11/src/pmemd/src/cuda'
make[3]: `cuda.a' is up to date.
make[3]: Leaving directory `/usr/local/amber11/src/pmemd/src/cuda'
make -C ./cuda
make[3]: Entering directory `/usr/local/amber11/src/pmemd/src/cuda'
make[3]: `cuda.a' is up to date.
make[3]: Leaving directory `/usr/local/amber11/src/pmemd/src/cuda'
make -C ./cuda
make[3]: Entering directory `/usr/local/amber11/src/pmemd/src/cuda'
make[3]: `cuda.a' is up to date.
make[3]: Leaving directory `/usr/local/amber11/src/pmemd/src/cuda'
mpif90 -O3 -DCUDA -DMPI -DMPICH_IGNORE_CXX_SEEK -o pmemd.cuda.MPI
gbl_constants.o gbl_datatypes.o state_info.o file_io_dat.o
mdin_ctrl_dat.o mdin_ewald_dat.o mdin_debugf_dat.o prmtop_dat.o
inpcrd_dat.o dynamics_dat.o img.o parallel_dat.o parallel.o
gb_parallel.o pme_direct.o pme_recip_dat.o pme_slab_recip.o
pme_blk_recip.o pme_slab_fft.o pme_blk_fft.o pme_fft_dat.o fft1d.o
bspline.o pme_force.o pbc.o nb_pairlist.o nb_exclusions.o cit.o
dynamics.o bonds.o angles.o dihedrals.o extra_pnts_nb14.o runmd.o
loadbal.o shake.o prfs.o mol_list.o runmin.o constraints.o
axis_optimize.o gb_ene.o veclib.o gb_force.o timers.o pmemd_lib.o
runfiles.o file_io.o bintraj.o pmemd_clib.o pmemd.o random.o degcnt.o
erfcfun.o nmr_calls.o nmr_lib.o get_cmdline.o master_setup.o
pme_alltasks_setup.o pme_setup.o ene_frc_splines.o gb_alltasks_setup.o
nextprmtop_section.o angles_ub.o dihedrals_imp.o cmap.o charmm.o
charmm_gold.o -L/usr/local/cuda/lib64 -L/usr/local/cuda/lib -lcufft
-lcudart ./cuda/cuda.a
./cuda/cuda.a(gpu.o): In function `MPI::Op::Init(void (*)(void const*,
void*, int, MPI::Datatype const&), bool)':
gpu.cpp:(.text._ZN3MPI2Op4InitEPFvPKvPviRKNS_8DatatypeEEb[MPI::Op::Init(void
(*)(void const*, void*, int, MPI::Datatype const&), bool)]+0x19):
undefined reference to `ompi_mpi_cxx_op_intercept'
./cuda/cuda.a(gpu.o): In function `MPI::Intracomm::Create(MPI::Group
const&) const':
gpu.cpp:(.text._ZNK3MPI9Intracomm6CreateERKNS_5GroupE[MPI::Intracomm::Create(MPI::Group
const&) const]+0x2a): undefined reference to `MPI::Comm::Comm()'
./cuda/cuda.a(gpu.o): In function `MPI::Graphcomm::Clone() const':
gpu.cpp:(.text._ZNK3MPI9Graphcomm5CloneEv[MPI::Graphcomm::Clone()
const]+0x25): undefined reference to `MPI::Comm::Comm()'
./cuda/cuda.a(gpu.o): In function `MPI::Cartcomm::Clone() const':
gpu.cpp:(.text._ZNK3MPI8Cartcomm5CloneEv[MPI::Cartcomm::Clone()
const]+0x25): undefined reference to `MPI::Comm::Comm()'
./cuda/cuda.a(gpu.o): In function `MPI::Intracomm::Create_graph(int, int
const*, int const*, bool) const':
gpu.cpp:(.text._ZNK3MPI9Intracomm12Create_graphEiPKiS2_b[MPI::Intracomm::Create_graph(int,
int const*, int const*, bool) const]+0x2b): undefined reference to
`MPI::Comm::Comm()'
./cuda/cuda.a(gpu.o): In function `MPI::Cartcomm::Sub(bool const*)':
gpu.cpp:(.text._ZN3MPI8Cartcomm3SubEPKb[MPI::Cartcomm::Sub(bool
const*)]+0x76): undefined reference to `MPI::Comm::Comm()'
./cuda/cuda.a(gpu.o):gpu.cpp:(.text._ZNK3MPI9Intracomm11Create_cartEiPKiPKbb[MPI::Intracomm::Create_cart(int,
int const*, bool const*, bool) const]+0x8f): more undefined references
to `MPI::Comm::Comm()' follow
./cuda/cuda.a(gpu.o):(.rodata._ZTVN3MPI3WinE[vtable for MPI::Win]+0x48):
undefined reference to `MPI::Win::Free()'
./cuda/cuda.a(gpu.o):(.rodata._ZTVN3MPI8DatatypeE[vtable for
MPI::Datatype]+0x78): undefined reference to `MPI::Datatype::Free()'
collect2: ld returned 1 exit status
make[2]: *** [pmemd.cuda.MPI] Error 1
make[2]: Leaving directory `/usr/local/amber11/src/pmemd/src'
make[1]: *** [cuda_parallel] Error 2
make[1]: Leaving directory `/usr/local/amber11/src/pmemd'
make: *** [cuda_parallel] Error 2

_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Tue Dec 14 2010 - 15:30:03 PST