Re: [AMBER] Problems compiling pmemd.cuda.MPI for InfiniBand

From: Sasha Buzko <obuzko.ucla.edu>
Date: Tue, 14 Dec 2010 15:57:08 -0800

Hi Ross,
here's the output of the three commands:
[sasha.node1 skp2_NPT]$ mpicc -show
gcc -I/usr/local/openmpi/openmpi-1.4.3/x86_64/ib/gcc/include -pthread
-L/usr/local/openmpi/openmpi-1.4.3/x86_64/ib/gcc/lib -lmpi -lopen-rte
-lopen-pal -ldl -Wl,--export-dynamic -lnsl -lutil -lm -ldl
[sasha.node1 skp2_NPT]$ mpicxx -show
g++ -I/usr/local/openmpi/openmpi-1.4.3/x86_64/ib/gcc/include -pthread
-L/usr/local/openmpi/openmpi-1.4.3/x86_64/ib/gcc/lib -lmpi_cxx -lmpi
-lopen-rte -lopen-pal -ldl -Wl,--export-dynamic -lnsl -lutil -lm -ldl
[sasha.node1 skp2_NPT]$ mpif90 -show
gfortran -I/usr/local/openmpi/openmpi-1.4.3/x86_64/ib/gcc/include
-pthread -I/usr/local/openmpi/openmpi-1.4.3/x86_64/ib/gcc/lib
-L/usr/local/openmpi/openmpi-1.4.3/x86_64/ib/gcc/lib -lmpi_f90 -lmpi_f77
-lmpi -lopen-rte -lopen-pal -ldl -Wl,--export-dynamic -lnsl -lutil -lm -ldl

It could still be something specific to the CUDA code, since the conventional
CPU code compiled and ran with no issues over IB using the same OpenMPI
libraries.
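
Comparing the -show lines above, mpicxx links -lmpi_cxx while mpif90 does
not, and the undefined symbols in the link error (ompi_mpi_cxx_op_intercept,
MPI::Comm::Comm and friends) look like they belong to the OpenMPI C++
bindings. As a quick check I can try something like this (only a sketch; the
path comes from the -show output above and libmpi_cxx.so is assumed to be the
standard library name):

# confirm the missing symbol lives in the C++ bindings library
nm -D /usr/local/openmpi/openmpi-1.4.3/x86_64/ib/gcc/lib/libmpi_cxx.so | grep ompi_mpi_cxx_op_intercept

If the symbol shows up there, appending -lmpi_cxx to the pmemd.cuda.MPI link
line should presumably resolve the undefined references.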
Has anyone successfully compiled pmemd.cuda.MPI against OpenMPI over
InfiniBand?
Thanks

Sasha


Ross Walker wrote:
> Hi Sasha,
>
> This looks like some kind of name-mangling mismatch between the C++ and C /
> Fortran linking. Are you sure the compiler is the same for C++, C and
> Fortran? Try:
>
> mpicc -show
> mpicxx -show
> mpif90 -show
>
> The other 'guess' is that the MPI library was built without C++ support.
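> A quick way to check that (just a sketch, assuming the ompi_info that
> matches this OpenMPI install is on your PATH):
>
> # OpenMPI reports whether the C++ bindings were built into this install
> ompi_info | grep -i "c++"
>
> It should report something like "C++ bindings: yes"; if it does not, the
> C++ bindings library (libmpi_cxx) was never built.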
>
> All the best
> Ross
>
>
>> -----Original Message-----
>> From: Sasha Buzko [mailto:obuzko.ucla.edu]
>> Sent: Tuesday, December 14, 2010 3:02 PM
>> To: AMBER Mailing List
>> Subject: [AMBER] Problems compiling pmemd.cuda.MPI for InfiniBand
>>
>> Dear Amber developers,
>> I'm running into problems while trying to compile the CUDA source for
>> InfiniBand. The last 50 lines of the build output are given below the
>> message.
>>
>> The OS is CentOS 5.5. Here's the full output of the uname -a command:
>> Linux master 2.6.18-194.26.1.el5 #1 SMP Tue Nov 9 12:54:20 EST 2010
>> x86_64 x86_64 x86_64 GNU/Linux
>>
>> I'm using the GNU compilers, version 4.1.2, and OpenMPI libraries built
>> with the same. All environment variables are set correctly, and I run
>> make clean before every compilation.
>> Compiling the parallel pmemd.cuda.MPI for local execution went well, and
>> the code works as advertised (using mpich2). Compiling the conventional
>> CPU code, pmemd.MPI, over InfiniBand also went fine, using the same
>> OpenMPI libraries. I tested its execution across multiple nodes with QDR
>> IB, and everything looks normal.
>>
>> So the problem seems to be restricted to the parallel CUDA version. Are
>> there any limitations in the source code that would show up only in the
>> IB build? Is there anything I could do to diagnose the problem?
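>> (For what it's worth, this is roughly how I have been re-running the
>> failing target and capturing the full build log, in case more context is
>> needed; just a sketch, and the log file name is my own:
>>
>> cd /usr/local/amber11/src
>> make cuda_parallel 2>&1 | tee cuda_parallel_build.log
>>
>> The exact mpif90 link line shows up in that log as well.)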
>> Thanks in advance for any suggestions.
>>
>> Best,
>>
>> Sasha
>>
>>
>>
>>
>> ar: creating cuda.a
>> a - cuda_info.o
>> a - gpu.o
>> a - gputypes.o
>> a - kForcesUpdate.o
>> a - kCalculateLocalForces.o
>> a - kCalculateGBBornRadii.o
>> a - kCalculatePMENonbondEnergy.o
>> a - radixsort.o
>> a - radixsort_c.o
>> a - kCalculateGBNonbondEnergy1.o
>> a - kCalculateGBNonbondEnergy2.o
>> a - kShake.o
>> a - kNeighborList.o
>> a - kPMEInterpolation.o
>> a - cudpp_scan.o
>> a - cudpp_scan_c.o
>> make[3]: Leaving directory `/usr/local/amber11/src/pmemd/src/cuda'
>> make -C ./cuda
>> make[3]: Entering directory `/usr/local/amber11/src/pmemd/src/cuda'
>> make[3]: `cuda.a' is up to date.
>> make[3]: Leaving directory `/usr/local/amber11/src/pmemd/src/cuda'
>> make -C ./cuda
>> make[3]: Entering directory `/usr/local/amber11/src/pmemd/src/cuda'
>> make[3]: `cuda.a' is up to date.
>> make[3]: Leaving directory `/usr/local/amber11/src/pmemd/src/cuda'
>> make -C ./cuda
>> make[3]: Entering directory `/usr/local/amber11/src/pmemd/src/cuda'
>> make[3]: `cuda.a' is up to date.
>> make[3]: Leaving directory `/usr/local/amber11/src/pmemd/src/cuda'
>> mpif90 -O3 -DCUDA -DMPI -DMPICH_IGNORE_CXX_SEEK -o
>> pmemd.cuda.MPI
>> gbl_constants.o gbl_datatypes.o state_info.o file_io_dat.o
>> mdin_ctrl_dat.o mdin_ewald_dat.o mdin_debugf_dat.o prmtop_dat.o
>> inpcrd_dat.o dynamics_dat.o img.o parallel_dat.o parallel.o
>> gb_parallel.o pme_direct.o pme_recip_dat.o pme_slab_recip.o
>> pme_blk_recip.o pme_slab_fft.o pme_blk_fft.o pme_fft_dat.o fft1d.o
>> bspline.o pme_force.o pbc.o nb_pairlist.o nb_exclusions.o cit.o
>> dynamics.o bonds.o angles.o dihedrals.o extra_pnts_nb14.o runmd.o
>> loadbal.o shake.o prfs.o mol_list.o runmin.o constraints.o
>> axis_optimize.o gb_ene.o veclib.o gb_force.o timers.o pmemd_lib.o
>> runfiles.o file_io.o bintraj.o pmemd_clib.o pmemd.o random.o degcnt.o
>> erfcfun.o nmr_calls.o nmr_lib.o get_cmdline.o master_setup.o
>> pme_alltasks_setup.o pme_setup.o ene_frc_splines.o gb_alltasks_setup.o
>> nextprmtop_section.o angles_ub.o dihedrals_imp.o cmap.o charmm.o
>> charmm_gold.o -L/usr/local/cuda/lib64 -L/usr/local/cuda/lib -lcufft
>> -lcudart ./cuda/cuda.a
>> ./cuda/cuda.a(gpu.o): In function `MPI::Op::Init(void (*)(void const*,
>> void*, int, MPI::Datatype const&), bool)':
>> gpu.cpp:(.text._ZN3MPI2Op4InitEPFvPKvPviRKNS_8DatatypeEEb[MPI::Op::I
>> nit(void
>> (*)(void const*, void*, int, MPI::Datatype const&), bool)]+0x19):
>> undefined reference to `ompi_mpi_cxx_op_intercept'
>> ./cuda/cuda.a(gpu.o): In function `MPI::Intracomm::Create(MPI::Group
>> const&) const':
>> gpu.cpp:(.text._ZNK3MPI9Intracomm6CreateERKNS_5GroupE[MPI::Intraco
>> mm::Create(MPI::Group
>> const&) const]+0x2a): undefined reference to `MPI::Comm::Comm()'
>> ./cuda/cuda.a(gpu.o): In function `MPI::Graphcomm::Clone() const':
>> gpu.cpp:(.text._ZNK3MPI9Graphcomm5CloneEv[MPI::Graphcomm::Clone()
>> const]+0x25): undefined reference to `MPI::Comm::Comm()'
>> ./cuda/cuda.a(gpu.o): In function `MPI::Cartcomm::Clone() const':
>> gpu.cpp:(.text._ZNK3MPI8Cartcomm5CloneEv[MPI::Cartcomm::Clone()
>> const]+0x25): undefined reference to `MPI::Comm::Comm()'
>> ./cuda/cuda.a(gpu.o): In function `MPI::Intracomm::Create_graph(int, int
>> const*, int const*, bool) const':
>> gpu.cpp:(.text._ZNK3MPI9Intracomm12Create_graphEiPKiS2_b[MPI::Intraco
>> mm::Create_graph(int,
>> int const*, int const*, bool) const]+0x2b): undefined reference to
>> `MPI::Comm::Comm()'
>> ./cuda/cuda.a(gpu.o): In function `MPI::Cartcomm::Sub(bool const*)':
>> gpu.cpp:(.text._ZN3MPI8Cartcomm3SubEPKb[MPI::Cartcomm::Sub(bool
>> const*)]+0x76): undefined reference to `MPI::Comm::Comm()'
>> ./cuda/cuda.a(gpu.o):gpu.cpp:(.text._ZNK3MPI9Intracomm11Create_cartEiP
>> KiPKbb[MPI::Intracomm::Create_cart(int,
>> int const*, bool const*, bool) const]+0x8f): more undefined references
>> to `MPI::Comm::Comm()' follow
>> ./cuda/cuda.a(gpu.o):(.rodata._ZTVN3MPI3WinE[vtable for
>> MPI::Win]+0x48):
>> undefined reference to `MPI::Win::Free()'
>> ./cuda/cuda.a(gpu.o):(.rodata._ZTVN3MPI8DatatypeE[vtable for
>> MPI::Datatype]+0x78): undefined reference to `MPI::Datatype::Free()'
>> collect2: ld returned 1 exit status
>> make[2]: *** [pmemd.cuda.MPI] Error 1
>> make[2]: Leaving directory `/usr/local/amber11/src/pmemd/src'
>> make[1]: *** [cuda_parallel] Error 2
>> make[1]: Leaving directory `/usr/local/amber11/src/pmemd'
>> make: *** [cuda_parallel] Error 2
>>
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber