Re: [AMBER] Test results for amber-cuda, single node, single GPU, Tesla C2070

From: Paul Rigor <paul.rigor.uci.edu>
Date: Wed, 25 May 2011 13:35:10 -0700

Hi gang,

The parallel and serial CPU versions compile just fine. The GPU installation
directions are also out of sync, ie, it refers to "cuda_parallel" target,
whereas the distributed makefile does not contain such a target.

I'm using the latest CUDA toolkit (4.x) with the latest drivers for the
C2070. Our sysadmin should have obtain the latest amber11/ambertools 1.5.0
sources as well (but I'll double check).

  CUDA Driver Version: 4.0
  CUDA Capability Major/Minor version number: 2.0
  Device has ECC support enabled: Yes


The packaged netcdf was not being linked correctly and we were getting
unnamed reference errors. Its actual compilation and library generation
works fine:

/usr/bin/install -c .libs/libnetcdf.a
/extra/dock2/tools/amber/md/11-1.5.0/AmberTools/lib/libnetcdf.achmod 644
/extra/dock2/tools/amber/md/11-1.5.0/AmberTools/lib/libnetcdf.aranlib
/extra/dock2/tools/amber/md/11-1.5.0/AmberTools/lib/libnetcdf.a


But then we get the following error messages once compiling pmemd:


make[3]: Leaving directory
`/extra/dock2/tools/amber/md/11-1.5.0/src/pmemd/src/cuda'
gfortran -DCUDA -o pmemd.cuda gbl_constants.o gbl_datatypes.o state_info.o
file_io_dat.o mdin_ctrl_dat.o mdin_ewald_dat.o mdin_debugf_dat.o
prmtop_dat.o inpcrd_dat.o dynamics_dat.o img.o parallel_dat.o parallel.o
gb_parallel.o pme_direct.o pme_recip_dat.o pme_slab_recip.o pme_blk_recip.o
pme_slab_fft.o pme_blk_fft.o pme_fft_dat.o fft1d.o bspline.o pme_force.o
pbc.o nb_pairlist.o nb_exclusions.o cit.o dynamics.o bonds.o angles.o
dihedrals.o extra_pnts_nb14.o runmd.o loadbal.o shake.o prfs.o mol_list.o
runmin.o constraints.o axis_optimize.o gb_ene.o veclib.o gb_force.o timers.o
pmemd_lib.o runfiles.o file_io.o bintraj.o pmemd_clib.o pmemd.o random.o
degcnt.o erfcfun.o nmr_calls.o nmr_lib.o get_cmdline.o master_setup.o
pme_alltasks_setup.o pme_setup.o ene_frc_splines.o gb_alltasks_setup.o
nextprmtop_section.o angles_ub.o dihedrals_imp.o cmap.o charmm.o
charmm_gold.o -L/usr/local/cuda/lib64 -L/usr/local/cuda/lib -lcufft -lcudart
./cuda/cuda.a
mdin_ctrl_dat.o: In function `__mdin_ctrl_dat_mod__bcast_mdin_ctrl_dat':
mdin_ctrl_dat.f90:(.text+0x98d8): undefined reference to `mpi_bcast_'
mdin_ctrl_dat.f90:(.text+0x9903): undefined reference to `mpi_bcast_'
mdin_ewald_dat.o: In function `__mdin_ewald_dat_mod__bcast_mdin_ewald_dat':
mdin_ewald_dat.f90:(.text+0x1c7f): undefined reference to `mpi_bcast_'
mdin_ewald_dat.f90:(.text+0x1cab): undefined reference to `mpi_bcast_'
mdin_debugf_dat.o: In function
`__mdin_debugf_dat_mod__bcast_mdin_debugf_dat':
mdin_debugf_dat.f90:(.text+0x2eb): undefined reference to `mpi_bcast_'

Thanks for your help!!

Paul
--
Paul Rigor
http://www.ics.uci.edu/~prigor
On Wed, May 25, 2011 at 10:54 AM, Scott Le Grand <varelse2005.gmail.com>wrote:
> There is no I/O in the CUDA code whatsoever.  All control of overall
> program flow and execution is done through the existing FORTRAN
> infrastructure.  The CUDA code merely hotwires the calculation and
> tricks the FORTRAN into thinking it's doing it itself through a
> variety of jiggery pokery.  That's my story and I'm sticking to it.
>
> Scott
>
> On Wed, May 25, 2011 at 10:27 AM, Jason Swails <jason.swails.gmail.com>
> wrote:
> > Just to chime in here -- NetCDF is hooked into pmemd and pmemd.cuda
> through
> > the netcdf fortran module, NOT through statically/dynamically linked
> > libraries.  The netcdf.mod file is taken from the AmberTools NetCDF
> > installation, so libnetcdf.a and libnetcdff.a shouldn't be necessary in
> the
> > config.h
> >
> > Furthermore, I don't think any file writing is done through the CUDA code
> > (just through the Fortran code), so you shouldn't need to add any NetCDF
> > libraries to any CUDA variables.  Also, libraries only ever need to be
> > included on the link line, so I'm not sure what changes you claim to have
> > had to make and how they could have possibly been necessary to build
> > pmemd.cuda...
> >
> > I've tested it on AmberTools 1.5 + Amber11, and it works fine for me
> without
> > having to massage the config.h file any more than AT15_Amber11.py does
> > already.
> >
> > All the best,
> > Jason
> >
> > On Wed, May 25, 2011 at 12:38 PM, Ross Walker <ross.rosswalker.co.uk>
> wrote:
> >
> >> Hi Paul,
> >>
> >> > Attached are the test logs running the test suite on a CentOS 5.6
> >> > machine
> >> > with a Tesla C2070.
> >>
> >> The gb_ala3 case is, I believe, just an issue with the random number
> stream
> >> being different. This can happen a lot with ntt=3 since the use of
> random
> >> numbers changes depending on the hardware configuration. Working around
> >> this, for test purposes, hurts performance a lot. This failure (and
> >> dhfr_ntb2_ntt3). The rest look to just be rounding differences so I
> think
> >> you are good. I am not sure why there are so many differences though
> since
> >> the test case saved outputs should be based on C2070's. Thus given the
> >> modifications you made below I would be a little wary.
> >>
> >> > There was a bit of massaging to compile the cuda config. In fact, I
> had
> >> > to
> >> > modify PMEMD_CU_FLAGS to include the arlibs of libnetcdf.a and
> >> > libnetcdff.a
> >> > along with the full path to the installation path of the shared
> >> > libraries.
> >>
> >> I am little confused why you needed to make these changes. Unless
> something
> >> in AMBER Tools 1.5 has messed up the GPU compilation. AMBER should
> compile
> >> it's own netcdf so you should not need to make any changes here. Did you
> >> also apply all the bugfixes? - I guess I should find (invent) time to
> try
> >> building AMBER 11 + patches + AmberTools 1.5 myself. - What NVIDIA cuda
> >> compiler version are you using?
> >>
> >> The development tree off of what AMBERTools 1.5 was based works fine for
> me
> >> and there will be a big GPU update shortly so I never bothered to test
> with
> >> AMBERTools 1.5 so it is possible someone messed something up with the
> netcdf
> >> libraries etc but I am still surprised you had to modify things to get
> it to
> >> build.
> >>
> >> Does the rest of AMBER (AmberTools and AMBER CPU etc build fine in
> serial
> >> or parallel?) - You should really make sure all the CPU compilation and
> >> tests run fine before trying to build the GPU version.
> >>
> >> All the best
> >> Ross
> >>
> >> /\
> >> \/
> >> |\oss Walker
> >>
> >> ---------------------------------------------------------
> >> |             Assistant Research Professor              |
> >> |            San Diego Supercomputer Center             |
> >> |             Adjunct Assistant Professor               |
> >> |         Dept. of Chemistry and Biochemistry           |
> >> |          University of California San Diego           |
> >> |                     NVIDIA Fellow                     |
> >> | http://www.rosswalker.co.uk | http://www.wmd-lab.org/ |
> >> | Tel: +1 858 822 0854 | EMail:- ross.rosswalker.co.uk  |
> >> ---------------------------------------------------------
> >>
> >> Note: Electronic Mail is not secure, has no guarantee of delivery, may
> not
> >> be read every day, and should not be used for urgent or sensitive
> issues.
> >>
> >>
> >>
> >>
> >>
> >> _______________________________________________
> >> AMBER mailing list
> >> AMBER.ambermd.org
> >> http://lists.ambermd.org/mailman/listinfo/amber
> >>
> >
> >
> >
> > --
> > Jason M. Swails
> > Quantum Theory Project,
> > University of Florida
> > Ph.D. Candidate
> > 352-392-4032
> > _______________________________________________
> > AMBER mailing list
> > AMBER.ambermd.org
> > http://lists.ambermd.org/mailman/listinfo/amber
> >
>
> _______________________________________________
> AMBER mailing list
> AMBER.ambermd.org
> http://lists.ambermd.org/mailman/listinfo/amber
>
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Wed May 25 2011 - 14:00:02 PDT
Custom Search