Re: [AMBER] Test results for amber-cuda, single node, single GPU, Tesla C2070

From: Scott Le Grand <varelse2005.gmail.com>
Date: Wed, 25 May 2011 13:47:44 -0700

Using CUDA 4.0 is your problem. The test cases used CUDA 3.2. CUDA
4.0 is a prerelease build (RC2) undergoing beta testing, and it's
therefore unreliable IMO at this time (hear me now, believe me later).
I would suggest using CUDA 3.2 for the immediate future. If you wish
to be a beta tester, then of course, be my guest, but since the
compilers change constantly, you're going to have to eyeball these
tests and decide for yourself whether the build is stable or not. The
other option is to verify using DPDP and take SPDP's functionality as
an article of faith if DPDP passes.
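
As a hedged illustration of why SPDP can legitimately drift from the DPDP
reference in the trailing digits (this is a generic single- versus
double-precision accumulation sketch, not Amber code; the program and
variable names are invented):

! precision_drift.f90 -- illustrative only, not part of AMBER.
! Accumulating many small force-like contributions in single precision
! loses low-order bits that double precision keeps, which is why SPDP
! results are expected to differ from a DPDP (or CPU) reference in the
! last printed digits even when the build is healthy.
program precision_drift
  implicit none
  integer :: i
  real             :: sum_sp   ! single precision, like the SP parts of SPDP
  double precision :: sum_dp   ! double precision, like DPDP
  sum_sp = 0.0
  sum_dp = 0.0d0
  do i = 1, 1000000
    sum_sp = sum_sp + 1.0e-4   ! single-precision accumulation
    sum_dp = sum_dp + 1.0d-4   ! double-precision accumulation
  end do
  ! The single-precision total drifts visibly from the exact value of
  ! 100.0, while the double-precision total stays much closer.
  print *, 'single precision total:', sum_sp
  print *, 'double precision total:', sum_dp
end program precision_drift

If the DPDP build reproduces the saved test outputs closely, small SPDP
deviations are more likely precision effects than a broken build.
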



On Wed, May 25, 2011 at 1:35 PM, Paul Rigor <paul.rigor.uci.edu> wrote:
> Hi gang,
>
> The parallel and serial CPU versions compile just fine. The GPU installation
> directions are also out of sync, i.e., they refer to a "cuda_parallel" target,
> whereas the distributed makefile does not contain such a target.
>
> I'm using the latest CUDA toolkit (4.x) with the latest drivers for the
> C2070. Our sysadmin should have obtained the latest amber11/AmberTools 1.5.0
> sources as well (but I'll double-check).
>
>  CUDA Driver Version:                           4.0
>  CUDA Capability Major/Minor version number:    2.0
>  Device has ECC support enabled:                Yes
>
>
> The packaged netcdf was not being linked correctly and we were getting
> undefined reference errors. Its compilation and library generation, however,
> work fine:
>
> /usr/bin/install -c .libs/libnetcdf.a /extra/dock2/tools/amber/md/11-1.5.0/AmberTools/lib/libnetcdf.a
> chmod 644 /extra/dock2/tools/amber/md/11-1.5.0/AmberTools/lib/libnetcdf.a
> ranlib /extra/dock2/tools/amber/md/11-1.5.0/AmberTools/lib/libnetcdf.a
>
>
> But then we get the following error messages when building pmemd.cuda:
>
>
> make[3]: Leaving directory
> `/extra/dock2/tools/amber/md/11-1.5.0/src/pmemd/src/cuda'
> gfortran   -DCUDA -o pmemd.cuda gbl_constants.o gbl_datatypes.o state_info.o
> file_io_dat.o mdin_ctrl_dat.o mdin_ewald_dat.o mdin_debugf_dat.o
> prmtop_dat.o inpcrd_dat.o dynamics_dat.o img.o parallel_dat.o parallel.o
> gb_parallel.o pme_direct.o pme_recip_dat.o pme_slab_recip.o pme_blk_recip.o
> pme_slab_fft.o pme_blk_fft.o pme_fft_dat.o fft1d.o bspline.o pme_force.o
> pbc.o nb_pairlist.o nb_exclusions.o cit.o dynamics.o bonds.o angles.o
> dihedrals.o extra_pnts_nb14.o runmd.o loadbal.o shake.o prfs.o mol_list.o
> runmin.o constraints.o axis_optimize.o gb_ene.o veclib.o gb_force.o timers.o
> pmemd_lib.o runfiles.o file_io.o bintraj.o pmemd_clib.o pmemd.o random.o
> degcnt.o erfcfun.o nmr_calls.o nmr_lib.o get_cmdline.o master_setup.o
> pme_alltasks_setup.o pme_setup.o ene_frc_splines.o gb_alltasks_setup.o
> nextprmtop_section.o angles_ub.o dihedrals_imp.o cmap.o charmm.o
> charmm_gold.o -L/usr/local/cuda/lib64 -L/usr/local/cuda/lib -lcufft -lcudart
> ./cuda/cuda.a
> mdin_ctrl_dat.o: In function `__mdin_ctrl_dat_mod__bcast_mdin_ctrl_dat':
> mdin_ctrl_dat.f90:(.text+0x98d8): undefined reference to `mpi_bcast_'
> mdin_ctrl_dat.f90:(.text+0x9903): undefined reference to `mpi_bcast_'
> mdin_ewald_dat.o: In function `__mdin_ewald_dat_mod__bcast_mdin_ewald_dat':
> mdin_ewald_dat.f90:(.text+0x1c7f): undefined reference to `mpi_bcast_'
> mdin_ewald_dat.f90:(.text+0x1cab): undefined reference to `mpi_bcast_'
> mdin_debugf_dat.o: In function
> `__mdin_debugf_dat_mod__bcast_mdin_debugf_dat':
> mdin_debugf_dat.f90:(.text+0x2eb): undefined reference to `mpi_bcast_'
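
For context on what this particular message means (a hedged, generic sketch
rather than pmemd's actual code; the file name and build lines below are
hypothetical): `mpi_bcast_' is the gfortran-mangled symbol for a Fortran
mpi_bcast call, so the linker is being handed objects whose MPI code paths
were compiled in while no MPI library appears on the link line. A minimal
reproduction of the same failure:

! bcast_demo.F90 -- hypothetical minimal example, not part of pmemd.
! Compile the MPI path in, then link without an MPI library:
!   mpif90 -c -DMPI bcast_demo.F90
!   gfortran -o bcast_demo bcast_demo.o
! The second step fails with: undefined reference to `mpi_bcast_'
! which mirrors the pmemd.cuda link errors above. Either link through the
! MPI wrapper (mpif90) or build the objects without the MPI code paths for
! a serial pmemd.cuda.
program bcast_demo
  implicit none
#ifdef MPI
  include 'mpif.h'
  integer :: ierr
#endif
  integer :: ival
  ival = 42
#ifdef MPI
  call mpi_init(ierr)
  call mpi_bcast(ival, 1, MPI_INTEGER, 0, MPI_COMM_WORLD, ierr)
  call mpi_finalize(ierr)
#endif
  print *, 'ival =', ival
end program bcast_demo

In this build, that suggests the objects above were compiled with the MPI
code paths enabled even though the pmemd.cuda link line is a serial one; it
may be worth checking whether the config.h in use was generated for a serial
or a parallel build.
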
>
> Thanks for your help!!
>
> Paul
> --
> Paul Rigor
> http://www.ics.uci.edu/~prigor
>
>
>
> On Wed, May 25, 2011 at 10:54 AM, Scott Le Grand <varelse2005.gmail.com> wrote:
>
>> There is no I/O in the CUDA code whatsoever.  All control of overall
>> program flow and execution is done through the existing FORTRAN
>> infrastructure.  The CUDA code merely hotwires the calculation and
>> tricks the FORTRAN into thinking it's doing it itself through a
>> variety of jiggery pokery.  That's my story and I'm sticking to it.
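
A hedged sketch of the general pattern being described (illustrative only;
the module and routine names below are invented and this is not pmemd's real
interface): the Fortran driver owns the arrays and all I/O, and a C-callable
routine, which in pmemd.cuda would be CUDA code behind an extern "C" entry
point, fills in the results, so the calling Fortran carries on as if it had
done the work itself.

! offload_sketch.f90 -- illustrative only; NOT pmemd's actual interface.
module fake_gpu
  use iso_c_binding, only: c_int, c_double
  implicit none
contains
  ! Stand-in for a GPU launcher; in the real build this role is played by
  ! extern "C" code compiled by nvcc that launches CUDA kernels and copies
  ! the results back to the host arrays.
  subroutine gpu_square(n, x, y) bind(c, name='gpu_square')
    integer(c_int), value :: n
    real(c_double), intent(in)  :: x(n)
    real(c_double), intent(out) :: y(n)
    y = x * x
  end subroutine gpu_square
end module fake_gpu

program offload_sketch
  use iso_c_binding, only: c_int, c_double
  use fake_gpu, only: gpu_square
  implicit none
  integer(c_int), parameter :: n = 4
  real(c_double) :: crd(n), frc(n)
  crd = [1.0_c_double, 2.0_c_double, 3.0_c_double, 4.0_c_double]
  call gpu_square(n, crd, frc)      ! the "hotwired" computation
  print *, frc                      ! all I/O stays on the Fortran side
end program offload_sketch
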
>>
>> Scott
>>
>> On Wed, May 25, 2011 at 10:27 AM, Jason Swails <jason.swails.gmail.com> wrote:
>> > Just to chime in here -- NetCDF is hooked into pmemd and pmemd.cuda through
>> > the netcdf fortran module, NOT through statically/dynamically linked
>> > libraries.  The netcdf.mod file is taken from the AmberTools NetCDF
>> > installation, so libnetcdf.a and libnetcdff.a shouldn't be necessary in the
>> > config.h.
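
To illustrate the distinction being drawn here (a generic sketch, not Amber's
build logic; the file name and include path are placeholders): `use netcdf'
is a compile-time dependency satisfied by netcdf.mod, whereas the .a archives
only matter when the linker has to resolve the NetCDF symbols.

! netcdf_mod_demo.f90 -- generic illustration, not Amber build logic.
! Compiling needs the module file on the include path, e.g.
!   gfortran -I<directory containing netcdf.mod> -c netcdf_mod_demo.f90
! Linking then needs the nf90_* symbols resolved from whatever objects or
! archives the build system supplies.
program netcdf_mod_demo
  use netcdf          ! compile-time dependency: requires netcdf.mod
  implicit none
  integer :: ierr, ncid
  ierr = nf90_create('demo.nc', NF90_CLOBBER, ncid)   ! symbol resolved at link time
  if (ierr == NF90_NOERR) ierr = nf90_close(ncid)
end program netcdf_mod_demo
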
>> >
>> > Furthermore, I don't think any file writing is done through the CUDA code
>> > (just through the Fortran code), so you shouldn't need to add any NetCDF
>> > libraries to any CUDA variables.  Also, libraries only ever need to be
>> > included on the link line, so I'm not sure what changes you claim to have
>> > had to make and how they could have possibly been necessary to build
>> > pmemd.cuda...
>> >
>> > I've tested it on AmberTools 1.5 + Amber11, and it works fine for me without
>> > having to massage the config.h file any more than AT15_Amber11.py does
>> > already.
>> >
>> > All the best,
>> > Jason
>> >
>> > On Wed, May 25, 2011 at 12:38 PM, Ross Walker <ross.rosswalker.co.uk> wrote:
>> >
>> >> Hi Paul,
>> >>
>> >> > Attached are the test logs running the test suite on a CentOS 5.6
>> >> > machine
>> >> > with a Tesla C2070.
>> >>
>> >> The gb_ala3 case is, I believe, just an issue with the random number stream
>> >> being different. This can happen a lot with ntt=3 since the use of random
>> >> numbers changes depending on the hardware configuration. Working around
>> >> this, for test purposes, hurts performance a lot. This applies to this
>> >> failure (and to dhfr_ntb2_ntt3). The rest look to be just rounding
>> >> differences, so I think you are good. I am not sure why there are so many
>> >> differences, though, since the saved test case outputs should be based on
>> >> C2070s. Thus, given the modifications you made below, I would be a little
>> >> wary.
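
A hedged illustration of the random-stream point (generic Fortran using the
intrinsic random_number, nothing to do with AMBER's actual generator; the
"atoms" here are just array slots): the same seeded stream assigned to atoms
in a different order gives each atom different numbers, so an ntt=3
trajectory diverges between hardware configurations even though both runs
are equally valid.

! rng_order_demo.f90 -- generic illustration, not AMBER's random number code.
program rng_order_demo
  implicit none
  integer, parameter :: natom = 4
  real :: noise_a(natom), noise_b(natom), stream(2*natom)
  integer :: i, seed_size
  integer, allocatable :: seed(:)

  call random_seed(size=seed_size)
  allocate(seed(seed_size))
  seed = 12345

  ! Layout A: this "hardware configuration" hands each atom consecutive values.
  call random_seed(put=seed)
  call random_number(stream)
  noise_a = stream(1:natom)

  ! Layout B: same seed, same stream, but the values land on the atoms in a
  ! different (interleaved) order.
  call random_seed(put=seed)
  call random_number(stream)
  noise_b = stream(1:2*natom:2)

  do i = 1, natom
    print '(a,i0,2f10.6)', 'atom ', i, noise_a(i), noise_b(i)
  end do
end program rng_order_demo

Forcing an identical per-atom assignment across hardware configurations is
possible but, as noted above, costs performance.
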
>> >>
>> >> > There was a bit of massaging to get the cuda configuration to compile.
>> >> > In fact, I had to modify PMEMD_CU_FLAGS to include the archive libraries
>> >> > libnetcdf.a and libnetcdff.a, along with the full path to the libraries'
>> >> > installation directory.
>> >>
>> >> I am a little confused about why you needed to make these changes, unless
>> >> something in AmberTools 1.5 has messed up the GPU compilation. AMBER should
>> >> compile its own netcdf, so you should not need to make any changes here.
>> >> Did you also apply all the bugfixes? I guess I should find (invent) time to
>> >> try building AMBER 11 + patches + AmberTools 1.5 myself. What NVIDIA cuda
>> >> compiler version are you using?
>> >>
>> >> The development tree that AmberTools 1.5 was based off of works fine for me,
>> >> and there will be a big GPU update shortly, so I never bothered to test with
>> >> AmberTools 1.5. It is therefore possible someone messed something up with
>> >> the netcdf libraries etc., but I am still surprised you had to modify things
>> >> to get it to build.
>> >>
>> >> Does the rest of AMBER (AmberTools, AMBER CPU, etc.) build fine in serial
>> >> and parallel? You should really make sure all the CPU compilation and
>> >> tests run fine before trying to build the GPU version.
>> >>
>> >> All the best
>> >> Ross
>> >>
>> >> /\
>> >> \/
>> >> |\oss Walker
>> >>
>> >> ---------------------------------------------------------
>> >> |             Assistant Research Professor              |
>> >> |            San Diego Supercomputer Center             |
>> >> |             Adjunct Assistant Professor               |
>> >> |         Dept. of Chemistry and Biochemistry           |
>> >> |          University of California San Diego           |
>> >> |                     NVIDIA Fellow                     |
>> >> | http://www.rosswalker.co.uk | http://www.wmd-lab.org/ |
>> >> | Tel: +1 858 822 0854 | EMail:- ross.rosswalker.co.uk  |
>> >> ---------------------------------------------------------
>> >>
>> >> Note: Electronic Mail is not secure, has no guarantee of delivery, may not
>> >> be read every day, and should not be used for urgent or sensitive issues.
>> >>
>> >>
>> >>
>> >>
>> >>
>> >> _______________________________________________
>> >> AMBER mailing list
>> >> AMBER.ambermd.org
>> >> http://lists.ambermd.org/mailman/listinfo/amber
>> >>
>> >
>> >
>> >
>> > --
>> > Jason M. Swails
>> > Quantum Theory Project,
>> > University of Florida
>> > Ph.D. Candidate
>> > 352-392-4032
>> > _______________________________________________
>> > AMBER mailing list
>> > AMBER.ambermd.org
>> > http://lists.ambermd.org/mailman/listinfo/amber
>> >
>>
>> _______________________________________________
>> AMBER mailing list
>> AMBER.ambermd.org
>> http://lists.ambermd.org/mailman/listinfo/amber
>>
> _______________________________________________
> AMBER mailing list
> AMBER.ambermd.org
> http://lists.ambermd.org/mailman/listinfo/amber
>

_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Wed May 25 2011 - 14:00:03 PDT