Re: [AMBER] Test results for amber-cuda, single node, single GPU, Tesla C2070

From: Scott Le Grand <varelse2005.gmail.com>
Date: Wed, 25 May 2011 13:52:39 -0700

And of course I just noticed they shipped the final release. So
everything I said stands for verification with respect to AMBER 11, as
that's a done deal, but I suspect Ross will have to retest for AMBER 12
and, frustratingly, even for the next patch.

And that means I have to spend this weekend patching the radix sort,
which is broken under 4.0. Sigh.

Now, I'd personally stick with CUDA 3.2, as I trust it and CUDA 4.0
brings zero, zip, nada, null to the AMBER game. It does bring bragging
rights for using the newest and greatest toolkit, though (and I know
some people need that feeling, so, um, whatev').
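
(If you want to pin the build to 3.2 while 4.0 is also installed, here is a
minimal sketch -- assuming the 3.2 toolkit lives in /usr/local/cuda-3.2 and
that your configure step picks the toolkit up from CUDA_HOME; adjust paths to
your own install:

  export CUDA_HOME=/usr/local/cuda-3.2
  export PATH=$CUDA_HOME/bin:$PATH
  export LD_LIBRARY_PATH=$CUDA_HOME/lib64:$LD_LIBRARY_PATH
  nvcc --version    # should report release 3.2

Set these before configuring so the 3.2 nvcc and libraries are the ones
found.)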

Scott

On Wed, May 25, 2011 at 1:47 PM, Scott Le Grand <varelse2005.gmail.com> wrote:
> Using CUDA 4.0 is your problem.  The test cases used CUDA 3.2.  CUDA
> 4.0 is a prerelease build (RC2) undergoing beta testing, and it's
> therefore unreliable IMO at this time (Hear me now, believe me later).
> I would suggest using CUDA 3.2 for the immediate future.  If you wish
> to be a beta tester, then of course, be my guest, but since the
> compilers change constantly, you're going to have to eyeball these
> tests and decide for yourself whether the build is stable or not.  The
> other option is to verify using DPDP and take SPDP's correctness as
> an article of faith if DPDP passes.
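>
> (To make that concrete -- a sketch only, with hypothetical test paths and
> file names; the DPDP build is selected at configure time (something like
> -cuda_DPDP, producing a pmemd.cuda_DPDP binary -- check the names in your
> own tree). Rerun one of the flagged tests with it and diff against the
> saved output:
>
>   cd $AMBERHOME/test/cuda/some_failing_test            # hypothetical test dir
>   $AMBERHOME/bin/pmemd.cuda_DPDP -O -i mdin -p prmtop -c inpcrd -o mdout.dpdp
>   diff mdout.dpdp mdout.saved                          # saved reference output
>
> If DPDP matches essentially to the last digit, the remaining SPDP deviations
> are almost certainly just single-precision rounding plus compiler churn.)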
>
>
>
> On Wed, May 25, 2011 at 1:35 PM, Paul Rigor <paul.rigor.uci.edu> wrote:
>> Hi gang,
>>
>> The parallel and serial CPU versions compile just fine. The GPU
>> installation directions are also out of sync, i.e., they refer to a
>> "cuda_parallel" target, whereas the distributed Makefile does not
>> contain such a target.
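>>
>> (For what it's worth, a quick way to list which targets the shipped
>> Makefile actually provides -- assuming the top-level one under
>> $AMBERHOME/src:
>>
>>   grep -E '^[A-Za-z_.]+:' $AMBERHOME/src/Makefile
>>
>> which is how I confirmed there is no cuda_parallel target in my copy.)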
>>
>> I'm using the latest CUDA toolkit (4.x) with the latest drivers for the
>> C2070. Our sysadmin should have obtained the latest amber11/AmberTools
>> 1.5.0 sources as well (but I'll double-check).
>>
>>  CUDA Driver Version:                           4.0
>>  CUDA Capability Major/Minor version number:    2.0
>>  Device has ECC support enabled:                Yes
>>
>>
>> The packaged NetCDF was not being linked correctly and we were getting
>> undefined reference errors. Its compilation and library generation
>> themselves work fine:
>>
>> /usr/bin/install -c .libs/libnetcdf.a /extra/dock2/tools/amber/md/11-1.5.0/AmberTools/lib/libnetcdf.a
>> chmod 644 /extra/dock2/tools/amber/md/11-1.5.0/AmberTools/lib/libnetcdf.a
>> ranlib /extra/dock2/tools/amber/md/11-1.5.0/AmberTools/lib/libnetcdf.a
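>>
>> (A quick sanity check on that archive, using the install path from the log
>> above, is to confirm it actually contains the expected entry points before
>> blaming the link line:
>>
>>   ar t /extra/dock2/tools/amber/md/11-1.5.0/AmberTools/lib/libnetcdf.a | head
>>   nm /extra/dock2/tools/amber/md/11-1.5.0/AmberTools/lib/libnetcdf.a | grep nc_create
>>
>> If nc_create and friends are defined there, the library itself is fine and
>> the question is only how it gets onto the link line.)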
>>
>>
>> But then we get the following errors when linking pmemd.cuda:
>>
>>
>> make[3]: Leaving directory
>> `/extra/dock2/tools/amber/md/11-1.5.0/src/pmemd/src/cuda'
>> gfortran   -DCUDA -o pmemd.cuda gbl_constants.o gbl_datatypes.o state_info.o
>> file_io_dat.o mdin_ctrl_dat.o mdin_ewald_dat.o mdin_debugf_dat.o
>> prmtop_dat.o inpcrd_dat.o dynamics_dat.o img.o parallel_dat.o parallel.o
>> gb_parallel.o pme_direct.o pme_recip_dat.o pme_slab_recip.o pme_blk_recip.o
>> pme_slab_fft.o pme_blk_fft.o pme_fft_dat.o fft1d.o bspline.o pme_force.o
>> pbc.o nb_pairlist.o nb_exclusions.o cit.o dynamics.o bonds.o angles.o
>> dihedrals.o extra_pnts_nb14.o runmd.o loadbal.o shake.o prfs.o mol_list.o
>> runmin.o constraints.o axis_optimize.o gb_ene.o veclib.o gb_force.o timers.o
>> pmemd_lib.o runfiles.o file_io.o bintraj.o pmemd_clib.o pmemd.o random.o
>> degcnt.o erfcfun.o nmr_calls.o nmr_lib.o get_cmdline.o master_setup.o
>> pme_alltasks_setup.o pme_setup.o ene_frc_splines.o gb_alltasks_setup.o
>> nextprmtop_section.o angles_ub.o dihedrals_imp.o cmap.o charmm.o
>> charmm_gold.o -L/usr/local/cuda/lib64 -L/usr/local/cuda/lib -lcufft -lcudart
>> ./cuda/cuda.a
>> mdin_ctrl_dat.o: In function `__mdin_ctrl_dat_mod__bcast_mdin_ctrl_dat':
>> mdin_ctrl_dat.f90:(.text+0x98d8): undefined reference to `mpi_bcast_'
>> mdin_ctrl_dat.f90:(.text+0x9903): undefined reference to `mpi_bcast_'
>> mdin_ewald_dat.o: In function `__mdin_ewald_dat_mod__bcast_mdin_ewald_dat':
>> mdin_ewald_dat.f90:(.text+0x1c7f): undefined reference to `mpi_bcast_'
>> mdin_ewald_dat.f90:(.text+0x1cab): undefined reference to `mpi_bcast_'
>> mdin_debugf_dat.o: In function
>> `__mdin_debugf_dat_mod__bcast_mdin_debugf_dat':
>> mdin_debugf_dat.f90:(.text+0x2eb): undefined reference to `mpi_bcast_'
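>>
>> (Those mpi_bcast_ references suggest the objects were compiled with MPI
>> enabled but are being linked without any MPI library -- most likely stale
>> .o files left over from an earlier parallel build. A minimal check and
>> recovery, assuming the paths from the log above:
>>
>>   cd /extra/dock2/tools/amber/md/11-1.5.0/src/pmemd/src
>>   nm mdin_ctrl_dat.o | grep -i mpi_    # shows whether MPI symbols are referenced
>>   cd ../..
>>   make clean                           # drop objects from any earlier MPI build
>>
>> then re-run configure for the serial CUDA target and rebuild.)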
>>
>> Thanks for your help!!
>>
>> Paul
>> --
>> Paul Rigor
>> http://www.ics.uci.edu/~prigor
>>
>>
>>
>> On Wed, May 25, 2011 at 10:54 AM, Scott Le Grand <varelse2005.gmail.com> wrote:
>>
>>> There is no I/O in the CUDA code whatsoever.  All control of overall
>>> program flow and execution is done through the existing FORTRAN
>>> infrastructure.  The CUDA code merely hotwires the calculation and
>>> tricks the FORTRAN into thinking it's doing it itself through a
>>> variety of jiggery pokery.  That's my story and I'm sticking to it.
>>>
>>> Scott
>>>
>>> On Wed, May 25, 2011 at 10:27 AM, Jason Swails <jason.swails.gmail.com>
>>> wrote:
>>> > Just to chime in here -- NetCDF is hooked into pmemd and pmemd.cuda
>>> > through the netcdf fortran module, NOT through statically/dynamically
>>> > linked libraries.  The netcdf.mod file is taken from the AmberTools
>>> > NetCDF installation, so libnetcdf.a and libnetcdff.a shouldn't be
>>> > necessary in the config.h.
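>>> >
>>> > (A quick way to see what the configure step actually put there --
>>> > assuming the generated config.h ends up under $AMBERHOME/src:
>>> >
>>> >   grep -inE 'netcdf|PMEMD_CU' $AMBERHOME/src/config.h
>>> >
>>> > Any hand-added -lnetcdf/-lnetcdff entries or edits to PMEMD_CU_FLAGS
>>> > will show up in that output.)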
>>> >
>>> > Furthermore, I don't think any file writing is done through the CUDA code
>>> > (just through the Fortran code), so you shouldn't need to add any NetCDF
>>> > libraries to any CUDA variables.  Also, libraries only ever need to be
>>> > included on the link line, so I'm not sure what changes you claim to have
>>> > had to make and how they could have possibly been necessary to build
>>> > pmemd.cuda...
>>> >
>>> > I've tested it on AmberTools 1.5 + Amber11, and it works fine for me
>>> > without having to massage the config.h file any more than
>>> > AT15_Amber11.py does already.
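>>> >
>>> > (Roughly the sequence that worked for me, from memory -- treat it as a
>>> > sketch and double-check against the posted Amber 11 + AmberTools 1.5
>>> > instructions:
>>> >
>>> >   cd $AMBERHOME                    # both tarballs extracted here, all bugfixes applied
>>> >   ./AT15_Amber11.py                # merges the AmberTools 1.5 config into Amber 11
>>> >   cd AmberTools/src && ./configure -cuda gnu
>>> >   cd ../../src && make cuda        # builds pmemd.cuda
>>> >
>>> > No config.h edits should be needed along the way.)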
>>> >
>>> > All the best,
>>> > Jason
>>> >
>>> > On Wed, May 25, 2011 at 12:38 PM, Ross Walker <ross.rosswalker.co.uk> wrote:
>>> >
>>> >> Hi Paul,
>>> >>
>>> >> > Attached are the test logs running the test suite on a CentOS 5.6
>>> >> > machine
>>> >> > with a Tesla C2070.
>>> >>
>>> >> The gb_ala3 case is, I believe, just an issue with the random number
>>> >> stream being different. This can happen a lot with ntt=3, since the use
>>> >> of random numbers changes depending on the hardware configuration, and
>>> >> working around this for test purposes hurts performance a lot. This
>>> >> failure (and dhfr_ntb2_ntt3) falls into that category. The rest look to
>>> >> be just rounding differences, so I think you are good. I am not sure why
>>> >> there are so many differences though, since the saved test case outputs
>>> >> should be based on C2070s. Thus, given the modifications you made below,
>>> >> I would be a little wary.
>>> >>
>>> >> > There was a bit of massaging to compile the cuda config. In fact, I
>>> >> > had to modify PMEMD_CU_FLAGS to include the archive libraries
>>> >> > libnetcdf.a and libnetcdff.a, along with the full path to where the
>>> >> > libraries are installed.
>>> >>
>>> >> I am a little confused why you needed to make these changes, unless
>>> >> something in AmberTools 1.5 has messed up the GPU compilation. AMBER
>>> >> should compile its own NetCDF, so you should not need to make any
>>> >> changes here. Did you also apply all the bugfixes? - I guess I should
>>> >> find (invent) time to try building AMBER 11 + patches + AmberTools 1.5
>>> >> myself. - What NVIDIA CUDA compiler version are you using?
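>>> >>
>>> >> (The quickest way to answer that last question, on the machine doing
>>> >> the build:
>>> >>
>>> >>   nvcc --version                      # CUDA toolkit/compiler release
>>> >>   cat /proc/driver/nvidia/version     # driver version in use
>>> >>
>>> >> That will tell us whether it is the 3.2 toolkit or 4.0.)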
>>> >>
>>> >> The development tree that AmberTools 1.5 was based off of works fine
>>> >> for me, and there will be a big GPU update shortly, so I never bothered
>>> >> to test with AmberTools 1.5. It is possible someone messed something up
>>> >> with the NetCDF libraries etc., but I am still surprised you had to
>>> >> modify things to get it to build.
>>> >>
>>> >> Does the rest of AMBER (AmberTools and the AMBER CPU code, etc.) build
>>> >> fine in serial and parallel? You should really make sure all the CPU
>>> >> compilation and tests run fine before trying to build the GPU version.
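>>> >>
>>> >> (For the CPU-side check, something along these lines -- exact test
>>> >> targets may differ in your tree, so treat it as a sketch:
>>> >>
>>> >>   cd $AMBERHOME/test
>>> >>   make test                          # serial CPU tests
>>> >>   export DO_PARALLEL="mpirun -np 2"
>>> >>   make test.parallel                 # parallel CPU tests, if built
>>> >>
>>> >> Only move on to the pmemd.cuda tests once those come back clean.)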
>>> >>
>>> >> All the best
>>> >> Ross
>>> >>
>>> >> /\
>>> >> \/
>>> >> |\oss Walker
>>> >>
>>> >> ---------------------------------------------------------
>>> >> |             Assistant Research Professor              |
>>> >> |            San Diego Supercomputer Center             |
>>> >> |             Adjunct Assistant Professor               |
>>> >> |         Dept. of Chemistry and Biochemistry           |
>>> >> |          University of California San Diego           |
>>> >> |                     NVIDIA Fellow                     |
>>> >> | http://www.rosswalker.co.uk | http://www.wmd-lab.org/ |
>>> >> | Tel: +1 858 822 0854 | EMail:- ross.rosswalker.co.uk  |
>>> >> ---------------------------------------------------------
>>> >>
>>> >> Note: Electronic Mail is not secure, has no guarantee of delivery, may
>>> >> not be read every day, and should not be used for urgent or sensitive
>>> >> issues.
>>> >>
>>> >
>>> >
>>> >
>>> > --
>>> > Jason M. Swails
>>> > Quantum Theory Project,
>>> > University of Florida
>>> > Ph.D. Candidate
>>> > 352-392-4032
>>>
>

_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Wed May 25 2011 - 14:00:05 PDT