Re: [AMBER] Failure kReduceSoluteCOM with GPU

From: Ross Walker <ross.rosswalker.co.uk>
Date: Wed, 27 Jul 2011 10:28:18 -0700

Hi Fabricio,

Please take a look at the following, which explains what md5sums are:
http://en.wikipedia.org/wiki/Md5sum

In summary, md5sum creates an 'almost' unique fingerprint of a file. Thus if
I run md5sum on the files in my directory and you run md5sum on the files in
your directory, we can compare the fingerprints produced. If they are the
same then we know the files are identical. At the end of this message is the
list of md5sums for the files in my cuda directory, which represents the
current, fully up-to-date released copy of AMBER with all bugfixes applied.
You should go to your machine and do the following:

cd $AMBERHOME/src
make clean
cd pmemd/src/cuda
md5sum *

Then see if the fingerprint given (the string of letters and numbers before
each file name) matches the one I list below for each file. If they all match
then we know your patches were all applied correctly and your system may be
highlighting a real bug in the code. Note that the GTX275 and GTX460 are VERY
different chip architectures, which is why a subtle bug such as this may
manifest itself on only one card and not the other.

All the best
Ross

foo.linux-jh9j:~/amber11_as_of_jul_22/src/pmemd/src/cuda> md5sum *
md5sum: B40C: Is a directory
f4ed79de194d836246009d5c29051574 cuda_info.fpp
a9e4f660fcb5347b1273a8e3f76d3e74 gpu.cpp
307e64e078aa5f1f22bd78fd224c9f4b gpu.h
9e6a4f93e46046cda29369feb0dd32e8 gputypes.cpp
46f8ccf2bbee063ff35a73945b16a3a2 gputypes.h
90ba8d068522a00074707a529469f5ea kCalculateGBBornRadii.cu
97fbbcfb8a3833509d94072ecab05643 kCalculateGBNonbondEnergy1.cu
79fb7a5bba2a19ba351a7dd5996d31fc kCalculateGBNonbondEnergy2.cu
67a458e51a76162edbcc907e7135500c kCalculateLocalForces.cu
ce308f4fbe9468d5505beb0099d58e76 kCalculatePMENonbondEnergy.cu
9b240d418e391a71b590e6dc3bc3b0ff kCCF.h
5561a56bc236291cb87b4770453d67a4 kCLF.h
86f220029e3a943a186ebcfd16e2dcd9 kCPNE.h
9905ed2e705bccf1ae705279d85d0e57 kForcesUpdate.cu
edf2d74af7a4d401ccecc7bfa6d036c3 kNeighborList.cu
fd65d023597024a68565c5a0e5ffd86c kNTPKernels.h
49f952b429618228fca8e23f44223c58 kPGGW.h
4aea91b87cbb3cf62b9fddafe607ab48 kPGS.h
9c5951cdf94402d2c0396b74498f72f5 kPMEInterpolation.cu
46f01611524128ea428c069ef58bd421 kPSSE.h
ada7d510598c88ed4adb8d32a9dbf73d kRandom.h
eefe9bd32e04ba2bbe2eb5611a6464bd kShake.cu
b07e184d2840ffae27d8af5415fae04a kU.h
6947e1fae477c0bb9c637062a0ddbfd8 Makefile
e5a6173273e6812669c21abcd1530226 Makefile.advanced

> -----Original Message-----
> From: Fabrício Bracht [mailto:bracht.iq.ufrj.br]
> Sent: Wednesday, July 27, 2011 8:53 AM
> To: AMBER Mailing List; Scott Brozell
> Subject: Re: [AMBER] Failure kReduceSoluteCOM with GPU
>
> Hi,
> I've only found $AMBERHOME/AmberTools/src/configure.rej .
> I've checked the files that were supposed to be patched by bugfix.11,
> but wasn't able to confirm if they were patched or not due to my lack
> of programming knowledge. Any tips here?
> One other thing. Why is it that this simulation ran successfully on my
> GTX275 computer but has problems with my GTX460?
> Thank you
> Fabrício
>
> 2011/7/27 Scott Brozell <sbrozell.rci.rutgers.edu>:
> > Hi,
> >
> > The patch command should create a reject file: blabla.rej.
> > So look for files with a rej extension.
> > Also since in bugfix 11 there are only a few files to be patched in
> > src/pmemd/src/cuda, you could look at those files to see if the
> > patch has been applied:
> > http://ambermd.org/bugfixes/11.0/bugfix.11
> >
> > scott
> >
> > On Tue, Jul 26, 2011 at 10:07:28AM -0300, Fabrício Bracht wrote:
> >> Hi Scott. How do I check if this specific bugfix has been applied
> >> correctly? Would it be something like md5sum * in
> >> $AMBERHOME/src/pmemd/src/cuda/ . And what should I look for?
> >> Thank you
> >> Fabrício
> >>
> >> 2011/7/26 Scott Brozell <sbrozell.rci.rutgers.edu>:
> >> > Hi,
> >> >
> >> > This looks like a problem addressed by bugfix.11.
> >> > I have not been following your threads closely,
> >> > but I read that you were having problems with the bugfixes.
> >> > You might inspect the files listed in bugfix.11 to determine
> >> > whether the bugfixes were really applied, while you are waiting
> >> > for someone who has been following your threads closely to reply.
> >> >
> >> > scott
> >> >
> >> > On Tue, Jul 26, 2011 at 12:44:10AM -0300, Fabrício Bracht wrote:
> >> >> Since I was finally able to compile amber11 with cuda support for
> >> >> my gtx460, I thought everything was fine, but it seems that now I
> >> >> have to set a few things in order to get my system running again.
> >> >> Let me explain more.
> >> >> I was simulating a protein inside a micelle. I had a few tens of
> >> >> nanoseconds simulated on a gtx275. The system is composed of water,
> >> >> organic solvent, surfactant, counterions and my protein (approx.
> >> >> 60000 atoms). When I tried to start a simulation using my restart
> >> >> files from the GTX275 on my gtx460 machine, I got the following error.
> >> >> Error: unspecified launch failure launching kernel kReduceSoluteCOM
> >> >> cudaFree GpuBuffer::Deallocate failed unspecified launch failure
> >> >>
> >> >> I thought it might have something to do with a problem in the
> >> >> restart file or something like this, so I recreated the inpcrd and
> >> >> prmtop files for the last configuration and tried to start a fresh
> >> >> run on my gtx460 machine. Well, it didn't work out. I got the same
> >> >> error lines again.
> >> >> Here is my configuration file.
> >> >> MD parameters
> >> >>  &cntrl
> >> >>   imin   = 0,
> >> >>   irest  = 1,
> >> >>   ntx    = 7,
> >> >>   ntb    = 2, pres0 = 1.0, ntp = 1, taup = 2.0,
> >> >>   cut    = 9.0,
> >> >>   ntr    = 1,
> >> >>   ntc    = 2,
> >> >>   ntf    = 2,
> >> >>   tempi  = 300.0,
> >> >>   temp0  = 300.0,
> >> >>   ntt    = 3,
> >> >>   gamma_ln = 1.0,
> >> >>   nstlim = 5000000, dt = 0.002,
> >> >>   ntpr = 10000, ntwx = 10000, ntwr = 1000
> >> >>  /
> >> >> Restraints
> >> >> 5.0
> >> >> RES 1 317
> >> >> END
> >> >> END
> >> >>
> >> >> And here is the command line:
> >> >> pmemd.cuda -O -i md.in -c micel2.3.inpcrd -p micel2.3.prmtop -r md3.rst -o md3.out -ref micela2.3.inpcrd -inf md3.info -x md3.mdcrd
> >> >
> >
> > _______________________________________________
> > AMBER mailing list
> > AMBER.ambermd.org
> > http://lists.ambermd.org/mailman/listinfo/amber
> >
>


Received on Wed Jul 27 2011 - 10:30:04 PDT