[AMBER] Re: Error: unspecified launch failure launching kernel kReduceForces

From: Giovanni Pavan <giovanni.pavan.supsi.ch>
Date: Mon, 2 Apr 2012 17:13:42 +0200

Dear Aron,
thank you for your reply.
But honestly, I am not sure that this is a memory issue.
In fact, the system is not that big: only 135685 atoms.

Moreover, the same GPU cards have always worked, even for larger systems, so...
Any other suggestions? :-)
Best
giovanni

Dr. Giovanni M. Pavan

Laboratory of Applied Mathematics and Physics (LaMFI)
University of Applied Sciences of Southern Switzerland (SUPSI)
Centro Galleria 2, Manno 6928, Switzerland.
e-mail: giovanni.pavan.supsi.ch
web: http://www.dti.supsi.ch/~pavan/
skype: giovanni_pavan
phone: +41 58 666 65 60

-----Original Message-----
From: Aron Broom [mailto:broomsday.gmail.com]
Sent: Monday, 2 April 2012 15:47
To: giovanni.pavan.supsi.ch; AMBER Mailing List
Subject: Re: [AMBER] Error: unspecified launch failure launching kernel kReduceForces

You should test the memory of your cards. You can get a memory tester from
the team that makes OpenMM (https://simtk.org/home/memtest). I've seen the
same error on a GTX580, but it never occurred on a GTX570 or M2070. Another
way to check whether memory is the problem is to see if you can start an MD
simulation of a much smaller system on those cards (maybe just one water
molecule, to be extreme).

~Aron

On Mon, Apr 2, 2012 at 6:34 AM, Giovanni Pavan <giovanni.pavan.supsi.ch> wrote:

> Dear all,
>
>
>
> I am contacting you because I am getting a strange error during a
> simulation on GPU (the same error occurs on both a C2050 and a GTX580).
>
> My system contains a gadolinium-DOTA molecule; the Gd atom has 8
> coordination bonds parametrized according to the literature.
>
> The system successfully passes the minimization step with reasonable
> energies, and everything seems to be OK.
>
> When MD is started, I get the following error:
>
>
>
> Error: unspecified launch failure launching kernel kReduceForces
> cudaFree GpuBuffer::Deallocate failed unspecified launch failure
> At line 109 of file inpcrd_dat.f90 (unit = 9, file = 'heat1_G3_DOTA.rst')
> Fortran runtime error: End of file
> STOP PMEMD Terminated Abnormally!
>
>
>
> The input file for the first MD step is:
>
>
>
> &cntrl
> imin=0,
> irest=0,
> ntx=1,
> nstlim=10000,
> dt=0.001,
> ntc=2,
> ntf=2,
> cut=8.0,
> ntb=1,
> ntpr=1000,
> ntwx=1000,
> ntt=3,
> gamma_ln=2.0,
> temp0=60,
> ntr=1,
> ig=-1,
> ioutfm=1,
> /
> 2.0
>
> RES 1 109
> END
>
> END
>
>
>
> The strange thing is that the error is the same with or without
> iwrap=1 (which was recently indicated on this mailing list as a
> possible source of such an error:
> http://archive.ambermd.org/201111/0272.html).
>
> Another strange thing is that if the same MD step is run with pmemd
> instead of pmemd.cuda, no error is reported.
>
> Curiously, however, the reported energies are quite different:
> pmemd reports the "usual/safe" negative values, while pmemd.cuda
> reports a "worse" energetic situation.
>
>
>
> PMEMD.CUDA run:
>
>
>
>  NSTEP =        0   TIME(PS) =       0.000  TEMP(K) =     0.00  PRESS =     0.0
>  Etot   =   1108141.2453  EKtot   =         0.0000  EPtot      =   1108141.2453
>  BOND   =      9126.0102  ANGLE   =      3780.4845  DIHED      =      2107.0903
>  1-4 NB =      1401.2256  1-4 EEL =    -42904.2608  VDWAALS    =   1773510.4445
>  EELEC  =   -638879.7492  EHBOND  =         0.0000  RESTRAINT  =         0.0000
>  ------------------------------------------------------------------------------
>
>
>
> Normal PMEMD run:
>
>
>
>  NSTEP =        0   TIME(PS) =       0.000  TEMP(K) =     0.00  PRESS =     0.0
>  Etot   =   -585383.3566  EKtot   =         0.0000  EPtot      =   -585383.3566
>  BOND   =      9126.0102  ANGLE   =      3780.4845  DIHED      =      2107.0903
>  1-4 NB =      1401.2256  1-4 EEL =    -42904.2608  VDWAALS    =     72165.7215
>  EELEC  =   -631059.6280  EHBOND  =         0.0000  RESTRAINT  =         0.0000
>  Ewald error estimate:   0.2749E-03
>  ------------------------------------------------------------------------------
>
>
>
> This looks like a pmemd/pmemd.cuda problem rather than something wrong
> with the system (but maybe I am wrong).
>
> How is it possible that, starting from the very same configuration, the
> same system reports such large energies?
>
> Is this somehow related to the error reported above?
>
> Am I missing something?
>
> I really hope that you have some suggestions on all this.
>
> Looking forward to receiving your feedback on this issue.
>
> Have a nice day,
>
> Bye
>
> giovanni
>
>
>
> Dr. Giovanni M. Pavan
>
>
>
> Laboratory of Applied Mathematics and Physics (LaMFI)
>
> University of Applied Sciences of Southern Switzerland (SUPSI)
>
> Centro Galleria 2, Manno 6928, Switzerland.
>
> e-mail: giovanni.pavan.supsi.ch
>
> web: http://www.dti.supsi.ch/~pavan/
>
> skype: giovanni_pavan
>
> phone: +41 58 666 65 60
>
>
>
> _______________________________________________
> AMBER mailing list
> AMBER.ambermd.org
> http://lists.ambermd.org/mailman/listinfo/amber
>
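
For what it's worth, the two step-0 energy records quoted above can be compared term by term. The following is only a minimal Python sketch: the numbers are copied by hand from the quoted pmemd.cuda and pmemd output, while the dictionary names, the helper function differing_terms and the tolerance are illustrative assumptions, not part of Amber.

```python
# Step-0 energy terms (kcal/mol), copied by hand from the quoted output.
cuda = {
    "BOND": 9126.0102, "ANGLE": 3780.4845, "DIHED": 2107.0903,
    "1-4 NB": 1401.2256, "1-4 EEL": -42904.2608,
    "VDWAALS": 1773510.4445, "EELEC": -638879.7492,
    "EHBOND": 0.0, "RESTRAINT": 0.0,
}
cpu = {
    "BOND": 9126.0102, "ANGLE": 3780.4845, "DIHED": 2107.0903,
    "1-4 NB": 1401.2256, "1-4 EEL": -42904.2608,
    "VDWAALS": 72165.7215, "EELEC": -631059.6280,
    "EHBOND": 0.0, "RESTRAINT": 0.0,
}


def differing_terms(a, b, tol=1e-3):
    """Return {term: a - b} for every term whose values differ by more than tol.

    Hypothetical helper for illustration only; tol is an arbitrary choice.
    """
    return {k: a[k] - b[k] for k in a if abs(a[k] - b[k]) > tol}


diff = differing_terms(cuda, cpu)
for term, delta in diff.items():
    print(f"{term:8s} cuda - cpu = {delta:14.4f}")

# Sanity check: the listed terms should add up to the reported EPtot.
print(f"sum(cuda) = {sum(cuda.values()):.4f}")
print(f"sum(cpu)  = {sum(cpu.values()):.4f}")
```

With the quoted values, all bonded and 1-4 terms agree and only VDWAALS and EELEC differ: VDWAALS by roughly 1.7 million kcal/mol and EELEC by roughly 7800 kcal/mol. That points at the nonbonded part of the calculation rather than the bonded parameters, although it does not by itself say which code is right.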



--
Aron Broom M.Sc
PhD Student
Department of Chemistry
University of Waterloo
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Mon Apr 02 2012 - 08:30:04 PDT