[AMBER] CUDA NaN error occurring at "wrapping first mol"

From: Bill Sinko <wsinko.ucsd.edu>
Date: Wed, 30 Nov 2011 15:54:31 -0800

I have noticed an error in pmemd.cuda that occurs at the first instance
of the words "wrapping first mol." in the mdout file for larger systems
(~60,000 atoms). After this error the restart and coordinate files
are filled with NaN. The same system has been run out to 160 ns with
no error using pmemd.mpi. I have run smaller systems (10,000 to 15,000
atoms) out to 1 microsecond with the same pmemd.cuda executable on the
same computer with no error. "wrapping first mol." is printed numerous
times for the smaller systems but does not cause problems.

I am running pmemd.cuda from amber11 with the latest bugfixes up to 19
and cudatoolkit 4.0.17, on a machine with 2 quad-core Intel(R) Xeon(R)
CPU X5472 @ 3.00GHz processors. The error was seen using a GTX570, a
GTX580 (3 GB memory), and a Tesla C2050 (running on a separate computer
with 2 quad-core Intel(R) Xeon(R) CPU W5580 @ 3.20GHz processors).

The error is reproducible: given the same ig value it occurs at exactly
the same time, and given a random ig value it occurs as soon as the
first "wrapping first mol." message appears. Turning iwrap off does not
fix the problem; the error then occurs at the same time point at which
"wrapping first mol." appeared with iwrap on.

Below is the input file, and first instance of error in the output
file. I also attached the test log file from when I compiled. Any
help is much appreciated.


Thanks,

Bill


&cntrl

 timlim = 999999., nmropt = 0, imin = 0,
 ntx = 5, irest = 1, ntrx = 1, ntxo = 1,
 ntpr = 5000, ntwx = 5000, ntwv = 0, ntwe = 0,
 ioutfm = 1, ntwr = 5000,

 ntf = 2, ntb = 1,
 igb = 0,
 cut = 9, nsnb = 20,

 nstlim = 25000000, nscm = 2500, iwrap = 1,
 t = 0.0, dt = 0.002,

 temp0 = 300.0, tempi = 200.0, tautp=0.5, ig = -1,
 heat = 0.0, ntt = 1,

 ntc = 2, tol = 0.00001, jfastw = 0,

 ibelly=0, ntr=0,

&end


Everything is fine up to this point, and the words "wrapping first mol."
have not yet occurred. Here is the mdout where the error starts:


 NSTEP = 345000 TIME(PS) = 6470.000 TEMP(K) = 300.63 PRESS = 0.0
 Etot = -142135.0768 EKtot = 36637.2070 EPtot = -178772.2838
 BOND = 1471.6496 ANGLE = 3908.0882 DIHED = 5198.8718
 1-4 NB = 1769.8981 1-4 EEL = 17988.7091 VDWAALS = 19870.0613
 EELEC = -228979.5620 EHBOND = 0.0000 RESTRAINT = 0.0000
 ------------------------------------------------------------------------------

check COM velocity, temp: 0.000021 0.00(Removed)
check COM velocity, temp: NaN NaN(Removed)
wrapping first mol.: NaN NaN NaN
wrapping first mol.: NaN NaN NaN

 NSTEP = 350000 TIME(PS) = 6480.000 TEMP(K) = NaN PRESS = 0.0
 Etot = NaN EKtot = NaN EPtot = -124095.8889
 BOND = 0.0000 ANGLE = 955000.0368 DIHED = 0.0000
 1-4 NB = 0.0000 1-4 EEL = 0.0000 VDWAALS = -1374.3325
 EELEC = -1077721.5932 EHBOND = 0.0000 RESTRAINT = 0.0000
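
For anyone trying to narrow down where a run like this goes bad, here is a minimal sketch of a script that scans an mdout file for the first line containing NaN and reports the last NSTEP printed before it. It is not part of Amber; the function name first_nan and the regular expressions are my own assumptions, based only on the record layout shown in the excerpt above.

```python
import re

# Assumed layout, taken from the excerpt above: energy records carry
# "NSTEP = <integer>", and bad values are printed literally as "NaN".
NSTEP_RE = re.compile(r"NSTEP\s*=\s*(\d+)")
NAN_RE = re.compile(r"\bNaN\b", re.IGNORECASE)


def first_nan(lines):
    """Find the first line that contains NaN.

    Returns (last_clean_nstep, line_number, text), where last_clean_nstep
    is the most recent NSTEP seen on a NaN-free line (or None if no such
    line was seen). Returns None if no line contains NaN.
    """
    last_clean_nstep = None
    for line_number, line in enumerate(lines, start=1):
        # Check for NaN first so that a record such as
        # "NSTEP = 350000 ... TEMP(K) = NaN" is not counted as clean.
        if NAN_RE.search(line):
            return last_clean_nstep, line_number, line.rstrip("\n")
        match = NSTEP_RE.search(line)
        if match:
            last_clean_nstep = int(match.group(1))
    return None


# Example use (the file name "mdout" is a placeholder):
#
#     with open("mdout") as handle:
#         print(first_nan(handle))
```

On the excerpt above this would point at the second "check COM velocity" line, with 345000 as the last clean step, which matches the observation that the NaN shows up before the first "wrapping first mol." line is printed.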




--
William Sinko
Biomedical Sciences Graduate Student
Professor J. Andrew McCammon Group
Howard Hughes Medical Institute
University of California, San Diego
9500 Gilman Drive
La Jolla, California 92093




_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber

Received on Wed Nov 30 2011 - 16:00:02 PST