[AMBER] Re: CUDA NaN error occurring at "wrapping first mol"

From: Giovanni Pavan <giovanni.pavan.supsi.ch>
Date: Thu, 1 Dec 2011 10:20:08 +0100

Dear Bill,
I recently ran into a similar problem with pmemd.cuda (you can find the
discussion in the mail archive).
I was also using restraints.
In my case the problem was related to iwrap=1.
After removing that setting, everything worked perfectly.
Bye
g

Dr. Giovanni Maria Pavan

SUPSI - Laboratory of Applied Mathematics and Physics (LaMFI)
Centro Galleria 2, Manno 6928, Switzerland.
e-mail: giovanni.pavan.supsi.ch
skype: giovanni_pavan
phone: +41 58 666 65 60

-----Original Message-----
From: Bill Sinko [mailto:wsinko.ucsd.edu]
Sent: Thursday, 1 December 2011 00:55
To: amber.ambermd.org
Subject: [AMBER] CUDA NaN error occurring at "wrapping first mol"

I have noticed an error occurring in pmemd.cuda at the first instance of the
words "wrapping first mol." in the mdout file for larger systems (~60,000
atoms). After this error the restart and coordinate files are filled with NaN.
This same system has been run out to 160 ns with no error using pmemd.mpi. I
have run smaller systems (10,000 to 15,000 atoms) out to 1 microsecond with
the same pmemd.cuda executable on the same computer with no error.
"wrapping first mol." is output numerous times in the smaller systems but does
not cause problems.
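
One quick way to confirm that a restart file has been overwritten with NaN is to scan its coordinate fields. The following is a minimal sketch, assuming a formatted (ntxo = 1) restart with two header lines followed by 12-character fixed-width fields; the function name count_nan_fields and the file name in the usage note are hypothetical.

```python
def count_nan_fields(lines, width=12):
    """Count fixed-width fields that read as NaN.

    Assumes the first two lines are the title and the atom-count header,
    and that every later line is a row of `width`-character fields.
    """
    count = 0
    for line in lines[2:]:
        line = line.rstrip("\n")
        # Slice the row into fixed-width fields rather than splitting on
        # whitespace, because adjacent fields may have no space between them.
        fields = [line[i:i + width] for i in range(0, len(line), width)]
        count += sum(1 for field in fields if field.strip().lower() == "nan")
    return count
```

For example, `count_nan_fields(open("md.rst").readlines())` returns 0 for a healthy file and a positive count once the coordinates have been corrupted.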

I am running pmemd.cuda from amber11 with the latest bugfixes (up to 19) and
cudatoolkit 4.0.17, on a machine with 2 quad core Intel(R) Xeon(R) CPU X5472 @
3.00GHz. This error was seen using a GTX570, a GTX580 (3 GB memory), and a
Tesla C2050 (running on a separate computer with 2 quad core
Intel(R) Xeon(R) CPU W5580 @ 3.20GHz).

The error is reproducible: it occurs at exactly the same time given the same
ig value, and given a random ig value it occurs as soon as the first
"wrapping first mol." appears. Turning iwrap off does not fix the problem; it
occurs at the same time point at which "wrapping first mol." occurred with
iwrap on.
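
To rerun the same input with a fixed seed or with iwrap switched off, the namelist values can be edited by script. This is a minimal sketch, assuming simple `key = value` pairs separated by commas as in the input file in this message; the helper set_namelist_value and the seed 12345 are hypothetical.

```python
import re


def set_namelist_value(text, key, value):
    """Return `text` with the value of `key` replaced by `value`.

    Word boundaries keep `ig` from matching inside `igb`.
    Raises KeyError if the key is not present.
    """
    pattern = re.compile(
        r"\b" + re.escape(key) + r"\b(\s*=\s*)[^,\s]+", re.IGNORECASE
    )
    # A function replacement avoids any backslash interpretation of `value`.
    new_text, n = pattern.subn(lambda m: f"{key}{m.group(1)}{value}", text)
    if n == 0:
        raise KeyError(key)
    return new_text
```

For example, `set_namelist_value(set_namelist_value(mdin, "iwrap", 0), "ig", 12345)` gives an input that differs from the original only in those two settings, so the two runs can be compared directly.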

Below are the input file and the first instance of the error in the output
file. I have also attached the test log file from when I compiled. Any help is
much appreciated.


Thanks,

Bill


&cntrl

 timlim = 999999., nmropt = 0, imin = 0,
 ntx = 5, irest = 1, ntrx = 1, ntxo = 1,
 ntpr = 5000, ntwx = 5000, ntwv = 0, ntwe = 0,
 ioutfm = 1, ntwr = 5000,

 ntf = 2, ntb = 1,
 igb = 0,
 cut = 9, nsnb = 20,

 nstlim = 25000000, nscm = 2500, iwrap = 1,
 t = 0.0, dt = 0.002,

 temp0 = 300.0, tempi = 200.0, tautp=0.5, ig = -1,
 heat = 0.0, ntt = 1,

 ntc = 2, tol = 0.00001, jfastw = 0,

 ibelly=0, ntr=0,

&end


Everything is fine up to this point, and the words "wrapping first mol." have
not yet occurred. Here is the mdout where the error starts.


 NSTEP =   345000   TIME(PS) =    6470.000  TEMP(K) =   300.63  PRESS =     0.0
 Etot   =   -142135.0768  EKtot   =     36637.2070  EPtot      =   -178772.2838
 BOND   =      1471.6496  ANGLE   =      3908.0882  DIHED      =      5198.8718
 1-4 NB =      1769.8981  1-4 EEL =     17988.7091  VDWAALS    =     19870.0613
 EELEC  =   -228979.5620  EHBOND  =         0.0000  RESTRAINT  =         0.0000
 ------------------------------------------------------------------------------
check COM velocity, temp:        0.000021     0.00(Removed)
check COM velocity, temp:             NaN      NaN(Removed)
wrapping first mol.:            NaN            NaN            NaN
wrapping first mol.:            NaN            NaN            NaN
 NSTEP =   350000   TIME(PS) =    6480.000  TEMP(K) =      NaN  PRESS =     0.0
 Etot   =            NaN  EKtot   =            NaN  EPtot      =   -124095.8889
 BOND   =         0.0000  ANGLE   =    955000.0368  DIHED      =         0.0000
 1-4 NB =         0.0000  1-4 EEL =         0.0000  VDWAALS    =     -1374.3325
 EELEC  =  -1077721.5932  EHBOND  =         0.0000  RESTRAINT  =         0.0000
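
To locate the first bad record in a long mdout without reading it by eye, the energy lines can be scanned for NaN. This is a minimal sketch, assuming the `NAME = value` layout shown in the excerpt above with each record's values on the same line as their names; the function name first_nan_record is hypothetical. Lines without an equals sign, such as the "check COM velocity" and "wrapping first mol." messages, are ignored.

```python
import re
import sys

# One "NAME = value" pair; names such as "1-4 NB" or "TIME(PS)" may contain
# spaces, digits and parentheses, so the name is anything up to the "=".
FIELD = re.compile(r"([^=]+?)\s*=\s*(\S+)")


def first_nan_record(lines):
    """Return (nstep, field_name) for the first NaN energy field, else None."""
    step = None
    for line in lines:
        for name, value in FIELD.findall(line):
            name = name.strip()
            if name == "NSTEP":
                step = int(value)  # remember which record we are in
            elif value.lower() == "nan":
                return step, name
    return None


if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1]) as handle:
        print(first_nan_record(handle))
```

Run on the excerpt above, this reports step 350000 and the TEMP(K) field, which is the first energy value printed as NaN.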
--
William Sinko
Biomedical Sciences Graduate Student
Professor J. Andrew McCammon Group
Howard Hughes Medical Institute
University of California, San Diego
9500 Gilman Drive
La Jolla, California 92093
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Thu Dec 01 2011 - 01:30:04 PST