Re: [AMBER] NaN error on traj and output with AMBER CUDA

From: Jason Swails <jason.swails.gmail.com>
Date: Thu, 20 Jan 2011 20:46:06 -0500

On Thu, Jan 20, 2011 at 8:44 PM, Jason Swails <jason.swails.gmail.com> wrote:

> Hello,
>
> While Ross knows this code probably much better than I do, I think he
> missed something small (but seriously important in this case) regarding your
> email.
>
> The amber11's bugfixes no longer have coincidentally matching bugfixes.
> That is to say, the Amber11 bug fixes


That statement was awfully worded. Amber11 no longer has coincidentally
matching bugfixes, but saying so serves no purpose except to confuse
people. Take-home lesson: apply bugfix.12 (if you run into problems,
start with a fresh tree and apply all the bug fixes to it).

Don't forget to recompile!
--Jason
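
A quick way to tell whether a run falls into the case described in the
quoted message (cutoff > 8) is to read `cut` out of the mdin file. What
follows is only a minimal Python sketch, not part of Amber: the helper name
read_cut is made up for illustration, and it assumes a simple &cntrl
namelist with a plain decimal value, like the input file quoted below.

```python
import re

def read_cut(mdin_text):
    """Return the value of `cut` from namelist text, or None if it is not set."""
    # Require that "cut" is not preceded by a letter, digit, or underscore,
    # so variable names that merely end in "cut" are ignored.
    match = re.search(r"(?<![A-Za-z0-9_])cut\s*=\s*([0-9]*\.?[0-9]+)",
                      mdin_text, re.IGNORECASE)
    return float(match.group(1)) if match else None

# Trimmed-down example input (illustrative only).
example = """
&cntrl
  imin = 0, irest = 1, ntx = 5,
  cut = 10.0, ntr = 0,
/
"""

cut = read_cut(example)
if cut is not None and cut > 8.0:
    print(f"cut = {cut}: larger than 8, the case bugfix 12 is said to address")
```

If `cut` is not set in the file, the helper returns None rather than
guessing a default.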

> now go up to 12 (you say you applied up to 11).
>
> The 12th bugfix addresses these issues when you use a cutoff value > 8
> (which you are; yours is 10).
>
> Apply bugfix 12 and all should be well.
>
> Good luck!
> Jason
>
>
> On Thu, Jan 20, 2011 at 4:14 PM, Bongkeun Kim <bkim.chem.ucsb.edu> wrote:
>
>> Hello,
>>
>> I got a NaN error when I ran pmemd.cuda and pmemd.cuda.mpi after about 50 ns.
>> The log file is like:
>>
>> NSTEP = 1465000 TIME(PS) = 52980.000 TEMP(K) = 358.79 PRESS = 71.4
>> Etot = -62655.3195 EKtot = 27682.3184 EPtot = -90337.6379
>> BOND = 2126.8615 ANGLE = 1531.3712 DIHED = 1681.7735
>> 1-4 NB = 8574.2946 1-4 EEL = 1833.2170 VDWAALS = 8865.3186
>> EELEC = -114950.4742 EHBOND = 0.0000 RESTRAINT = 0.0000
>> EKCMT = 12293.6612 VIRIAL = 11676.7751 VOLUME = 399930.2222
>> Density = 0.9998
>>
>>
>> ------------------------------------------------------------------------------
>>
>> wrapping first mol.: -31.3208124120934 0.00000000000000 0.00000000000000
>> wrapping first mol.: -31.3208124120934 0.00000000000000 0.00000000000000
>>
>> NSTEP = 1470000 TIME(PS) = 52990.000 TEMP(K) = 362.41 PRESS = 48.4
>> Etot = -62667.6518 EKtot = 27961.6172 EPtot = -90629.2690
>> BOND = 2136.8358 ANGLE = 1550.7648 DIHED = 1682.5454
>> 1-4 NB = 8527.4693 1-4 EEL = 1853.5058 VDWAALS = 8696.1619
>> EELEC = -115076.5520 EHBOND = 0.0000 RESTRAINT = 0.0000
>> EKCMT = 12447.5954 VIRIAL = 12029.4233 VOLUME = 400265.4168
>> Density = 0.9990
>>
>>
>> ------------------------------------------------------------------------------
>>
>> wrapping first mol.: NaN NaN NaN
>> wrapping first mol.: NaN NaN NaN
>>
>> NSTEP = 1475000 TIME(PS) = 53000.000 TEMP(K) = NaN PRESS = NaN
>> Etot = NaN EKtot = NaN EPtot = NaN
>> BOND = ************** ANGLE = 585786.5880 DIHED = 0.0000
>> 1-4 NB = 0.0000 1-4 EEL = 0.0000 VDWAALS = -662.1176
>> EELEC = NaN EHBOND = 0.0000 RESTRAINT = 0.0000
>> EKCMT = 0.0000 VIRIAL = NaN VOLUME = NaN
>> Density = NaN
>>
>>
>> ------------------------------------------------------------------------------
>>
>>
>>
>> It was really strange. I set T = 325 K, and this was well maintained at
>> the beginning, but at a certain point the temperature started rising and
>> I finally got the NaN error. When I checked the last rst file before the
>> NaN error, it contained no coordinates or velocities for the water
>> molecules, and the box size was bigger than the one at the beginning.
>> +++++++++++++++++++++++++++++++++++++++
>> 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000
>> 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000
>> 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000
>> 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000
>> 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000
>> 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000
>> 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000
>> 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000
>> 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000
>> 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000
>> 31.3730000 80.7640000 158.3730000 90.0000000 90.0000000 90.0000000
>> +++++++++++++++++++++++++++++++++++++++++
>>
>> This is the last part of the rst file from the previous run.
>> ++++++++++++++++++++++++++++
>> 0.2813319 0.2859586 0.1069026 -0.2630481 0.7645880 0.1471529
>> -0.8100536 1.2586927 0.1523881 0.2990605 0.1620192 0.0976196
>> -0.0732898 1.1917989 -1.0429825 0.2014995 0.3834629 -0.1202106
>> 0.0276703 -0.2488241 -0.2628807 -0.2085400 0.4762971 0.4179272
>> -0.3814862 -0.2374063 -0.2416039 0.0699310 -0.0610051 -0.1580978
>> 0.9372542 1.0430179 -0.7452719 0.3271696 -0.9559725 -0.3386399
>> 0.2260832 0.0151047 0.1283436 1.2348834 -1.0930565 0.2119684
>> -0.7740772 0.0938291 0.2359591 0.2605087 0.0407511 -0.3941893
>> 2.2260764 -0.6258161 0.5861404 -0.4234042 0.2330984 -0.6828126
>> 85.0975010 80.6688215 55.6648514 90.0000000 90.0000000 90.0000000
>> +++++++++++++++++++++++++++++++
>>
>> My input file is this:
>> ++++++++++++++++++++++++
>> &cntrl
>> imin = 0, irest = 1, ntx = 5,
>> ntb = 2, pres0 = 1.0, ntp = 2,
>> taup = 2.0, iwrap=1,
>> cut = 10.0, ntr = 0,
>> ntc = 2, ntf = 2,
>> tempi = 325.0, temp0 = 325.0,
>> ntt = 3, gamma_ln = 1.0,
>> nstlim = 5000000, dt = 0.002,
>> ntpr = 5000, ntwx = 5000, ntwr = 5000
>> /
>> +++++++++++++++++++++++++
>>
>> I am using Amber 11 with bugfix 11.
>> Please let me know if you have any ideas for avoiding this problem.
>> Thank you.
>> Bongkeun Kim
>> bkim.chem.ucsb.edu
>>
>>
>>
>>
>> _______________________________________________
>> AMBER mailing list
>> AMBER.ambermd.org
>> http://lists.ambermd.org/mailman/listinfo/amber
>>
>
>
>
> --
> Jason M. Swails
> Quantum Theory Project,
> University of Florida
> Ph.D. Graduate Student
> 352-392-4032
>
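
To locate where a run like the one quoted above first goes bad, the energy
records in the mdout file can be scanned for NaN or overflow (asterisk)
fields. This is a minimal Python sketch that assumes the mdout layout shown
in the quoted log; the function name first_bad_step is hypothetical and
this is not an Amber tool.

```python
import re

NSTEP_RE = re.compile(r"NSTEP\s*=\s*(\d+)")

def first_bad_step(lines):
    """Return the NSTEP of the first energy record with NaN or '****', else None."""
    step = None
    for line in lines:
        m = NSTEP_RE.search(line)
        if m:
            # A new energy record starts here; remember its step number.
            step = int(m.group(1))
        # Energy lines are "NAME = value" pairs; lines without "=" (such as
        # the "wrapping first mol." messages) are not part of a record.
        is_energy_line = "=" in line
        if step is not None and is_energy_line and ("NaN" in line or "****" in line):
            return step
    return None

# Illustrative lines modeled on the quoted log.
sample = [
    " NSTEP = 1470000 TIME(PS) = 52990.000 TEMP(K) = 362.41 PRESS = 48.4",
    " Etot = -62667.6518 EKtot = 27961.6172 EPtot = -90629.2690",
    " wrapping first mol.: NaN NaN NaN",
    " NSTEP = 1475000 TIME(PS) = 53000.000 TEMP(K) = NaN PRESS = NaN",
    " Etot = NaN EKtot = NaN EPtot = NaN",
]
print(first_bad_step(sample))
```

For a real file, pass the open file object instead of the sample list, for
example first_bad_step(open("md.out")), where md.out is a placeholder name.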



-- 
Jason M. Swails
Quantum Theory Project,
University of Florida
Ph.D. Graduate Student
352-392-4032
Received on Thu Jan 20 2011 - 18:00:04 PST