Hi all,
As a matter of fact, even with those bug fixes I observed a very
similar problem. At some point Amber 11 (a fresh installation with all
bug fixes applied) produced NaNs in the restart file. There is in fact
a workaround that works with our GTX 480 card. The method is simple:
divide the simulation into shorter segments and run those segments
consecutively, waiting at least 10 minutes between them so the card
can cool down to its normal temperature. I know this is very weird,
but it worked for us. I just wanted to let everyone who has similar
problems know.
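
For anyone who wants to script this, here is a minimal sketch of the segmented run, not a tested production script. It assumes the standard pmemd command-line flags (-O, -i, -p, -c, -o, -r, -x); the file names (md_001.in, prmtop, md_000.rst and so on) and the helper names are hypothetical placeholders. Each segment's input file would need its own nstlim equal to that segment's length, and segment 1 starts from a restart file you supply as md_000.rst.

```python
import shlex
import subprocess
import time


def plan_chunks(total_steps, steps_per_chunk):
    """Split total_steps into consecutive chunk lengths (the last may be shorter)."""
    if total_steps <= 0 or steps_per_chunk <= 0:
        raise ValueError("step counts must be positive")
    full, rest = divmod(total_steps, steps_per_chunk)
    return [steps_per_chunk] * full + ([rest] if rest else [])


def chunk_command(index, prmtop="prmtop", prefix="md"):
    """Build the pmemd.cuda command line for chunk `index` (1-based).

    Chunk N restarts from the restart file written by chunk N-1.
    All file names here are hypothetical placeholders.
    """
    return [
        "pmemd.cuda", "-O",
        "-i", f"{prefix}_{index:03d}.in",
        "-p", prmtop,
        "-c", f"{prefix}_{index - 1:03d}.rst",
        "-o", f"{prefix}_{index:03d}.out",
        "-r", f"{prefix}_{index:03d}.rst",
        "-x", f"{prefix}_{index:03d}.mdcrd",
    ]


def run_chunks(n_chunks, cooldown_s=600, dry_run=True):
    """Run chunks in order, sleeping cooldown_s seconds between them."""
    for i in range(1, n_chunks + 1):
        cmd = chunk_command(i)
        if dry_run:
            # Only show what would be run.
            print(shlex.join(cmd))
        else:
            # check=True stops the loop at the first failed chunk.
            subprocess.run(cmd, check=True)
            if i < n_chunks:
                time.sleep(cooldown_s)
```

For example, plan_chunks(5000000, 500000) gives ten segments of 500000 steps, and run_chunks(10) prints the ten commands without running anything. The 600-second default is just the 10-minute wait mentioned above.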
Best,
peker milas
On Thu, Jan 20, 2011 at 7:08 PM, Bongkeun Kim <bkim.chem.ucsb.edu> wrote:
> Hello,
>
> I'm compiling Amber 11 from clean source with the recent bugfix 12.
> In a day or two I will see whether the error still occurs.
> By the way, this is the only error I get from pmemd.cuda and pmemd.cuda.mpi.
> Thank you.
> Bongkeun Kim
>
> Quoting Jason Swails <jason.swails.gmail.com>:
>
>> Hello,
>>
>> While Ross knows this code probably much better than I do, I think he missed
>> something small (but seriously important in this case) regarding your email.
>>
>> The amber11's bugfixes no longer have coincidentally matching bugfixes.
>> That is to say, the Amber11 bug fixes now go up to 12 (you say you applied
>> up to 11).
>>
>> The 12th bugfix addresses these issues when you use a cutoff value > 8
>> (which you are; yours is 10).
>>
>> Apply bugfix 12 and all should be well.
>>
>> Good luck!
>> Jason
>>
>> On Thu, Jan 20, 2011 at 4:14 PM, Bongkeun Kim <bkim.chem.ucsb.edu> wrote:
>>
>>> Hello,
>>>
>>> I got a NaN error after about 50 ns when running pmemd.cuda and pmemd.cuda.mpi.
>>> The log file looks like this:
>>>
>>> NSTEP = 1465000 TIME(PS) = 52980.000 TEMP(K) = 358.79 PRESS = 71.4
>>> Etot = -62655.3195 EKtot = 27682.3184 EPtot = -90337.6379
>>> BOND = 2126.8615 ANGLE = 1531.3712 DIHED = 1681.7735
>>> 1-4 NB = 8574.2946 1-4 EEL = 1833.2170 VDWAALS = 8865.3186
>>> EELEC = -114950.4742 EHBOND = 0.0000 RESTRAINT = 0.0000
>>> EKCMT = 12293.6612 VIRIAL = 11676.7751 VOLUME = 399930.2222
>>> Density = 0.9998
>>>
>>>
>>> ------------------------------------------------------------------------------
>>>
>>> wrapping first mol.: -31.3208124120934 0.00000000000000 0.00000000000000
>>> wrapping first mol.: -31.3208124120934 0.00000000000000 0.00000000000000
>>>
>>> NSTEP = 1470000 TIME(PS) = 52990.000 TEMP(K) = 362.41 PRESS = 48.4
>>> Etot = -62667.6518 EKtot = 27961.6172 EPtot = -90629.2690
>>> BOND = 2136.8358 ANGLE = 1550.7648 DIHED = 1682.5454
>>> 1-4 NB = 8527.4693 1-4 EEL = 1853.5058 VDWAALS = 8696.1619
>>> EELEC = -115076.5520 EHBOND = 0.0000 RESTRAINT = 0.0000
>>> EKCMT = 12447.5954 VIRIAL = 12029.4233 VOLUME = 400265.4168
>>> Density = 0.9990
>>>
>>>
>>> ------------------------------------------------------------------------------
>>>
>>> wrapping first mol.: NaN NaN NaN
>>> wrapping first mol.: NaN NaN NaN
>>>
>>> NSTEP = 1475000 TIME(PS) = 53000.000 TEMP(K) = NaN PRESS = NaN
>>> Etot = NaN EKtot = NaN EPtot = NaN
>>> BOND = ************** ANGLE = 585786.5880 DIHED = 0.0000
>>> 1-4 NB = 0.0000 1-4 EEL = 0.0000 VDWAALS = -662.1176
>>> EELEC = NaN EHBOND = 0.0000 RESTRAINT = 0.0000
>>> EKCMT = 0.0000 VIRIAL = NaN VOLUME = NaN
>>> Density = NaN
>>>
>>>
>>> ------------------------------------------------------------------------------
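
A log like the one above can be scanned automatically to find where the run went bad. The following is a minimal sketch, assuming the energy records look like the excerpt quoted here (an "NSTEP = ..." line followed by "NAME = value" fields); the function name and regular expressions are my own and not part of Amber.

```python
import re

# "NSTEP = 1475000" starts a new energy record.
_STEP_RE = re.compile(r"NSTEP\s*=\s*(\d+)")
# A field is bad if its value is NaN or has overflowed into asterisks.
_BAD_RE = re.compile(r"=\s*(?:NaN\b|\*+)")


def first_bad_step(lines):
    """Return (last_good_step, first_bad_step), or None if no bad field is found.

    last_good_step is None when the very first record is already bad.
    Lines without a "NAME = value" field (such as the "wrapping first mol."
    messages) are ignored.
    """
    previous = None
    current = None
    for line in lines:
        match = _STEP_RE.search(line)
        if match:
            previous, current = current, int(match.group(1))
        if current is not None and _BAD_RE.search(line):
            return previous, current
    return None
```

Called on the lines of an output file, for example first_bad_step(open("md.out")) with a hypothetical file name, it reports the last clean NSTEP, which tells you which restart file is still worth keeping.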
>>>
>>>
>>>
>>> It was really strange. I set T = 325 K, and this was well maintained at
>>> the beginning, but at a certain point the temperature started rising and
>>> finally I got the NaN error. When I checked the last rst file before the
>>> NaN error, there were no coordinates or velocities for the water
>>> molecules, and the box size was bigger than the one at the beginning.
>>> +++++++++++++++++++++++++++++++++++++++
>>> 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000
>>> 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000
>>> 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000
>>> 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000
>>> 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000
>>> 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000
>>> 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000
>>> 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000
>>> 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000
>>> 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000
>>> 31.3730000 80.7640000 158.3730000 90.0000000 90.0000000 90.0000000
>>> +++++++++++++++++++++++++++++++++++++++++
>>>
>>> This is the last part of the rst file from the previous run.
>>> ++++++++++++++++++++++++++++
>>> 0.2813319 0.2859586 0.1069026 -0.2630481 0.7645880 0.1471529
>>> -0.8100536 1.2586927 0.1523881 0.2990605 0.1620192 0.0976196
>>> -0.0732898 1.1917989 -1.0429825 0.2014995 0.3834629 -0.1202106
>>> 0.0276703 -0.2488241 -0.2628807 -0.2085400 0.4762971 0.4179272
>>> -0.3814862 -0.2374063 -0.2416039 0.0699310 -0.0610051 -0.1580978
>>> 0.9372542 1.0430179 -0.7452719 0.3271696 -0.9559725 -0.3386399
>>> 0.2260832 0.0151047 0.1283436 1.2348834 -1.0930565 0.2119684
>>> -0.7740772 0.0938291 0.2359591 0.2605087 0.0407511 -0.3941893
>>> 2.2260764 -0.6258161 0.5861404 -0.4234042 0.2330984 -0.6828126
>>> 85.0975010 80.6688215 55.6648514 90.0000000 90.0000000 90.0000000
>>> +++++++++++++++++++++++++++++++
>>>
>>> My input file is this:
>>> ++++++++++++++++++++++++
>>> &cntrl
>>> imin = 0, irest = 1, ntx = 5,
>>> ntb = 2, pres0 = 1.0, ntp = 2,
>>> taup = 2.0, iwrap=1,
>>> cut = 10.0, ntr = 0,
>>> ntc = 2, ntf = 2,
>>> tempi = 325.0, temp0 = 325.0,
>>> ntt = 3, gamma_ln = 1.0,
>>> nstlim = 5000000, dt = 0.002,
>>> ntpr = 5000, ntwx = 5000, ntwr = 5000
>>> /
>>> +++++++++++++++++++++++++
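
Since the reply in this thread ties the problem to a cutoff above 8 combined with a missing bugfix 12, a quick check of the input file can flag that combination before a long run. This is only a sketch: the parser handles simple "name = value" pairs inside &cntrl as in the input above, not the full Fortran namelist syntax, and both function names are hypothetical.

```python
import re


def parse_cntrl(text):
    """Extract name = value pairs from the &cntrl namelist of an mdin file.

    Values that look numeric are returned as floats, others as strings.
    Only simple scalar assignments are handled.
    """
    block = re.search(r"&cntrl\b(.*?)(?:^\s*/|&end)", text, re.S | re.M | re.I)
    if block is None:
        raise ValueError("no &cntrl namelist found")
    params = {}
    for name, raw in re.findall(r"(\w+)\s*=\s*([^,\s]+)", block.group(1)):
        try:
            params[name.lower()] = float(raw)
        except ValueError:
            params[name.lower()] = raw
    return params


def cutoff_needs_bugfix_12(params, applied_bugfix):
    """True if cut is set above 8.0 and fewer than 12 bug fixes are applied.

    If cut is not set explicitly, nothing is checked and False is returned.
    """
    cut = params.get("cut")
    return cut is not None and cut > 8.0 and applied_bugfix < 12
```

With the input file above (cut = 10.0) and bug fixes applied only up to 11, cutoff_needs_bugfix_12 returns True; with bugfix 12 applied it returns False.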
>>>
>>> I am using Amber 11 with bug fixes up to 11.
>>> Please let me know if you have any ideas that could help me avoid this problem.
>>> Thank you.
>>> Bongkeun Kim
>>> bkim.chem.ucsb.edu
>>>
>>>
>>>
>>>
>>> _______________________________________________
>>> AMBER mailing list
>>> AMBER.ambermd.org
>>> http://lists.ambermd.org/mailman/listinfo/amber
>>>
>>
>>
>>
>> --
>> Jason M. Swails
>> Quantum Theory Project,
>> University of Florida
>> Ph.D. Graduate Student
>> 352-392-4032
>>
>
>
>
>
>
>
Received on Thu Jan 20 2011 - 20:30:04 PST