Re: [AMBER] NaN and asterisks error in md.out, and mdinfo files

From: Ross Walker <ross.rosswalker.co.uk>
Date: Tue, 05 Aug 2014 10:55:44 -0700

Ok - thanks.

Please send me your prmtop, inpcrd file and mdin file and I will see if I
can replicate this.
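
For anyone following along and comparing two runs made with the same explicit ig, a minimal Python sketch for pinning down the first failing step is given here. It assumes energy records laid out like the mdinfo excerpts quoted further down (an "NSTEP = ..." line followed by "NAME = value" fields). The function names first_bad_step and same_failure_point are made up for illustration and are not part of AMBER.

```python
import re

# Start of an energy record, e.g. "NSTEP = 117500".
NSTEP_RE = re.compile(r"NSTEP\s*=\s*(\d+)")
# A field value that is NaN or a run of asterisks (a value that did not fit
# the Fortran output field is printed as asterisks).
BAD_RE = re.compile(r"=\s*(NaN|\*{3,})", re.IGNORECASE)


def first_bad_step(text):
    """Return the NSTEP of the first record with a NaN/asterisk field, else None."""
    current = None
    for line in text.splitlines():
        match = NSTEP_RE.search(line)
        if match:
            current = int(match.group(1))
        if current is not None and BAD_RE.search(line):
            return current
    return None


def same_failure_point(text_a, text_b):
    """True only if both outputs fail, and fail at the same NSTEP."""
    step_a = first_bad_step(text_a)
    step_b = first_bad_step(text_b)
    return step_a is not None and step_a == step_b
```

Reading the mdout of each run into a string and passing both to same_failure_point gives a quick yes/no on whether the NaNs show up at the same step. The step reported is only as fine-grained as the print interval (ntpr) of the runs.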

All the best
Ross
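
Since the version question comes up in the quoted exchange below, here is a rough Python sketch for pulling the reported version and date out of an mdout. It assumes the "Version ..." line and an mm/dd/yyyy date appear within a few lines after the "GPU (CUDA) Version of PMEMD in use" banner, as in the values pasted below; the exact layout may differ between releases, and the function name gpu_version_info is hypothetical.

```python
import re

BANNER = "GPU (CUDA) Version of PMEMD in use"
VERSION_RE = re.compile(r"Version\s+(\d+(?:\.\d+)*)")
DATE_RE = re.compile(r"\b(\d{2}/\d{2}/\d{4})\b")


def gpu_version_info(mdout_text, window=10):
    """Return (version, date) found after the GPU banner, or None if no banner.

    Only the `window` lines following the banner are searched (an assumption
    about the layout); either element may be None if it is not found there.
    """
    lines = mdout_text.splitlines()
    for i, line in enumerate(lines):
        if BANNER in line:
            version = None
            date = None
            for follow in lines[i + 1 : i + 1 + window]:
                if version is None:
                    match = VERSION_RE.search(follow)
                    if match:
                        version = match.group(1)
                if date is None:
                    match = DATE_RE.search(follow)
                    if match:
                        date = match.group(1)
            return version, date
    return None
```

A return value of None means the banner was not found at all, which would suggest the run was not made with the GPU build of pmemd.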


On 8/5/14, 10:49 AM, "Hoshin Kim" <85hskim.gmail.com> wrote:

>Firstly, please confirm that you are using AMBER 14 with all the latest
>patches. Look in your mdout file for the section that begins:
>--------------------- INFORMATION ----------------------
>| GPU (CUDA) Version of PMEMD in use: NVIDIA GPU IN USE.
>And paste in the reported version and date here.
>Unfortunately, we are still using AMBER 12 with the latest bug fixes.
> Version 12.3.1
> 08/07/2013
>
>Next try running your heating and initial equilibration on the CPU and
>then switch to the GPU and see if that helps.
>I got the same error when I ran the MD simulations starting from minimization
>and equilibration steps performed on CPUs.
>
>Finally confirm that if you use the exact same input with the exact same
>random seed (set ig explicitly) that the situation yields NANs at exactly
>the same point. This is critical and will establish that it is a bug in
>the code and not a misbehaving GPU.
>
>When I ran MD using the exact same random seed (irest changed from 1 to 0,
>ntx from 5 to 1, ig from -1 to 71277), the error occurred at the exact same
>time step.
>Here are the mdinfo records right before and after the error occurred.
>
> NSTEP = 116500 TIME(PS) = 273.000 TEMP(K) = 299.88 PRESS = 0.0
> Etot = -290241.5794 EKtot = 73194.3125 EPtot = -363435.8919
> BOND = 4886.6630 ANGLE = 288.5639 DIHED = 357.0713
> 1-4 NB = -290.2780 1-4 EEL = 207.0313 VDWAALS = 52299.7656
> EELEC = -421184.7088 EHBOND = 0.0000 RESTRAINT = 0.0000
>
>------------------------------------------------------------------------------
>
>check COM velocity, temp: 0.000000 0.00(Removed)
>
>
> NSTEP = 117000 TIME(PS) = 274.000 TEMP(K) = 299.95 PRESS = 0.0
> Etot = -290251.4099 EKtot = 73211.3984 EPtot = -363462.8083
> BOND = 4843.7191 ANGLE = 306.5830 DIHED = 344.0662
> 1-4 NB = -282.1788 1-4 EEL = 209.5822 VDWAALS = 51954.7377
> EELEC = -420839.3177 EHBOND = 0.0000 RESTRAINT = 0.0000
>
>------------------------------------------------------------------------------
>
>wrapping first mol.: NaN NaN NaN
>wrapping first mol.: NaN NaN NaN
>
>
> NSTEP = 117500 TIME(PS) = 275.000 TEMP(K) = NaN PRESS = 0.0
> Etot = NaN EKtot = NaN EPtot = **************
> BOND = 0.0000 ANGLE = 70230.8688 DIHED = 0.0000
> 1-4 NB = 0.0000 1-4 EEL = 0.0000 VDWAALS = -1006.2814
> EELEC = ************** EHBOND = 0.0000 RESTRAINT = 0.0000
>
>------------------------------------------------------------------------------
>check COM velocity, temp: NaN NaN(Removed)
>wrapping first mol.: NaN NaN NaN
>wrapping first mol.: NaN NaN NaN
>
>Regards,
>
>Hoshin Kim
>
>
>On Mon, Aug 4, 2014 at 3:14 PM, Ross Walker <ross.rosswalker.co.uk> wrote:
>
>> Hi Hoshin,
>>
>> You are probably hitting some assumption we made in the GPU code.
>> Certainly I've never tried doing simulations of Gold surfaces with it and
>> that is way outside the scope of what most people would do.
>>
>> Firstly, please confirm that you are using AMBER 14 with all the latest
>> patches. Look in your mdout file for the section that begins:
>> --------------------- INFORMATION ----------------------
>> | GPU (CUDA) Version of PMEMD in use: NVIDIA GPU IN USE.
>>
>> And paste in the reported version and date here.
>>
>> Next try running your heating and initial equilibration on the CPU and
>> then switch to the GPU and see if that helps.
>>
>> Finally confirm that if you use the exact same input with the exact same
>> random seed (set ig explicitly) that the situation yields NANs at exactly
>> the same point. This is critical and will establish that it is a bug in
>> the code and not a misbehaving GPU.
>>
>> Once you have done this and have a reproducible test case that shows this
>> behavior please post it here and we can try to figure out what the problem
>> is.
>>
>> All the best
>> Ross
>>
>>
>> On 8/4/14, 12:04 PM, "Hoshin Kim" <85hskim.gmail.com> wrote:
>>
>> >Dear all,
>> >
>> >I am running MD simulations of DNA grafted on an Au surface, using an
>> >AMBER GPU computing system (Exxact, GTX 780).
>> >
>> >I am having a hard time performing the simulations because of the
>> >following error:
>> >
>> >When I run MD (I've tried both NVT and NPT conditions), all the
>> >information in md.restrt and some terms in mdinfo (md.out) suddenly turn
>> >into NaN, and NaN with asterisks, respectively, at a random time step.
>> >To investigate, I took a restrt file from right before the error occurred
>> >and reran MD. It worked fine the first time, but the identical error
>> >occurred again at a random time step.
>> >
>> >Here is an example of mdinfo file
>> > NSTEP = 49999500 TIME(PS) = 100039.000 TEMP(K) = NaN PRESS = 0.0
>> > Etot = NaN EKtot = NaN EPtot = **************
>> > BOND = 0.0000 ANGLE = 70230.8688 DIHED = 0.0000
>> > 1-4 NB = 0.0000 1-4 EEL = 0.0000 VDWAALS = **************
>> > EELEC = ************** EHBOND = 0.0000 RESTRAINT = 0.0000
>> >
>> >------------------------------------------------------------------------------
>> >
>> >Interestingly, no errors were observed when I ran the same MD simulations
>> >on the CPU instead of the GPU. Also, for simpler conditions (e.g. just DNA
>> >in water) using the same input parameters for minimization, heating, and
>> >production runs, I had no problems on the GPU.
>> >
>> >Regards,
>> >
>> >Hoshin
>> >_______________________________________________
>> >AMBER mailing list
>> >AMBER.ambermd.org
>> >http://lists.ambermd.org/mailman/listinfo/amber



Received on Tue Aug 05 2014 - 11:00:02 PDT