Re: [AMBER] NaN and asterisks error in md.out, and mdinfo files

From: Hoshin Kim <85hskim.gmail.com>
Date: Tue, 5 Aug 2014 14:08:27 -0400

Dear Dr. Walker,

Since the prmtop file is huge (55 MB), I can't send it by e-mail. I would
appreciate it if you could let me know the proper way to send these files
to you.

Thank you in advance for taking the time to help.

Regards,

Hoshin


On Tue, Aug 5, 2014 at 1:55 PM, Ross Walker <ross.rosswalker.co.uk> wrote:

> Ok - thanks.
>
> Please send me your prmtop, inpcrd file and mdin file and I will see if I
> can replicate this.
>
> All the best
> Ross
>
>
> On 8/5/14, 10:49 AM, "Hoshin Kim" <85hskim.gmail.com> wrote:
>
> >> Firstly, please confirm that you are using AMBER 14 with all the latest
> >> patches. Look in your mdout file for the section that begins:
> >> --------------------- INFORMATION ----------------------
> >> | GPU (CUDA) Version of PMEMD in use: NVIDIA GPU IN USE.
> >> And paste in the reported version and date here.
> >
> >Unfortunately, we are still using AMBER 12 with the latest bug fixes.
> > Version 12.3.1
> > 08/07/2013
> >
> >> Next try running your heating and initial equilibration on the CPU and
> >> then switch to the GPU and see if that helps.
> >
> >I got the same error when I ran the MD simulations after minimization and
> >equilibration steps performed on CPUs.
> >
> >> Finally confirm that if you use the exact same input with the exact same
> >> random seed (set ig explicitly) that the situation yields NANs at exactly
> >> the same point. This is critical and will establish that it is a bug in
> >> the code and not a misbehaving GPU.
> >
> >When I ran MD with the exact same random seed (changing irest=1 to 0,
> >ntx=5 to 1, and ig=-1 to 71277), the error occurred at the exact same
> >time step.
> >Here are the mdinfo files from right before and after the error occurred.
> >
> > NSTEP = 116500 TIME(PS) = 273.000 TEMP(K) = 299.88 PRESS = 0.0
> > Etot = -290241.5794 EKtot = 73194.3125 EPtot = -363435.8919
> > BOND = 4886.6630 ANGLE = 288.5639 DIHED = 357.0713
> > 1-4 NB = -290.2780 1-4 EEL = 207.0313 VDWAALS = 52299.7656
> > EELEC = -421184.7088 EHBOND = 0.0000 RESTRAINT = 0.0000
> >
> >------------------------------------------------------------------------------
> >
> >check COM velocity, temp: 0.000000 0.00(Removed)
> >
> >
> > NSTEP = 117000 TIME(PS) = 274.000 TEMP(K) = 299.95 PRESS = 0.0
> > Etot = -290251.4099 EKtot = 73211.3984 EPtot = -363462.8083
> > BOND = 4843.7191 ANGLE = 306.5830 DIHED = 344.0662
> > 1-4 NB = -282.1788 1-4 EEL = 209.5822 VDWAALS = 51954.7377
> > EELEC = -420839.3177 EHBOND = 0.0000 RESTRAINT = 0.0000
> >
> >------------------------------------------------------------------------------
> >
> >wrapping first mol.: NaN NaN NaN
> >wrapping first mol.: NaN NaN NaN
> >
> >
> > NSTEP = 117500 TIME(PS) = 275.000 TEMP(K) = NaN PRESS = 0.0
> > Etot = NaN EKtot = NaN EPtot = **************
> > BOND = 0.0000 ANGLE = 70230.8688 DIHED = 0.0000
> > 1-4 NB = 0.0000 1-4 EEL = 0.0000 VDWAALS = -1006.2814
> > EELEC = ************** EHBOND = 0.0000 RESTRAINT = 0.0000
> >
> >------------------------------------------------------------------------------
> >check COM velocity, temp: NaN NaN(Removed)
> >wrapping first mol.: NaN NaN NaN
> >wrapping first mol.: NaN NaN NaN
> >
> >Regards,
> >
> >Hoshin Kim
> >
> >
> >On Mon, Aug 4, 2014 at 3:14 PM, Ross Walker <ross.rosswalker.co.uk> wrote:
> >
> >> Hi Hoshin,
> >>
> >> You are probably hitting some assumption we made in the GPU code.
> >> Certainly I've never tried doing simulations of Gold surfaces with it and
> >> that is way outside the scope of what most people would do.
> >>
> >> Firstly, please confirm that you are using AMBER 14 with all the latest
> >> patches. Look in your mdout file for the section that begins:
> >> --------------------- INFORMATION ----------------------
> >> | GPU (CUDA) Version of PMEMD in use: NVIDIA GPU IN USE.
> >>
> >> And paste in the reported version and date here.
> >>
> >> Next try running your heating and initial equilibration on the CPU and
> >> then switch to the GPU and see if that helps.
> >>
> >> Finally confirm that if you use the exact same input with the exact same
> >> random seed (set ig explicitly) that the situation yields NANs at exactly
> >> the same point. This is critical and will establish that it is a bug in
> >> the code and not a misbehaving GPU.
> >>
> >> Once you have done this and have a reproducible test case that shows this
> >> behavior please post it here and we can try to figure out what the problem
> >> is.
> >>
> >> All the best
> >> Ross
> >>
> >>
> >> On 8/4/14, 12:04 PM, "Hoshin Kim" <85hskim.gmail.com> wrote:
> >>
> >> >Dear all,
> >> >
> >> >I am running MD simulations of DNA grafted onto an Au surface, using an
> >> >Amber GPU computing system (Exxact, GTX 780).
> >> >
> >> >I am having a hard time performing the MD simulations because of the
> >> >following error:
> >> >
> >> >When I run an MD simulation (I've tried both NVT and NPT conditions), all
> >> >values in md.restrt and some terms in mdinfo (md.out) suddenly turn into
> >> >NaN, and into NaN and asterisks, respectively, at a random time step.
> >> >To investigate, I took a restrt file from right before the error occurred
> >> >and reran MD. It worked fine the first time, but the identical error
> >> >occurred again at a random time step.
> >> >
> >> >Here is an example of the mdinfo file:
> >> > NSTEP = 49999500 TIME(PS) = 100039.000 TEMP(K) = NaN PRESS = 0.0
> >> > Etot = NaN EKtot = NaN EPtot = **************
> >> > BOND = 0.0000 ANGLE = 70230.8688 DIHED = 0.0000
> >> > 1-4 NB = 0.0000 1-4 EEL = 0.0000 VDWAALS = **************
> >> > EELEC = ************** EHBOND = 0.0000 RESTRAINT = 0.0000
> >> >
> >> >------------------------------------------------------------------------------
> >> >
> >> >Interestingly, no errors were observed when I ran the same MD simulations
> >> >on the CPU instead of the GPU. Also, for simpler systems (e.g. just DNA
> >> >in water) using the same input parameters for minimization, heating, and
> >> >production runs, I had no problems on the GPU.
> >> >
> >> >Regards,
> >> >
> >> >Hoshin
> >> >_______________________________________________
> >> >AMBER mailing list
> >> >AMBER.ambermd.org
> >> >http://lists.ambermd.org/mailman/listinfo/amber
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Tue Aug 05 2014 - 11:30:02 PDT
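
The reproducibility check discussed in this thread (rerun with the same input and an explicit ig, then see whether the NaNs appear at the same step) could be automated along the following lines. This is only a rough sketch, not part of AMBER: the function names first_bad_step and same_failure_point are hypothetical, and it assumes the energy records look like the mdinfo excerpts quoted above, with "NSTEP = <n>" starting each record and bad values printed as NaN or as a run of asterisks.

```python
import re

# Start of an energy record, e.g. " NSTEP = 117500 TIME(PS) = ..."
NSTEP_RE = re.compile(r"NSTEP\s*=\s*(\d+)")
# A value is treated as bad if it printed as NaN or as a run of asterisks.
BAD_RE = re.compile(r"\bNaN\b|\*{3,}")


def first_bad_step(text):
    """Return the first NSTEP whose energy record holds NaN/asterisks, else None.

    Only lines containing '=' are inspected, so diagnostic lines such as
    "wrapping first mol.: NaN NaN NaN" are not attributed to the previous record.
    """
    current = None
    for line in text.splitlines():
        match = NSTEP_RE.search(line)
        if match:
            current = int(match.group(1))
        if current is not None and "=" in line and BAD_RE.search(line):
            return current
    return None


def same_failure_point(text_a, text_b):
    """True if both runs fail, and fail at the same NSTEP."""
    step_a = first_bad_step(text_a)
    step_b = first_bad_step(text_b)
    return step_a is not None and step_a == step_b
```

Reading two mdout files with open(path).read() and passing the contents to same_failure_point would then indicate whether the two runs broke down at the same reported step. Note that this only resolves the failure to the nearest printed record, so the output frequency of the run limits how precisely the step is located.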