Re: [AMBER] GB simulation on GPU freezes

From: Scott Le Grand <varelse2005.gmail.com>
Date: Fri, 14 Oct 2011 15:29:39 -0700

If it doesn't lock up on the 2070, but does on the 480, it is likely
defective HW.

If it locks up on the 2070, and Ross can repro it on his 20xxs, I know what
I'll be doing this weekend :-)...

But shooting from the hip, I'm guessing this is a bad GPU.


On Fri, Oct 14, 2011 at 2:55 PM, Ross Walker <rosscwalker.gmail.com> wrote:

> Can you please send me the input files for one of the simulations that
> locks up, so I can try to reproduce it?
>
> Does it lock up on both the GTX480 and C2070?
>
> All the best
> Ross
>
>
>
> On Oct 14, 2011, at 15:28, "E. Nihal Korkmaz" <enihalkorkmaz.gmail.com>
> wrote:
>
> > Yes, I applied the bugfix patches during the first configuration of Amber
> > on the cluster, as directed on the Amber website.
> >
> > Not at the exact same point, but always after 500 ns for that particular
> > simulation.
> > I just realized it also locks up for different (shorter) proteins, at
> > around 200 ns. I simulate a series of the same protein under different
> > conditions (T and salt concentration); some run smoothly and some lock up.
> > I checked the energy logs in the *.out file: nothing seems unusual, and
> > nothing is drastically different between the simulations that run smoothly
> > and those that freeze.
> >
> > Thanks,
> > Nihal
> >
> > On Fri, Oct 14, 2011 at 2:15 PM, Ross Walker <rosscwalker.gmail.com>
> > wrote:
> >
> >> There are a lot of unnecessary defaults in your input file, like
> >> specifying taup for a GB run. You can probably also set ntwr much larger
> >> to improve performance. And a gamma_ln of 20 is probably a bit high. None
> >> of these should cause a lockup, though.
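
A rough way to spot such entries automatically is sketched below. This is only an illustration and not part of Amber: `check_cntrl` and its rule table are hypothetical, and the rules encode just two assumptions, namely that taup and pres0 only matter when ntp > 0 and that tautp only matters when ntt=1. Keys set more than once are reported as well.

```python
import re

# Hypothetical rule table (an assumption of this sketch, not an Amber API):
# key -> (switch key, predicate on the switch value that makes the key relevant)
RULES = {
    "taup": ("ntp", lambda ntp: ntp > 0),
    "pres0": ("ntp", lambda ntp: ntp > 0),
    "tautp": ("ntt", lambda ntt: ntt == 1),
}


def parse_cntrl(text):
    """Collect key=value pairs from an mdin namelist; also list repeated keys."""
    settings, duplicates = {}, []
    for key, value in re.findall(r"(\w+)\s*=\s*([-+.\w]+)", text):
        key = key.lower()
        if key in settings:
            duplicates.append(key)
        settings[key] = value  # the last occurrence wins in this sketch
    return settings, duplicates


def check_cntrl(text):
    """Return (keys that look unused under RULES, keys set more than once)."""
    settings, duplicates = parse_cntrl(text)
    unused = []
    for key, (switch, is_used) in RULES.items():
        # A missing switch is treated as 0 here (an assumption of this sketch).
        if key in settings and not is_used(float(settings.get(switch, 0))):
            unused.append(key)
    return sorted(unused), sorted(set(duplicates))


# A shortened version of the kind of input discussed in this thread.
example = """
 &cntrl
  ntb=0, ntt=3, gamma_ln=20, tautp=1,
  ntp=0, pres0=1, taup=1,
  ntb=0, igb=5, saltcon=0.15,
 &end
"""
unused, duplicates = check_cntrl(example)
print("possibly unused:", unused)
print("set more than once:", duplicates)
```

Flagged entries are harmless to leave in, so this is a tidiness check rather than a diagnosis of the lockup.
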
> >>
> >> Can you confirm that you are running with the latest bugfixes, in
> >> particular bugfix.17 for Amber 11?
> >>
> >> Also, does the calculation always lock up at the exact same point?
> >>
> >> All the best
> >> Ross
> >>
> >>
> >>
> >> On Oct 14, 2011, at 14:17, "E. Nihal Korkmaz" <enihalkorkmaz.gmail.com>
> >> wrote:
> >>
> >>> Amber 11. I tried on GeForce GTX 480 and Tesla C2070 GPUs, on Linux
> >>> (CentOS release 5.6). We have CUDA 4 for the NVIDIA compiler. I am
> >>> running with pmemd.cuda.
> >>>
> >>> My input file is below (although the same file works fine with the
> >>> homologous structure):
> >>> &cntrl
> >>> imin=0,
> >>>
> >>> ntb=0,
> >>> ntx=5,
> >>> irest=1,
> >>>
> >>> ntpr=200,
> >>> ntwr=200,
> >>> ntwx=200,
> >>> ntwe=200,
> >>>
> >>> nstlim=5000000,
> >>> dt=0.002,
> >>>
> >>> ntt=3,
> >>>
> >>> temp0=300,
> >>> tempi=300,
> >>> ig=-1,
> >>> tautp=1,
> >>> gamma_ln=20,
> >>>
> >>> ntp=0,
> >>> pres0=1,
> >>> taup=1,
> >>>
> >>> ntc=2,
> >>> tol=0.00001,
> >>>
> >>> ntf=2,
> >>> ntb=0,
> >>> dielc=1,
> >>> cut=9999,
> >>> rgbmax=12,
> >>> ipol=0,
> >>> ifqnt=0,
> >>> igb=5,
> >>> saltcon=0.15,
> >>> ioutfm=1,
> >>> nscm=100,
> >>> &end
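
For scale, the settings above can be turned into a run length and output counts with some integer arithmetic. This is a minimal sketch, assuming dt=0.002 means 2 fs per step (dt is given in ps); the function name `run_summary` is made up for illustration.

```python
def run_summary(nstlim, dt_fs, ntwr, ntwx):
    """Simulated time and number of restart/trajectory writes for one run."""
    return {
        "simulated_ns": nstlim * dt_fs / 1_000_000,  # fs -> ns
        "restart_writes": nstlim // ntwr,
        "trajectory_frames": nstlim // ntwx,
    }


summary = run_summary(nstlim=5_000_000, dt_fs=2, ntwr=200, ntwx=200)
print(summary)
```

With these values a single run covers 10 ns and rewrites the restart file 25,000 times, so raising ntwr cuts the restart writes proportionally.
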
> >>>
> >>>
> >>> On Fri, Oct 14, 2011 at 1:05 PM, Scott Le Grand <varelse2005.gmail.com>
> >>> wrote:
> >>>
> >>>> What revision of AMBER? What GPU? What OS? What driver? What toolkit
> >>>> did you compile with?
> >>>>
> >>>>
> >>>>
> >>>> On Fri, Oct 14, 2011 at 10:55 AM, E. Nihal Korkmaz
> >>>> <enihalkorkmaz.gmail.com> wrote:
> >>>>
> >>>>> Dear all,
> >>>>>
> >>>>> I keep having a problem where, for one particular protein only, the
> >>>>> simulation "freezes". By freeze I mean that the job looks like it is
> >>>>> running, but no changes are made to the output files even if you wait
> >>>>> 2 days. I am using igb=5 on a GPU. It is a 114-amino-acid protein; I
> >>>>> have the homologous structure (112 amino acids long) running without a
> >>>>> problem. But this specific one stops without being dropped from the
> >>>>> queue and without any error messages at all. I checked the output
> >>>>> files; no '*' or 'NaN' are present. I also tried running on different
> >>>>> machines, and the same thing happens. I tried starting from a different
> >>>>> restart file, and nothing changes. It always freezes, although at
> >>>>> different time steps.
> >>>>>
> >>>>> Has anyone had such a problem before? What could the causes be? I'd
> >>>>> appreciate any comments or suggestions.
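
Since the job stays in the queue and prints no error, one way to catch a freeze early is to watch the modification time of the output file. The sketch below is a generic illustration, not an Amber feature; the helper `is_stalled`, the throwaway demo file, and the 30-minute threshold are all assumptions.

```python
import os
import tempfile
import time


def is_stalled(path, max_idle_seconds, now=None):
    """Return True if `path` was last modified more than max_idle_seconds ago."""
    now = time.time() if now is None else now
    return (now - os.path.getmtime(path)) > max_idle_seconds


# Demonstration on a throwaway file standing in for the real output file.
with tempfile.NamedTemporaryFile(suffix=".out", delete=False) as handle:
    demo_path = handle.name
fresh = is_stalled(demo_path, max_idle_seconds=1800)

# Pretend the last write happened two hours ago.
two_hours_ago = time.time() - 7200
os.utime(demo_path, (two_hours_ago, two_hours_ago))
stale = is_stalled(demo_path, max_idle_seconds=1800)
os.remove(demo_path)

print("freshly written file stalled?", fresh)
print("file untouched for 2 h stalled?", stale)
```

In practice a check like this could run periodically on the node; the threshold should be generous compared with how often ntpr and ntwx cause the output files to be written, since buffering can delay updates.
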
> >>>>>
> >>>>> Thanks,
> >>>>>
> >>>>> --
> >>>>> Elif Nihal Korkmaz
> >>>>>
> >>>>> Research Assistant
> >>>>> University of Wisconsin - Biophysics
> >>>>> Member of Qiang Cui & Thomas Record Labs
> >>>>> 1101 University Ave, Rm. 8359
> >>>>> Madison, WI 53706
> >>>>> Phone: 608-265-3644
> >>>>> Email: korkmaz.wisc.edu
> >>>>> _______________________________________________
> >>>>> AMBER mailing list
> >>>>> AMBER.ambermd.org
> >>>>> http://lists.ambermd.org/mailman/listinfo/amber
Received on Fri Oct 14 2011 - 16:00:02 PDT