Re: [AMBER] experiences with EVGA GTX TITAN Superclocked - memtestG80 - UNDERclocking in Linux ?

From: Jason Swails <jason.swails.gmail.com>
Date: Thu, 30 May 2013 15:13:29 -0400

Just a reminder to everyone based on what Ross said: there is a pending
patch to pmemd.cuda that will be coming out shortly (maybe even within
hours). It's entirely possible that several of these errors are fixed by
this patch.

All the best,
Jason


On Thu, May 30, 2013 at 2:46 PM, filip fratev <filipfratev.yahoo.com> wrote:

> I have observed the same crashes from time to time. I will run cellulose
> NVE for 100k steps and will paste the results here.
>
> All the best,
> Filip
>
>
>
>
> ________________________________
> From: Scott Le Grand <varelse2005.gmail.com>
> To: AMBER Mailing List <amber.ambermd.org>
> Sent: Thursday, May 30, 2013 9:01 PM
> Subject: Re: [AMBER] experiences with EVGA GTX TITAN Superclocked -
> memtestG80 - UNDERclocking in Linux ?
>
>
> Run cellulose nve for 100k iterations twice. If the final energies don't
> match, you have a hardware issue. No need to play with ntpr or any other
> variable.
> On May 30, 2013 10:58 AM, <pavel.banas.upol.cz> wrote:
>
> >
> > Dear all,
> >
> > I would also like to share one of my experiences with Titan cards. We have
> > one GTX Titan card, and with one system (~55k atoms, NVT, RNA+waters) we
> > run into the same troubles you are describing. I was also playing with
> > ntpr to figure out what is going on, step by step. I understand that the
> > code uses different routines for calculating energies+forces or only
> > forces. The simulations of other systems are perfectly stable, running
> > for days and weeks. Only that particular system systematically ends up
> > with this error.
> >
> > However, there was one interesting issue. When I set ntpr=1, the error
> > vanished (systematically in multiple runs) and the simulation was able to
> > run for millions of steps (I did not let it run for weeks, as in the
> > meantime I moved that simulation to another card; I need data, not
> > testing). All other settings of ntpr failed. As I read this discussion, I
> > tried setting ene_avg_sampling=1 with a high value of ntpr (I expected
> > this to shift the code to permanently use the force+energies part of the
> > code, similarly to ntpr=1), but the error occurred again.
> >
> > I know it is not very conclusive for finding out what is happening, at
> > least not for me. Do you have any idea why ntpr=1 might help?
> >
> > best regards,
> >
> > Pavel
> >
> >
> >
> >
> >
> > --
> > Pavel Banáš
> > pavel.banas.upol.cz
> > Department of Physical Chemistry,
> > Palacky University Olomouc
> > Czech Republic
> >
> >
> >
> > ---------- Original message ----------
> > From: Jason Swails <jason.swails.gmail.com>
> > Date: 29 May 2013
> > Subject: Re: [AMBER] experiences with EVGA GTX TITAN Superclocked -
> > memtestG80 - UNDERclocking in Linux ?
> >
> > "I'll answer a little bit:
> >
> > > NTPR=10 Etot after 2000 steps
> > >
> > > -443256.6711
> > > -443256.6711
> > >
> > > NTPR=200 Etot after 2000 steps
> > >
> > > -443261.0705
> > > -443261.0705
> > >
> > > Any idea why energies should depend on frequency of energy records
> > > (NTPR)?
> > >
> >
> > It is a subtle point, but the answer is 'different code paths.' In
> > general, it is NEVER necessary to compute the actual energy of a molecule
> > during the course of standard molecular dynamics (by analogy, it is NEVER
> > necessary to compute atomic forces during the course of random Monte
> > Carlo sampling).
> >
> > For performance's sake, then, pmemd.cuda computes only the force when
> > energies are not requested, leading to a different order of operations
> > for those runs. This difference ultimately causes divergence.
> >
> > To test this, try setting the variable ene_avg_sampling=10 in the &cntrl
> > section. This will force pmemd.cuda to compute energies every 10 steps
> > (for energy averaging), which will in turn make the followed code path
> > identical for any multiple-of-10 value of ntpr.
> >
> > --
> > Jason M. Swails
> > Quantum Theory Project,
> > University of Florida
> > Ph.D. Candidate
> > 352-392-4032
> > _______________________________________________
> > AMBER mailing list
> > AMBER.ambermd.org
> > http://lists.ambermd.org/mailman/listinfo/amber"
>
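
The check Scott describes in the quoted text (run the same NVE simulation twice and compare the final energies) could be scripted roughly as below. This is only a sketch. It assumes the output files contain per-step records of the form "Etot   =   -443256.6711" and that the trailing averages section starts with a line containing "A V E R A G E S". The function names final_etot and runs_match are made up for this illustration and are not part of AMBER.

```python
import re

# Assumed record format: " Etot   =   -443256.6711  EKtot   = ..."
ETOT = re.compile(r"\bEtot\s*=\s*(-?\d+(?:\.\d+)?)")


def final_etot(mdout_text):
    """Return the last per-step Etot value, as the string printed in the output.

    Scanning stops at the averages section (assumed marker below) so that
    average and RMS-fluctuation records are not mistaken for the final step.
    """
    last = None
    for line in mdout_text.splitlines():
        if "A V E R A G E S" in line:
            break
        match = ETOT.search(line)
        if match:
            last = match.group(1)
    if last is None:
        raise ValueError("no Etot record found")
    return last


def runs_match(text_a, text_b):
    """True if both runs end on the same printed Etot, digit for digit."""
    return final_etot(text_a) == final_etot(text_b)
```

Comparing the printed strings rather than parsed floats keeps the test strict: two runs of the same input on healthy hardware are expected to agree to every printed digit, so any difference is reported.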

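To illustrate the 'different order of operations' point from the quoted explanation: floating-point addition is not associative, so summing the same terms in a different order can give a different result. The toy example below uses plain Python doubles and made-up numbers. It is not pmemd.cuda code and says nothing about how pmemd.cuda accumulates forces internally.

```python
def sum_in_order(values):
    """Add the values left to right, rounding after every addition."""
    total = 0.0
    for v in values:
        total += v
    return total


# The same four terms, summed in two different orders.
forward = sum_in_order([1.0e16, 1.0, -1.0e16, 1.0])
reordered = sum_in_order([1.0e16, -1.0e16, 1.0, 1.0])

print(forward)    # 1.0: the first 1.0 is lost when added to 1.0e16
print(reordered)  # 2.0: the large terms cancel before the small ones are added
```

In a chaotic system such as an MD trajectory, a discrepancy in the last bits of a force is enough for two otherwise identical runs to drift apart over many steps.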


-- 
Jason M. Swails
Quantum Theory Project,
University of Florida
Ph.D. Candidate
352-392-4032
Received on Thu May 30 2013 - 12:30:02 PDT