Re: [AMBER] experiences with EVGA GTX TITAN Superclocked - memtestG80 - UNDERclocking in Linux ?

From: Scott Le Grand <varelse2005.gmail.com>
Date: Thu, 30 May 2013 11:01:58 -0700

Run cellulose NVE for 100k iterations twice. If the final energies don't
match, you have a hardware issue. No need to play with ntpr or any other
variable.
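
The comparison of final energies can be scripted. The following is a minimal
sketch, not part of AMBER: it assumes that each mdout file prints energy
records with a field of the form "Etot   =   -443256.6711" and that the
end-of-run summary begins with a banner containing "A V E R A G E S". The
function names final_etot and runs_match are hypothetical.

```python
import re

# Assumed mdout layout: energy records contain a field such as
# "Etot   =   -443256.6711"; the end-of-run summary starts with a
# banner line containing "A V E R A G E S".
ETOT_RE = re.compile(r"\bEtot\s*=\s*(-?\d+\.\d+)")
AVERAGES_BANNER = "A V E R A G E S"


def final_etot(mdout_text):
    """Return the last Etot printed before the averages summary, or None."""
    last = None
    for line in mdout_text.splitlines():
        if AVERAGES_BANNER in line:
            break  # skip the averages and fluctuations blocks
        match = ETOT_RE.search(line)
        if match:
            last = float(match.group(1))
    return last


def runs_match(mdout_a, mdout_b):
    """True only if both runs report exactly the same final Etot."""
    a = final_etot(mdout_a)
    b = final_etot(mdout_b)
    return a is not None and a == b
```

For example, with the two outputs saved under the hypothetical names run1.out
and run2.out, runs_match(open("run1.out").read(), open("run2.out").read())
returning False would indicate the mismatch described above.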
On May 30, 2013 10:58 AM, <pavel.banas.upol.cz> wrote:

>
> Dear all,
>
> I would also like to share one of my experiences with Titan cards. We have
> one GTX Titan card, and with one system (~55k atoms, NVT, RNA+waters) we ran
> into the same troubles you are describing. I was also playing with ntpr to
> figure out what is going on, step by step. I understand that the code uses
> different routines for calculating energies+forces versus forces only.
> Simulations of other systems are perfectly stable, running for days and
> weeks. Only that particular system systematically ends up with this error.
>
> However, there was one interesting observation. When I set ntpr=1, the error
> vanished (systematically, in multiple runs) and the simulation was able to
> run for millions of steps (I did not let it run for weeks, as in the
> meantime I moved that simulation to another card; I need data, not
> testing). All other settings of ntpr failed. After reading this discussion, I
> tried setting ene_avg_sampling=1 with a high value of ntpr (I expected
> this to make the code permanently use the forces+energies path,
> similarly to ntpr=1), but the error occurred again.
>
> I know this is not very conclusive for finding out what is happening, at
> least not for me. Do you have any idea why ntpr=1 might help?
>
> best regards,
>
> Pavel
>
>
>
>
>
> --
> Pavel Banáš
> pavel.banas.upol.cz
> Department of Physical Chemistry,
> Palacky University Olomouc
> Czech Republic
>
>
>
> ---------- Original message ----------
> From: Jason Swails <jason.swails.gmail.com>
> Date: 29 May 2013
> Subject: Re: [AMBER] experiences with EVGA GTX TITAN Superclocked -
> memtestG80 - UNDERclocking in Linux ?
>
> "I'll answer a little bit:
>
> > NTPR=10 Etot after 2000 steps
> >
> > -443256.6711
> > -443256.6711
> >
> > NTPR=200 Etot after 2000 steps
> >
> > -443261.0705
> > -443261.0705
> >
> > Any idea why energies should depend on the frequency of energy records
> > (NTPR)?
> >
>
> It is a subtle point, but the answer is 'different code paths.' In
> general, it is NEVER necessary to compute the actual energy of a molecule
> during the course of standard molecular dynamics (by analogy, it is NEVER
> necessary to compute atomic forces during the course of random Monte Carlo
> sampling).
>
> For performance's sake, then, pmemd.cuda computes only the force when
> energies are not requested, leading to a different order of operations for
> those runs. This difference ultimately causes divergence.
>
> To test this, try setting the variable ene_avg_sampling=10 in the &cntrl
> section. This will force pmemd.cuda to compute energies every 10 steps
> (for energy averaging), which will in turn make the followed code path
> identical for any multiple-of-10 value of ntpr.
>
> --
> Jason M. Swails
> Quantum Theory Project,
> University of Florida
> Ph.D. Candidate
> 352-392-4032
> _______________________________________________
> AMBER mailing list
> AMBER.ambermd.org
> http://lists.ambermd.org/mailman/listinfo/amber"
>
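
The point quoted above about a "different order of operations" can be
illustrated generically: floating-point addition is not associative, so
summing the same terms in a different grouping can change the last bits of
the result. The sketch below is only an illustration in plain Python, not
pmemd.cuda code, and the values are arbitrary.

```python
# Generic illustration only; this is not pmemd.cuda code.
a, b, c = 0.1, 0.2, 0.3

left_to_right = (a + b) + c  # one order of operations
right_to_left = a + (b + c)  # same terms, different grouping

# The two sums are not bit-identical, so they do not compare equal.
sums_differ = left_to_right != right_to_left
print(left_to_right, right_to_left, sums_differ)
```

In a long MD trajectory, differences of this size in the forces are enough
for two otherwise identical runs to drift apart over many steps.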
Received on Thu May 30 2013 - 11:30:02 PDT