Re: [AMBER] experiences with EVGA GTX TITAN Superclocked - memtestG80 - UNDERclocking in Linux ?

From: Scott Le Grand <varelse2005.gmail.com>
Date: Sat, 1 Jun 2013 12:16:29 -0700

All force accumulation is done with 64-bit fixed point integers so their
summation is utterly order-independent - all roundoff happens in a
deterministic manner at the point of type conversion from single-precision
to 64-bit int. Therefore, each simulation with the same starting
conditions on the same hardware will follow the exact same chaotic
trajectory. It's like watching movies: no two movies are alike, but two
showings of the same movie had better be identical.
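
To make the order-independence point concrete, here is a minimal Python
sketch. It is not the actual AMBER GPU code: the scale factor SCALE and the
helper names are assumptions chosen for illustration, and Python's
arbitrary-size ints stand in for a 64-bit accumulator (with a range check).
Each value is rounded once, at conversion to fixed point, and the integer
adds that follow are exact, so any summation order gives the same total.

```python
import random

SCALE = 1 << 40  # hypothetical fixed-point scale, not AMBER's actual value


def to_fixed(x):
    """Round a float to a fixed-point integer; the only place roundoff occurs."""
    return round(x * SCALE)


def accumulate_fixed(forces):
    """Sum forces as integers; integer addition is exact and order-independent."""
    total = 0
    for f in forces:
        total += to_fixed(f)
    # Emulate a 64-bit accumulator by checking that the sum stays in range.
    assert -(1 << 63) <= total < (1 << 63), "fixed-point overflow"
    return total


def accumulate_float(forces):
    """Sum forces as floats; each add rounds, so the order can change the result."""
    total = 0.0
    for f in forces:
        total += f
    return total


forces = [0.1, 0.2, 0.3]
reordered = [0.3, 0.2, 0.1]

# Floating-point sums depend on the order of the adds.
print(accumulate_float(forces) == accumulate_float(reordered))
# Fixed-point sums do not.
print(accumulate_fixed(forces) == accumulate_fixed(reordered))

# The same holds for an arbitrary shuffle of many contributions.
rng = random.Random(0)
many = [rng.uniform(-100.0, 100.0) for _ in range(10000)]
shuffled = many[:]
rng.shuffle(shuffled)
print(accumulate_fixed(many) == accumulate_fixed(shuffled))
```

On a GPU the order in which threads add their contributions varies from run
to run, which is why the float version can differ between runs while the
fixed-point version cannot.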

If it's not, it's some sort of bug... I did this both because I'm of the
belief that reproducibility of experimental results is really important and
because it's handy for finding SW and HW bugs when the appearance of
nondeterministic divergent trajectories is a 100% indicator that something
went wrong.
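
As a rough illustration of using this as a bug detector, the sketch below
compares two runs frame by frame and reports the first step at which they
stop being bit-identical. The data layout (a list of frames, each a list of
floats) and the function names are assumptions made for the example, not an
AMBER file format or tool.

```python
import struct


def bits(x):
    """Return the raw 8-byte pattern of a double, so the comparison is bitwise."""
    return struct.pack("<d", x)


def first_divergence(run_a, run_b):
    """Return the index of the first frame that differs, or None if identical.

    run_a and run_b are sequences of frames; each frame is a sequence of floats
    (for example per-step energies or flattened coordinates).
    """
    for step, (frame_a, frame_b) in enumerate(zip(run_a, run_b)):
        if len(frame_a) != len(frame_b):
            return step
        if any(bits(a) != bits(b) for a, b in zip(frame_a, frame_b)):
            return step
    if len(run_a) != len(run_b):
        # One run stopped early; report the first missing frame.
        return min(len(run_a), len(run_b))
    return None


run_1 = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
run_2 = [[1.0, 2.0], [3.0, 4.000000000000001], [5.0, 6.0]]

print(first_divergence(run_1, run_1))  # identical runs
print(first_divergence(run_1, run_2))  # differs in the second frame
```

With deterministic accumulation, any non-None result for two runs with the
same inputs, seed, and hardware points at a software or hardware fault.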

The only caveat is that in old versions of the program, energy summation
was done partially with doubles in an unpredictable order. This caused
transient differences in the last sig-fig of the energies, but the
trajectory was still identical. This should be fixed in current code.

The only thing that's funny about these tests is how little they diverge.
So I am *hoping* this might be a bug in cuFFT rather than a GTX Titan HW
issue. That would explain why GB simulations are deterministic and PME
simulations aren't, since only PME uses FFTs.

On Sat, Jun 1, 2013 at 12:04 PM, Jan-Philip Gehrcke
<jgehrcke.googlemail.com> wrote:

> On 06/01/2013 08:48 PM, Scott Le Grand wrote:
> > "Also am I right in thinking (from what Scott was saying) that all the
> > benchmarks should be reproducible across 50k steps but begin to diverge at
> > around 100K steps? Is there any difference between setting *ig* to an
> > explicit number and removing it from the mdin file?"
> >
> > They should *never* diverge when running the same code on the same GPU
> > configuration on the same machine unless they use a different random
> > seed...
> >
>
> No divergence after N time steps even for large N? How should that be
> possible for a chaotic system? Are round-off errors deterministic then?
> And if so, from which experience does the crucial limit of 100k steps
> come which you are mentioning throughout this mailing list thread?
>
> Thanks for clarifying,
>
> Jan-Philip
>
>
>
>
> _______________________________________________
> AMBER mailing list
> AMBER.ambermd.org
> http://lists.ambermd.org/mailman/listinfo/amber
>
Received on Sat Jun 01 2013 - 12:30:03 PDT