Re: [AMBER] GPU vs CPU test

From: Scott Le Grand <SLeGrand.nvidia.com>
Date: Thu, 27 Jan 2011 13:19:38 -0800

It's also a good way to detect race conditions :-)...


-----Original Message-----
From: Ross Walker [mailto:ross.rosswalker.co.uk]
Sent: Thursday, January 27, 2011 08:46
To: 'AMBER Mailing List'
Subject: Re: [AMBER] GPU vs CPU test

Hello,

Just to add some extra information to Jason's already excellent description
of the situation:

1) The GPU implementation uses a different random number generator from the
CPU version. Hence any simulation that uses it (such as an ntt=3 run) will
diverge from the CPU version immediately. The critical point is whether the
ensemble or average behavior comes out the same. In statistical-mechanics
language, do the two simulations ultimately sample the same partition
function?
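
A minimal sketch of that kind of ensemble-level check (plain Python; the
mdout file names are just placeholders, and the regex picks up the standard
"EPtot = ..." energy records, so trim off the final A V E R A G E S and
R M S blocks of each mdout if you want a strictly per-step comparison):

import re, statistics

def eptot_series(path):
    """Collect the EPtot values printed in an Amber mdout file."""
    pat = re.compile(r"EPtot\s+=\s+(-?\d+\.\d+)")
    with open(path) as fh:
        return [float(m.group(1)) for m in map(pat.search, fh) if m]

for name, path in (("CPU", "mdout.cpu"), ("GPU", "mdout.gpu")):
    v = eptot_series(path)
    print("%s: mean EPtot = %.2f, stdev = %.2f"
          % (name, statistics.mean(v), statistics.stdev(v)))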

2) While two identical parallel runs of the CPU version of the code (e.g.
both on 4 CPUs) will diverge from each other because of the load balancer,
the GPU code is, at present, designed to be completely deterministic. That
is, if you run the EXACT same simulation on the EXACT same hardware you
should get the EXACT same trajectory. In some ways this is a good way to
test whether your GPU is acting flaky due to overclocking, overheating, etc.
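
Because of that determinism, a quick hardware sanity check (the file names
below are placeholders; use a fixed ig so both runs see the same random
seed) is to run the exact same pmemd.cuda job twice on the same card and
confirm the trajectory files are bit-for-bit identical:

import hashlib

def file_md5(path):
    """MD5 checksum of a file, read in 1 MB chunks."""
    h = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

# e.g. after running:
#   pmemd.cuda -O -i md.in -c inpcrd -p prmtop -x run1.mdcrd -o run1.out
#   pmemd.cuda -O -i md.in -c inpcrd -p prmtop -x run2.mdcrd -o run2.out
print("bit-for-bit identical:",
      file_md5("run1.mdcrd") == file_md5("run2.mdcrd"))

Any mismatch on healthy inputs points at the hardware (or an overclock)
rather than the science.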

All the best
Ross

> -----Original Message-----
> From: Jason Swails [mailto:jason.swails.gmail.com]
> Sent: Thursday, January 27, 2011 8:14 AM
> To: AMBER Mailing List
> Subject: Re: [AMBER] GPU vs CPU test
>
> Hello,
>
> What you're seeing is not surprising. Protein systems are chaotic, such
> that even tiny changes in floating point values can cause divergent
> trajectories over very short periods of time. At the most basic level,
> the fact that machine precision is not infinite will give rise to
> rounding errors sufficient to cause this.
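>
> As a toy illustration of how quickly that divergence sets in (plain
> Python, not AMBER code; the map and its parameters are arbitrary), two
> copies of a chaotic map started a rounding error apart separate to
> order-one differences within a few dozen iterations:
>
> # Two copies of a chaotic logistic map started 1e-12 apart, a stand-in
> # for two trajectories that differ by a single rounding error.
> x, y = 0.4, 0.4 + 1e-12
> for step in range(1, 61):
>     x = 3.9 * x * (1.0 - x)
>     y = 3.9 * y * (1.0 - y)
>     if step % 10 == 0:
>         print("step %3d   |x - y| = %.3e" % (step, abs(x - y)))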
>
> There is a lot more contributing to the divergence that you are seeing on
> top of the machine precision I already mentioned. First of all, the
> default precision used by pmemd.cuda(.MPI) is a hybrid single
> precision/double precision (SPDP) model, which uses double precision for
> the more sensitive quantities that require it and single precision for
> everything else. This will cause divergence almost immediately, since a
> single precision real differs from its double precision counterpart
> unless you happen to have a number that is exactly representable in
> binary within the number of significant digits available in single
> precision (vanishingly rare for non-integers, I believe).
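>
> You can see the size of that representation gap directly (a quick Python
> illustration; numpy is used only for its 32-bit float type):
>
> import numpy as np
>
> x64 = 1.0 / 3.0        # an arbitrary coordinate-like value, double precision
> x32 = np.float32(x64)  # the same value stored in single precision
> print("double    :", repr(x64))
> print("single    :", repr(x32))
> print("difference:", float(x32) - x64)   # on the order of 1e-8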
>
> To make this situation even worse (in terms of long-timescale
> reproducibility), the CPU version of pmemd uses dynamic load-balancing.
> That is to say, the load-balancer learns, and the workload is
> redistributed periodically based on calculated workloads, which amplifies
> the rounding errors. To see a demonstration, try running your simulation
> with 2 CPUs, 4 CPUs, and 8 CPUs (keeping all inputs, random seeds, etc.
> exactly the same) and you will see the trajectories diverge.
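>
> A minimal sketch of that comparison (the mdout names are placeholders for
> your 2-, 4- and 8-CPU outputs; the regex picks up the standard
> "Etot = ..." energy records, including the final averages block, which is
> fine for spotting where the runs first part ways):
>
> import re
>
> def etot_series(path):
>     """Pull the Etot values out of an Amber mdout file."""
>     pat = re.compile(r"Etot\s+=\s+(-?\d+\.\d+)")
>     with open(path) as fh:
>         return [float(m.group(1)) for m in map(pat.search, fh) if m]
>
> runs = {n: etot_series("mdout.%dcpu" % n) for n in (2, 4, 8)}
> for n in (4, 8):
>     for i, (a, b) in enumerate(zip(runs[2], runs[n])):
>         if abs(a - b) > 1.0e-4:
>             print("2-CPU and %d-CPU runs first differ at energy record %d"
>                   " (%.4f vs %.4f)" % (n, i, a, b))
>             break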
>
> I hope this helps clarify things. One thing I do want to note: make sure
> you've applied all Amber11 bug fixes (there are 12 of them), since they
> address a number of important issues.
>
> All the best,
> Jason
>
> On Thu, Jan 27, 2011 at 10:06 AM, Massimiliano Porrini
> <M.Porrini.ed.ac.uk>wrote:
>
> > Dear all,
> >
> > I had the opportunity to run Amber11 across 2 Tesla C2050 GPUs and,
> > in order to check the accuracy of the simulation, I ran exactly the
> > same simulation on 4 CPUs, using the same Langevin random seed (ig)
> > that the GPU run generated.
> >
> > Below is the input file I used for my system (1561 atoms):
> >
> > &cntrl
> > imin = 0, irest = 1, ntx = 5,
> > ntb = 0,
> > igb = 5,
> > cut = 999.0,
> > temp0 = 343.0,
> > ntt = 3, gamma_ln = 1.0, ig = -1,
> > ntc = 2, ntf = 2,
> > nstlim = 500000000, dt = 0.002,
> > ntpr = 5000, ntwx = 1000, ntwr = 5000,
> > /
> >
> > For the CPU run I used ig = 857210 .
> >
> > I have also attached a graph with the RMSD values and a breakdown of the
> > energies calculated for both the GPU and CPU runs.
> >
> > Since I used the same random seed for the Langevin dynamics,
> > should I expect exactly the same behavior of RMSD and energies?
> >
> > Or do the values in the graph compare well anyway, so that I am on the
> > safe side with regard to the accuracy of my GPU simulation?
> > If so, I would guess Amber has another source of irreproducibility.
> >
> > Thanks in advance.
> >
> > All the best,
> > MP
> >
> > PS: I hope the graph is understandable.
> >
> >
> > --
> > Dr. Massimiliano Porrini
> > Institute for Condensed Matter and Complex Systems
> > School of Physics & Astronomy
> > The University of Edinburgh
> > James Clerk Maxwell Building
> > The King's Buildings
> > Mayfield Road
> > Edinburgh EH9 3JZ
> >
> > Tel +44-(0)131-650-5229
> >
> > E-mails : M.Porrini.ed.ac.uk
> > mozz76.gmail.com
> > maxp.iesl.forth.gr
> >
> >
> >
>
>
> --
> Jason M. Swails
> Quantum Theory Project,
> University of Florida
> Ph.D. Graduate Student
> 352-392-4032



_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Thu Jan 27 2011 - 13:30:03 PST