Hi Scott,
Sounds fine. The network case is different: sometimes the hardware is heterogeneous, but more often the nodes respond at different rates because of network loading and other subtle issues. You hit all sorts of indeterminacy, a bit like with an OS, but none of it is pathological - it is just order of completion, all still computationally correct, but it does affect the rounding error.
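To make the rounding point concrete, here is a toy CUDA sketch (illustration only, nothing from pmemd): the same numbers accumulated in a different order give a slightly different single-precision total, and an atomic accumulation's order depends on whichever blocks happen to finish first.

#include <cstdio>
#include <cuda_runtime.h>

// Accumulate into a single float in whatever order blocks complete.
__global__ void atomic_sum(const float *x, int n, float *out)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        atomicAdd(out, x[i]);   // completion order decides the rounding
}

int main()
{
    const int n = 1 << 20;
    float *x, *gpu_sum;
    cudaMallocManaged(&x, n * sizeof(float));
    cudaMallocManaged(&gpu_sum, sizeof(float));
    for (int i = 0; i < n; ++i)
        x[i] = 1.0f / (1.0f + (float)(i % 997));   // arbitrary test values

    *gpu_sum = 0.0f;
    atomic_sum<<<(n + 255) / 256, 256>>>(x, n, gpu_sum);
    cudaDeviceSynchronize();

    float cpu_sum = 0.0f;                 // one fixed, deterministic order
    for (int i = 0; i < n; ++i)
        cpu_sum += x[i];

    printf("atomic GPU sum: %.8f\n", *gpu_sum);
    printf("serial CPU sum: %.8f\n", cpu_sum);
    cudaFree(x);
    cudaFree(gpu_sum);
    return 0;
}

Both totals are equally valid answers; they just differ in the last bits, and in MD those last bits are what eventually makes two trajectories drift apart.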
- Bob
________________________________________
From: Scott Le Grand [varelse2005.gmail.com]
Sent: Wednesday, September 19, 2012 2:38 PM
To: AMBER Mailing List
Subject: Re: [AMBER] Loading across GPUs
I don't disagree with your methods at all, and I'm sure you got a lot of
benefit from them, but GPUs are qualitatively different beasts from CPUs in
that they are internally massively parallel, with each of them equal to ~50
modern CPU cores.
Because of this, there is a great deal of deterministic dynamic load-balancing
already going on internally within each GPU, which leaves only table scraps
for further load-balancing between GPUs.
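To picture what I mean, here is a toy sketch (plain CUDA, not pmemd code) of the pattern: queue far more independent blocks than there are SMs, let the hardware hand the next block to whichever SM frees up first - that is the dynamic balancing - and have each block write only its own output slot, so the result is bit-for-bit identical no matter how that balancing plays out.

#include <cstdio>
#include <cuda_runtime.h>

// Each block reduces its own chunk into its own output slot, so the order
// in which blocks finish cannot change the result.
__global__ void per_block_work(const float *in, float *out, int chunk)
{
    __shared__ float s[256];
    int base = blockIdx.x * chunk;
    float acc = 0.0f;
    for (int i = threadIdx.x; i < chunk; i += blockDim.x)
        acc += in[base + i];
    s[threadIdx.x] = acc;
    __syncthreads();
    for (int stride = blockDim.x / 2; stride > 0; stride >>= 1) {
        if (threadIdx.x < stride)
            s[threadIdx.x] += s[threadIdx.x + stride];
        __syncthreads();
    }
    if (threadIdx.x == 0)
        out[blockIdx.x] = s[0];   // fixed slot: finish order does not matter
}

int main()
{
    cudaDeviceProp prop;
    cudaGetDeviceProperties(&prop, 0);

    const int chunk = 256, blocks = 4096, n = chunk * blocks;
    float *in, *out;
    cudaMallocManaged(&in, n * sizeof(float));
    cudaMallocManaged(&out, blocks * sizeof(float));
    for (int i = 0; i < n; ++i)
        in[i] = 1.0f / (1.0f + (float)(i % 997));

    // Far more blocks than SMs: the hardware block scheduler balances them.
    printf("%d blocks queued across %d SMs\n", blocks, prop.multiProcessorCount);
    per_block_work<<<blocks, 256>>>(in, out, chunk);
    cudaDeviceSynchronize();

    float total = 0.0f;
    for (int b = 0; b < blocks; ++b)   // fixed combination order on the host
        total += out[b];
    printf("total = %.8f (identical from run to run)\n", total);

    cudaFree(in);
    cudaFree(out);
    return 0;
}

Once the work is already being spread across tens of thousands of threads like this inside each card, there is not much imbalance left for the host to fix between cards.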
The only scenario where this would be of significant benefit is when GPUs
with significantly different performance are mixed together in a parallel
run. But given that this seems to be a corner case and I only have limited
cycles for adding AMBER functionality, it's of very low priority.
Scott
On Wed, Sep 19, 2012 at 12:03 PM, Duke, Robert E Jr <rduke.email.unc.edu> wrote:
> Another comment - the way I tested this stuff was to include code that
> forced a very rapid rebalance (for testing only). That way, I could see the
> balancing effects within the time window in which the dynamics has not yet
> drifted due to fp rounding error.
> - Bob
> ________________________________________
> From: Scott Le Grand [varelse2005.gmail.com]
> Sent: Wednesday, September 19, 2012 11:18 AM
> To: AMBER Mailing List
> Subject: Re: [AMBER] Loading across GPUs
>
> And PS anything but static load balancing would lead to nondeterministic
> execution and that's a dealbreaker for my inner software engineer...
> On Sep 19, 2012 5:50 AM, "Jason Swails" <jason.swails.gmail.com> wrote:
>
> >
> >
> > On Sep 19, 2012, at 5:57 AM, Adam Jion <adamjion.yahoo.com> wrote:
> >
> > > Hi,
> > >
> > > Is it possible to control the computational loading across the GPUs
> > > for a multi-GPU run of Amber? That is, I want GPU 1 to do 75% of the
> > > computation whilst GPU 2 does 25% of the computation.
> >
> > No. The only reason I could imagine wanting to try this is if the 2 GPUs
> > are different and one is slower.
> >
> > If this is the case, you basically have to use them for different
> > simulations. For parallel simulations, because of the serial nature of
> > molecular dynamics, a job split across both GPUs could easily be slower
> > than one run on the fastest GPU alone (since it will always be waiting
> > for the slowest one to finish).
> >
> > HTH,
> > Jason
> >
> > --
> > Jason M. Swails
> > Quantum Theory Project,
> > University of Florida
> > Ph.D. Candidate
> > 352-392-4032
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber