Re: [AMBER] Protocol for multiple CPU + single GPU run on a single node

From: Jason Swails <jason.swails.gmail.com>
Date: Wed, 14 May 2014 13:34:53 -0400

On Wed, 2014-05-14 at 20:13 +0300, MURAT OZTURK wrote:
> To clarify, pmemd.cuda.MPI is only there to facilitate multi-GPU runs when
> the GPUs are on different nodes, then?

No, it is for multi-GPU runs in general. In fact, if you use 2 GPUs on a
single node that share a PCIe bus, pmemd.cuda.MPI will have them
communicate via peer-to-peer transfers, which dramatically improves
scaling. MPI is nice in that it is distributed, so it _can_ work with
GPUs spread across different nodes, but they don't have to be on
different nodes.

> This is very different from GROMACS, where I can do multi-CPU + multi-GPU.
> I wonder how the performance will compare.

It depends, I think. The advantage of doing everything on-card is that
you never have to communicate between the GPU and the CPU except on the
time steps when you write out trajectories, and such communication is
quite costly compared to how fast GPUs are. There is also no load
balancing to tune (i.e., no need to figure out the "best" ratio of CPUs
to GPUs for performance) -- everything is done on the GPU, so whenever a
faster GPU comes out, your simulation gets correspondingly faster.
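
This is also why infrequent output helps GPU throughput: each write
forces a copy back to the host. A minimal &cntrl sketch (the values are
only illustrative, not a recommendation for your system):

    &cntrl
      ...
      ntpr=5000,   ! print energies every 5000 steps
      ntwx=5000,   ! write coordinates every 5000 steps
                   ! (each write requires a GPU -> CPU copy)
    /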

OpenMM takes a similar approach with its GPU-accelerated platforms
(OpenCL and CUDA), although it has a plugin that runs a threaded,
single-precision version of the PME reciprocal-space calculation on the
CPU to squeeze a little extra performance out of these heterogeneous
systems if you want.

HTH,
Jason

-- 
Jason M. Swails
Postdoctoral Researcher
BioMaPS, Rutgers University
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber