Re: [AMBER] gpu %utils vs mem used

From: Mark Abraham <mark.j.abraham.gmail.com>
Date: Tue, 25 Jul 2017 18:48:40 +0000

Hi,

On Tue, 25 Jul 2017 16:31 Ross Walker <ross.rosswalker.co.uk> wrote:

> Hi Henk,
>
> If only the laws of physics were so accommodating. ;-)
>
> Unfortunately, just because the GPU reports itself as 75% utilized does not
> mean the other 25% is available for additional computation. The missing 25%
> is lost to inefficiencies: waiting on memory accesses, waiting for other
> cores to finish computing, and so on. If the speed of light were infinite
> one could swap tasks (and memory) in and out instantly and do what you
> suggest. Unfortunately the speed of light is far too slow, so the overhead
> of task switching swamps anything you might stand to gain by attempting
> this.
>
> In terms of sizing, for AMBER it's super simple, since we designed it to
> run everything on the GPU and avoid the headache of having to match CPU
> cores etc. You simply run one calculation per GPU. It is designed so that
> GPUs don't interfere with each other, so for cost effectiveness one should
> just max out the GPUs per node (say 8 GPUs in a dual-socket node; 1080 Tis
> are ideal) and then run 8 jobs on that single node. You can buy CPUs with
> low core counts and clock speeds to save money, since you only need one
> CPU core per GPU.
>
> For Lammps and Gromacs it's way more complicated (especially Gromacs)
> since they try to use the CPUs at the same time - this leads to
> inefficiencies in utilizing the GPUs so you max out at about 2 GPUs per
> node.


Indeed.
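
As a concrete illustration of the one-job-per-GPU pattern Ross describes,
here is a minimal hand-rolled sketch for an 8-GPU node (the directory and
input file names are just placeholders):

  # one single-GPU AMBER run per device; each run uses a single CPU core
  for gpu in 0 1 2 3 4 5 6 7; do
    ( cd run_$gpu && \
      CUDA_VISIBLE_DEVICES=$gpu pmemd.cuda -O -i md.in -p prmtop -c inpcrd \
        -o md.out -r md.rst -x md.nc ) &
  done
  wait

A scheduler that hands each job exactly one GPU achieves the same layout
without the shell loop.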

> You also need to buy high-clock-speed CPUs, which carry a massive price
> premium, so the effective performance per dollar is much lower with
> Gromacs. If you have to run both AMBER and Gromacs together, your best
> bet is probably either to accept that a chunk of each node will sit idle
> while Gromacs runs, or, if you have say 8 GPUs and 40 cores in a node, to
> give a single Gromacs job 32 cores and 2 GPUs and also run 6 single-GPU
> AMBER jobs on the same node. It needs some complicated scheduler
> configuration to work, but it's possible.


Indeed.
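
A rough sketch of that split for a hypothetical 40-core, 8-GPU node, run by
hand rather than through a scheduler (the file names, the exact core
assignments and the 2016-era Gromacs options are assumptions, and
hyperthreading is assumed to be off):

  # Gromacs: 2 ranks x 16 OpenMP threads pinned to cores 0-31, GPUs 0 and 1
  gmx mdrun -deffnm md -ntmpi 2 -ntomp 16 -gpu_id 01 \
      -pin on -pinoffset 0 -pinstride 1 &

  # six single-GPU AMBER jobs on the remaining GPUs, one core each (32-37)
  for gpu in 2 3 4 5 6 7; do
    ( cd run_$gpu && \
      CUDA_VISIBLE_DEVICES=$gpu taskset -c $((30 + gpu)) \
        pmemd.cuda -O -i md.in -p prmtop -c inpcrd \
        -o md.out -r md.rst -x md.nc ) &
  done
  wait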

> Note that since AMBER sits entirely on the GPU, you can run multiple jobs
> on a node without contention (one per GPU). This is not true with
> Gromacs, because all of the CPU-to-CPU and CPU-to-GPU communication
> floods the channels between the CPU cores and the PCI-E bus to the GPU.
> As such you can't reliably run, say, 2 Gromacs jobs on the same node
> where one uses 20 cores and 2 GPUs and the other uses the remaining 20
> cores and 2 GPUs.
>

Uh, no. These run fine (i.e. *reliably*) and, when set up properly, will
naturally run out of phase with each other and maximise throughput. See e.g.
http://onlinelibrary.wiley.com/doi/10.1002/jcc.24030/abstract;jsessionid=3CD2B4EE326378381D60FCB0BD1B26A0.f02t02
(or the same paper on arXiv).
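
For reference, the kind of setup that makes this work on the 40-core,
4-GPU node Ross describes (a sketch using 2016-era mdrun options; it
assumes hyperthreading is off and that each job's cores sit on the socket
closest to its GPUs):

  # job A: 20 threads pinned to cores 0-19, GPUs 0 and 1
  gmx mdrun -deffnm jobA -ntmpi 2 -ntomp 10 -gpu_id 01 \
      -pin on -pinoffset 0 -pinstride 1 &
  # job B: 20 threads pinned to cores 20-39, GPUs 2 and 3
  gmx mdrun -deffnm jobB -ntmpi 2 -ntomp 10 -gpu_id 23 \
      -pin on -pinoffset 20 -pinstride 1 &
  wait

The explicit pinning and per-job GPU assignment are what keep the two runs
from competing for the same cores and devices.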

Mark

> Hope that helps. Unfortunately the conflicting code designs mean there is
> no ideal config for AMBER, Lammps and Gromacs. :-(
>
> All the best
> Ross
>
> > On Jul 25, 2017, at 9:06 AM, Meij, Henk <hmeij.wesleyan.edu> wrote:
> >
> > Indeed that helps Ross. My thought experiment went along the lines of:
> if a gpu is 75% utilized and you have two of them then half a "virtual" gpu
> is idle with two gpus, 5 idle gpus in 20, and so on. If that pattern
> persists into the 200+ range and onwards that's a lot of resources. If I
> could provide virtual gpus and size them to simulation requirements that
> would be ideal. Or buy gpus more fitting to our regular type jobs, but that
> is a difficult target.
> >
> >
> > Gets really complicated with gromacs where multiple mpi ranks can share
> the gpu or multiple gpus. Any pointers to how to best size your gpu
> environment to software requirements appreciated; we run mostly amber,
> lammps and gromacs.
> >
> >
> > -Henk
> >
> > ________________________________
> > From: Ross Walker <ross.rosswalker.co.uk>
> > Sent: Monday, July 24, 2017 3:16:09 PM
> > To: AMBER Mailing List
> > Subject: Re: [AMBER] gpu %utils vs mem used
> >
> > Hi Henk,
> >
> > Why would you assume that it would make sense to run more than one job
> > on a single GPU? The AMBER code (and pretty much every other GPU code)
> > is designed to use as much of a GPU as possible. Sure, you can run 2
> > jobs on the same GPU, but each will end up running at half the speed or
> > less due to contention. The memory consideration is largely unrelated
> > to performance. Memory usage, for AMBER, is a function of the size of
> > the simulation you are running and, to a lesser extent, the choice of
> > simulation options (NVT vs NPT, thermostat, etc.). The ratio of floating
> > point operations to bytes is high in AMBER: each atom takes around 72
> > bytes to store its coordinates, forces and velocities (so even a
> > 100,000-atom system needs only about 100,000 x 72 bytes, roughly 7 MB,
> > for those arrays), yet each atom is involved in a huge number of
> > interactions (bonds, angles, dihedrals, pairwise electrostatic and van
> > der Waals interactions, and all the FFT framework making up the PME
> > reciprocal space). The net result is that it is perfectly reasonable
> > for a small simulation using a couple of hundred MB of memory to max
> > out the compute units on the GPU itself.
> >
> > Hope that helps,
> >
> > All the best
> > Ross
> >
> >
> >> On Jul 24, 2017, at 2:20 PM, Meij, Henk <hmeij.wesleyan.edu> wrote:
> >>
> >> Hi All, this is not a pure Amber question (I observe the same with my
> >> Lammps users), but I figured there may be GPU expertise on this list
> >> that can give me some insights.
> >>
> >>
> >> My K20 environment is running with exclusive/persistent mode enabled.
> >> Looking at the size of the jobs, I was wondering about going the
> >> disabled route and pushing more jobs through.
> >>
> >>
> >> But how/why do these tiny jobs each push GPU %util above 70% while
> >> consuming so little memory? If that's real, then the GPU can only
> >> handle one such job at a time?
> >>
> >>
> >> -Henk
> >>
> >>
> >> Mon Jul 24 13:51:26 2017
> >> +------------------------------------------------------+
> >> | NVIDIA-SMI 4.304.54   Driver Version: 304.54         |
> >> |-------------------------------+----------------------+----------------------+
> >> | GPU  Name                     | Bus-Id        Disp.  | Volatile Uncorr. ECC |
> >> | Fan  Temp  Perf  Pwr:Usage/Cap| Memory-Usage         | GPU-Util  Compute M. |
> >> |===============================+======================+======================|
> >> |   0  Tesla K20m               | 0000:02:00.0     Off |                    0 |
> >> | N/A   40C    P0    98W / 225W |   4%  205MB / 4799MB |     77%   E. Process |
> >> +-------------------------------+----------------------+----------------------+
> >> |   1  Tesla K20m               | 0000:03:00.0     Off |                    0 |
> >> | N/A   41C    P0   106W / 225W |   5%  253MB / 4799MB |     72%   E. Process |
> >> +-------------------------------+----------------------+----------------------+
> >> |   2  Tesla K20m               | 0000:83:00.0     Off |                    0 |
> >> | N/A   26C    P8    16W / 225W |   0%   13MB / 4799MB |      0%   E. Process |
> >> +-------------------------------+----------------------+----------------------+
> >> |   3  Tesla K20m               | 0000:84:00.0     Off |                    0 |
> >> | N/A   27C    P8    15W / 225W |   0%   13MB / 4799MB |      0%   E. Process |
> >> +-------------------------------+----------------------+----------------------+
> >>
> >> +-----------------------------------------------------------------------------+
> >> | Compute processes:                                               GPU Memory |
> >> |  GPU       PID  Process name                                     Usage      |
> >> |=============================================================================|
> >> |    0     16997  pmemd.cuda.MPI                                        190MB |
> >> |    1     16998  pmemd.cuda.MPI                                        238MB |
> >> +-----------------------------------------------------------------------------+
> >>
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Tue Jul 25 2017 - 12:00:03 PDT