Re: [AMBER] pmemd cuda MPI and PBS_GPUFILE from Jason Swails on 2012-11-06 (Amber Archive Nov 2012)

From: Jason Swails <jason.swails.gmail.com>
Date: Tue, 6 Nov 2012 10:50:52 -0500

On Tue, Nov 6, 2012 at 12:38 AM, Ross Walker <ross.rosswalker.co.uk> wrote:

> >Hi,
> >
> >http://ambermd.org/gpus/#Running
> >" Ideally you would have a batch scheduling system that will set
> >everything up for you correctly "
> >
> >In fact, PBS does just that with its PBS_GPUFILE, e.g.,
> >#PBS -l nodes=2:ppn=x:gpus=2
> >...
> >cat $PBS_GPUFILE
> >cat /var/spool/batch/torque/aux//517906.batch.edugpu
> >n0659-gpu1
> >n0659-gpu0
> >n0658-gpu1
> >n0658-gpu0
> >
> >And a reliable PBS source indicates that the PBS_GPUFILE and its syntax
> >are stable.
> >When will pmemd support PBS_GPUFILE ?
>

In response to Scott, I can't imagine this happening, at least in the
foreseeable future. The effort that would be put into learning and
implementing the required parts of the PBS API will most likely go into
feature development and enhancements instead. IMO, it's the MPIs that
should support this, not the CUDA applications themselves. mpiexec and/or
mpirun should, when compiled against the existing torque API, be able to
discriminate and launch processes strictly on the allocated GPUs. Most
(all?) MPIs already have the code to support torque integration, so it
seems a simpl*er* task for them, and well worth generalizing above and
beyond pmemd.cuda(.MPI).

>
> >Please provide a workaround script that takes a $PBS_GPUFILE and spews
> >all the necessary environment variables to run on the specified gpus.
>
> Volunteers? - Should be pretty simple for some Bash whizz to figure this
> out.
>

This is surprisingly not simple to do in general if/when you use GPUs
scattered across different nodes. Suppose you have 3 GPUs per node (e.g.,
Keeneland), and you want to use 8 total GPUs (say, for a REMD job or
something). To make things clean, we ask for 4 nodes, 2 GPUs per node, so
we are charged only for what we need. PBS_GPUFILE may now point to GPUs 0
and 1 on node 1, GPUs 1 and 2 on node 2, etc., depending on which GPUs are
already in use. (We can take this a step further and just ask for any 8 GPUs
regardless of the node/GPU #).

So you need to be able to set CUDA_VISIBLE_DEVICES on a per-process
basis. As this is unnecessary for CPUs, I don't think this has really been
addressed before.

The staff at the UF HPC has written a script that seems to work correctly
(that is, CUDA_VISIBLE_DEVICES is set on a per-process basis so that only
the GPUs specified in PBS_GPUFILE are used).

The solution is here: http://wiki.hpc.ufl.edu/doc/CUDA#pbsgpu-wrapper (I
have attached the pbsgpu-wrapper script they reference in there).
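
For readers who cannot get at the attachment, the general idea can be
sketched as below. This is NOT the UF pbsgpu-wrapper script, just an
illustration under several assumptions: the GPU file lines look like
"<hostname>-gpu<index>", the hostnames in it match `hostname -s`, and the
MPI launcher exports a node-local rank (OMPI_COMM_WORLD_LOCAL_RANK under
Open MPI, MV2_COMM_WORLD_LOCAL_RANK under MVAPICH2). The names pick_gpu and
gpu_wrap.sh are hypothetical.

```shell
#!/bin/sh
# Hypothetical per-process wrapper (not the UF pbsgpu-wrapper script).
# Intended use, as a sketch:  mpirun -np 8 ./gpu_wrap.sh pmemd.cuda.MPI ...

# pick_gpu FILE HOST LOCAL_RANK
# Prints the GPU index that the given node-local rank should use: the
# (LOCAL_RANK+1)-th allocated GPU on HOST, in ascending index order.
pick_gpu() {
    awk -v h="$2" '
        BEGIN { p = h "-gpu" }
        index($0, p) == 1 { print substr($0, length(p) + 1) }
    ' "$1" | sort -n | sed -n "$(( $3 + 1 ))p"
}

# Only act when running inside a job that provides a GPU file and when a
# command to launch was given; otherwise do nothing.
if [ -n "${PBS_GPUFILE:-}" ] && [ "$#" -gt 0 ]; then
    # Assumption: the launcher sets one of these; fall back to rank 0.
    rank=${OMPI_COMM_WORLD_LOCAL_RANK:-${MV2_COMM_WORLD_LOCAL_RANK:-0}}
    gpu=$(pick_gpu "$PBS_GPUFILE" "$(hostname -s)" "$rank")
    if [ -z "$gpu" ]; then
        echo "no allocated GPU for local rank $rank on $(hostname -s)" >&2
        exit 1
    fi
    CUDA_VISIBLE_DEVICES=$gpu
    export CUDA_VISIBLE_DEVICES
    exec "$@"
fi
```

Each process then sees exactly one device. Whether that suits a given
pmemd.cuda.MPI build, or whether each process should instead see the full
per-node list, is something to check on your own machine.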

Note that in many cases this may be overkill. If you are required to request
entire nodes and all GPUs on them (or you do so as general practice), then
this is unnecessary (just let the GPUs be chosen by default). If you are
running only on a single node, you can parse PBS_GPUFILE directly and set a
single CUDA_VISIBLE_DEVICES for all processes.
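
The single-node case could look roughly like this in a job script. It is a
sketch, assuming the "<hostname>-gpu<index>" line format from the listing
quoted above and that the hostnames in the file match `hostname -s`; the
helper name gpus_for_host is made up for illustration.

```shell
#!/bin/sh
# Sketch for a single-node job: build one CUDA_VISIBLE_DEVICES list.
# Assumes each line of the GPU file looks like "<hostname>-gpu<index>".

# gpus_for_host FILE HOST
# Prints the GPU indices allocated on HOST as a sorted, comma-separated list.
gpus_for_host() {
    awk -v h="$2" '
        BEGIN { p = h "-gpu" }
        # Keep lines that start with "<host>-gpu" and print what follows.
        index($0, p) == 1 { print substr($0, length(p) + 1) }
    ' "$1" | sort -n | paste -sd, -
}

# Only set the variable when the scheduler actually provided a GPU file.
if [ -n "${PBS_GPUFILE:-}" ] && [ -r "$PBS_GPUFILE" ]; then
    CUDA_VISIBLE_DEVICES=$(gpus_for_host "$PBS_GPUFILE" "$(hostname -s)")
    export CUDA_VISIBLE_DEVICES
fi
```

Placed before the mpirun line, this gives every process on the node the
same device list.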

All the best,
Jason

-- 
Jason M. Swails
Quantum Theory Project,
University of Florida
Ph.D. Candidate
352-392-4032

_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber

application/octet-stream attachment: pbsgpu-wrapper

Received on Tue Nov 06 2012 - 08:00:03 PST