Re: [AMBER] pmemd cuda MPI and PBS_GPUFILE

From: Scott Le Grand <varelse2005.gmail.com>
Date: Wed, 7 Nov 2012 09:52:20 -0800

Actually, if you set exclusive process mode on all your GPUs, and then use
a scheduler that understands GPUs as well, AMBER should work correctly...

That went into AMBER 12...
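
For reference, and assuming you have admin rights on the nodes, exclusive
process mode can be set per GPU with nvidia-smi, e.g.:

  nvidia-smi -c EXCLUSIVE_PROCESS        # all GPUs on the node
  nvidia-smi -i 0 -c EXCLUSIVE_PROCESS   # or one GPU id at a time

Only one compute process can then hold a context on a given GPU, which is
what lets concurrent pmemd.cuda jobs avoid sharing a device.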




On Wed, Nov 7, 2012 at 9:25 AM, Scott Brozell <sbrozell.rci.rutgers.edu> wrote:

> Hi,
>
> On Tue, Nov 06, 2012 at 10:50:52AM -0500, Jason Swails wrote:
> > On Tue, Nov 6, 2012 at 12:38 AM, Ross Walker <ross.rosswalker.co.uk> wrote:
> > > >http://ambermd.org/gpus/#Running
> > > >" Ideally you would have a batch scheduling system that will set
> > > >everything up for you correctly "
> > > >
> > > >In fact, PBS does just that with its PBS_GPUFILE, e.g.,
> > > >#PBS -l nodes=2:ppn=x:gpus=2
> > > >...
> > > >cat $PBS_GPUFILE
> > > >cat /var/spool/batch/torque/aux//517906.batch.edugpu
> > > >n0659-gpu1
> > > >n0659-gpu0
> > > >n0658-gpu1
> > > >n0658-gpu0
> > > >
> > > >And a reliable PBS source indicates that the PBS_GPUFILE and its
> > > >syntax are stable.
> > > >When will pmemd support PBS_GPUFILE ?
> >
> > In response to Scott, I can't imagine this happening, at least in the
> > foreseeable future. The effort that would be put into learning and
> > implementing the required parts of the PBS API will most likely go into
> > feature development and enhancements instead. IMO, it's the MPIs that
> > should support this, not the CUDA applications themselves. mpiexec and/or
> > mpirun should, when compiled against the existing torque API, be able to
> > discriminate and launch processes strictly on the allocated GPUs. Most
> > (all?) MPIs already have the code to support torque integration, so it
> > seems a simpl*er* task for them, and well worth generalizing above and
> > beyond pmemd.cuda(.MPI).
>
> That seems reasonable, so I'll contact them.
>
>
> > > >Please provide a workaround script that takes a $PBS_GPUFILE and spews
> > > >all the necessary environment variables to run on the specified gpus.
> > >
> > > Volunteers? - Should be pretty simple for some Bash whizz to figure
> > > this out.
> >
> > This is surprisingly not simple to do in general if/when you use GPUs
> > scattered across different nodes. Suppose you have 3 GPUs per node
> > (e.g., Keeneland), and you want to use 8 total GPUs (say, for a REMD job
> > or something). To make things clean, we ask for 4 nodes, 2 GPUs per
> > node, so we are charged only for what we need. PBS_GPUFILE can now point
> > to GPUs 0 and 1 on node 1, GPUs 1 and 2 on node 2, etc., based on any
> > GPUs that may be used
> > already. (We can take this a step further and just ask for any 8 GPUs
> > regardless of the node/GPU #).
> >
> > So you need to be able to set this environment variable on a per-thread
> > basis. As this is unnecessary for CPUs, I don't think this has really
> > been addressed before.
> >
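> > Just to sketch the idea (this is not the UF script below, only a rough,
> > untested illustration; it assumes Open MPI, which exports
> > OMPI_COMM_WORLD_LOCAL_RANK to every rank, that $PBS_GPUFILE is visible
> > on each node, and the <host>-gpu<N> syntax shown earlier in this
> > thread), a per-process wrapper could look something like:
> >
> >   #!/bin/bash
> >   # gpu-bind.sh (hypothetical name): give each local MPI rank its own GPU
> >   host=$(hostname -s)
> >   # GPU ids PBS handed to this host, e.g. "n0659-gpu1" -> "1"
> >   gpus=( $(grep "^${host}-gpu" "$PBS_GPUFILE" | sed 's/^.*-gpu//') )
> >   export CUDA_VISIBLE_DEVICES=${gpus[$OMPI_COMM_WORLD_LOCAL_RANK]}
> >   exec "$@"
> >
> > launched as, e.g.,
> >
> >   mpirun -np 8 ./gpu-bind.sh pmemd.cuda.MPI -O -i mdin -o mdout ...
> >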
> > The staff at the UF HPC has written a script that seems to work correctly
> > (that is, CUDA_VISIBLE_DEVICES is set on a per-process basis so that only
> > the GPUs specified in PBS_GPUFILE are used).
> >
> > The solution is here: http://wiki.hpc.ufl.edu/doc/CUDA#pbsgpu-wrapper (I
> > have attached the pbsgpu-wrapper script they reference in there).
> >
> > Note in many cases this may be overkill. If you are required to request
> > entire nodes and all the GPUs on them (or you do, as general practice),
> > then this is unnecessary (just let the GPUs be chosen by default). If you
> > are running only on a single node, you can parse PBS_GPUFILE directly and
> > set a single CUDA_VISIBLE_DEVICES for all threads.
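> >
> > In that single-node case, something along these lines should be enough
> > (again just a sketch, assuming the <host>-gpu<N> syntax above):
> >
> >   export CUDA_VISIBLE_DEVICES=$(sed 's/^.*-gpu//' "$PBS_GPUFILE" \
> >                                 | sort -n | paste -sd, -)
> >   mpirun -np 2 pmemd.cuda.MPI -O -i mdin -o mdout ...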
>
> Excellent, thanks for the scripts.
> FWIW, I already knew that it was not a trivial problem to write a
> general-purpose, robust script and to handle non-full-node requests.
> But a solution that starts with the PBS_GPUFILE, whether processed by the
> MPI implementation, the pmemd binaries, or helper scripts, is overdue IMO.
>
> The current pmemd approach, with user setup of CUDA_VISIBLE_DEVICES, might
> be OK for home-grown clusters, but it is not ready for prime time.
> I recommend that some polish (author info, etc.) be added to your
> institution's scripts and that they be distributed with Amber.
>
>
> On Tue, Nov 06, 2012 at 04:23:59PM +0000, Jodi Ann Hadden wrote:
> > Just wanted to note that the line
> >
> > #PBS -l nodes=2:ppn=2:gpus=2
> >
> > will not work if job submission incorporates a scheduler that does not
> > understand GPUs as a resource, such as MAUI.
>
> We have a scheduler that understands GPUs, but thanks for your response.
>
> scott
>
>
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Wed Nov 07 2012 - 10:00:02 PST