Re: [AMBER] pmemd cuda MPI and PBS_GPUFILE

From: Jodi Ann Hadden <jodih.uga.edu>
Date: Tue, 6 Nov 2012 16:23:59 +0000

Just wanted to note that the line

#PBS -l nodes=2:ppn=2:gpus=2

will not work if job submission goes through a scheduler that does not understand GPUs as a resource, such as MAUI. We ran into this problem last year when we were setting up our GPU queuing systems. While Torque/PBS understands the gpus=2 bit, MAUI does not, so the job will just remain queued because MAUI doesn't realize it has this resource available.

Without an explicit GPU reservation request, $PBS_GPUFILE is empty, and you have to find some other way to handle GPU assignment in order to ensure jobs are not sent to a GPU that's already busy.

What we ended up doing to get around this was to make resource requests based on CPUs, since the GPU:CPU ratio in AMBER is 1:1:

#PBS -l nodes=2:ppn=2

And then, in our job submission script, we assign CUDA_VISIBLE_DEVICES to the "available" GPUs on that node, i.e. those that are not currently running a job. We do this with:

export CUDA_VISIBLE_DEVICES=`gpu-free`

Where gpu-free is a bash script I wrote, made available at http://ambermd.org/gpus/#Running
It calls nvidia-smi to determine which GPUs are showing utilization, and only adds those that aren't to the list of free GPU IDs it returns. We set our GPUs to compute mode exclusive_process as a failsafe for this method, to ensure jobs never stack on the same GPU. The only problem is that if you don't wait ~10 seconds between job submissions, gpu-free is inaccurate on the second call, since the first job's GPU hasn't had time to register utilization in nvidia-smi. The second job then tries to go to the same GPU as the first, but exclusive_process mode means it will crash, luckily with a distinct error message, so users know what the problem was and that they should simply resubmit.
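
For illustration, a minimal sketch of the idea looks something like the following. This is not the actual gpu-free script from the link above, and it assumes an nvidia-smi new enough to support the --query-gpu interface:

#!/bin/bash
# Sketch of a gpu-free style helper (not the script from
# http://ambermd.org/gpus/#Running). Prints a comma-separated list of
# GPU IDs that currently show 0% utilization, suitable for
# CUDA_VISIBLE_DEVICES. Assumes nvidia-smi supports --query-gpu.

free_ids=""
while IFS=', ' read -r idx util; do
    # keep only GPUs reporting zero utilization
    if [ "$util" -eq 0 ]; then
        free_ids="${free_ids:+$free_ids,}$idx"
    fi
done < <(nvidia-smi --query-gpu=index,utilization.gpu --format=csv,noheader,nounits)

echo "$free_ids"

The compute mode itself is set separately, as root, with something along the lines of nvidia-smi -c EXCLUSIVE_PROCESS on each node.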

There is most certainly a more elegant solution to this problem, but as a n00b building/configuring my first multi-node supercomputer, this was the only hack I managed to come up with. Our lab has been using this method for a year, and it works well for us.


On Nov 6, 2012, at 12:38 AM, Ross Walker <ross.rosswalker.co.uk> wrote:

Hi,

http://ambermd.org/gpus/#Running
" Ideally you would have a batch scheduling system that will set
everything up for you correctly "

In fact, PBS does just that with its PBS_GPUFILE, e.g.,
#PBS -l nodes=2:ppn=x:gpus=2
...
cat $PBS_GPUFILE
cat /var/spool/batch/torque/aux//517906.batch.edugpu
n0659-gpu1
n0659-gpu0
n0658-gpu1
n0658-gpu0

And a reliable PBS source indicates that the PBS_GPUFILE and its syntax
are stable.
When will pmemd support PBS_GPUFILE?

Please provide a workaround script that takes a $PBS_GPUFILE and spews
all the necessary environment variables to run on the specified GPUs.

Volunteers? - Should be pretty simple for some Bash whizz to figure this
out.

Although in this situation it is pretty simple since it is homogeneous. So
in pseudo code just:

1) grep for first node id > foo
2) extract last character from each line in foo > foo2
3) export CUDA_VISIBLE_DEVICES=contents of foo2
4) mpirun -np (line count in $PBS_GPUFILE) -option to export environment
variables $AMBERHOME/bin/pmemd.cuda.MPI
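
For example, a rough bash translation of those four steps might look like the sketch below. It assumes the $PBS_GPUFILE format shown above (one <node>-gpu<N> entry per line), an OpenMPI mpirun whose -x flag exports an environment variable to the remote ranks, and placeholder pmemd input/output file names:

#!/bin/bash
# 1) keep only the entries for the first node listed (homogeneous case,
#    so every node ends up with the same GPU ids)
firstnode=$(head -1 "$PBS_GPUFILE" | sed 's/-gpu[0-9]*$//')
grep "^${firstnode}-gpu" "$PBS_GPUFILE" > foo

# 2) extract the GPU id (the digits after "-gpu") from each line
sed 's/.*-gpu//' foo > foo2

# 3) build a comma-separated list for CUDA_VISIBLE_DEVICES
export CUDA_VISIBLE_DEVICES=$(paste -sd, foo2)

# 4) one MPI rank per GPU listed in $PBS_GPUFILE; -x is OpenMPI's flag
#    for exporting an environment variable to all ranks
ngpu=$(wc -l < "$PBS_GPUFILE")
mpirun -np "$ngpu" -x CUDA_VISIBLE_DEVICES \
    $AMBERHOME/bin/pmemd.cuda.MPI -O -i mdin -o mdout -p prmtop -c inpcrd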

That 'should' work. Alternatively, in your case, if you are using all the
GPUs in a node, i.e. you have 2 GPUs per node, then the following:

#PBS -l nodes=2:ppn=2:gpus=2

should, when run with mpirun -np 4, just 'do the right thing' (tm)
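
In job-script form, that whole-node case might look something like this sketch (input file names are placeholders, and it assumes the scheduler actually honors the gpus= request):

#!/bin/bash
#PBS -l nodes=2:ppn=2:gpus=2
# Whole-node sketch: every GPU on each node is reserved, so nothing needs
# to be masked via CUDA_VISIBLE_DEVICES; just start one MPI rank per GPU.
# The pmemd input/output file names below are placeholders.
cd "$PBS_O_WORKDIR"
mpirun -np 4 $AMBERHOME/bin/pmemd.cuda.MPI -O -i mdin -o mdout -p prmtop -c inpcrd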

All the best
Ross

/\
\/
|\oss Walker

---------------------------------------------------------
| Assistant Research Professor |
| San Diego Supercomputer Center |
| Adjunct Assistant Professor |
| Dept. of Chemistry and Biochemistry |
| University of California San Diego |
| NVIDIA Fellow |
| http://www.rosswalker.co.uk | http://www.wmd-lab.org |
| Tel: +1 858 822 0854 | EMail:- ross.rosswalker.co.uk |
---------------------------------------------------------

Note: Electronic Mail is not secure, has no guarantee of delivery, may not
be read every day, and should not be used for urgent or sensitive issues.

_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Tue Nov 06 2012 - 08:30:04 PST