On Mon, Jan 11, 2016 at 8:52 AM, Indrajit Deb <biky2004indra.gmail.com>
wrote:
> Dear All,
>
> I am submitting several pmemd.cuda jobs through PBS at a supercomputing
> facility. The facility has several nodes, each with two Tesla M2050 GPUs.
> I am using the following PBS script, according to the instructions given
> by the administrator:
>
> #!/bin/bash
> #PBS -q standard
> #PBS -l nodes=1:ppn=1:gpu2
> #PBS -l mem=62gb
> #PBS -l walltime=48:00:00
> # Set the path and load the appropriate modules
> module load amber/14
> # Lustre file system - shared between nodes
> mkdir -p /tmp/lustre_shared/$USER/$PBS_JOBID
> export TMPDIR=/tmp/lustre_shared/$USER/$PBS_JOBID
> # Move to the WORKING DIRECTORY
> cd $PBS_O_WORKDIR
> # Copy input data to the directory named by $TMPDIR
> cp prodrun2.in $TMPDIR
> cp rna.prmtop $TMPDIR
> cp prodrun1.rst $TMPDIR
> cp DIST.rst $TMPDIR
> # Move to the $TMPDIR
> cd $TMPDIR
> pmemd.cuda -O -i prodrun2.in -o prodrun2.out -p rna.prmtop -c prodrun1.rst \
>     -r prodrun2.rst -x prodrun2.nc -ref prodrun1.rst -inf prodrun2.info
> # We finish the calculation and transfer files to working directory
> cp -r $TMPDIR $PBS_O_WORKDIR
>
> Now the problem is that two jobs sometimes end up on the same node and on
> the same GPU. In that case the output is 21 ns/day for each job, but when
> a single job is running the output is 42 ns/day for that job.
>
You probably need to set the environment variable CUDA_VISIBLE_DEVICES to
point to the GPU that Torque assigned to your job. You should talk to the
help staff for your supercomputer to figure out how best to do that for
your queuing system.

One of the downsides of Torque/PBS (last time I checked) is that it has
rather poor GPU management facilities. It will assign you a specific GPU,
but then makes no effort to ensure the job actually uses that one. (It's a
difficult problem, to be sure.)
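
As a rough sketch of what that could look like in the job script, placed
before the pmemd.cuda line: this assumes your Torque installation was built
with GPU support and writes the assigned devices to the file named by
$PBS_GPUFILE, one "<hostname>-gpu<index>" entry per line. That format, and
the helper name gpus_for_host, are assumptions for illustration, so check
with your administrators what your site actually provides.

```bash
#!/bin/bash
# Sketch: derive CUDA_VISIBLE_DEVICES from the GPU list Torque wrote for this job.
# Assumption: $PBS_GPUFILE holds one "<hostname>-gpu<index>" entry per line.

# Print the comma-separated GPU indices listed for one host in a GPU file.
# (gpus_for_host is a hypothetical helper, not part of Torque or AMBER.)
gpus_for_host() {
    local gpufile=$1 host=$2
    # Keep entries that start with "<host>-gpu" and end in digits, then join
    # the indices with commas in the order they appear in the file.
    awk -v h="$host" '
        { p = h "-gpu" }
        substr($0, 1, length(p)) == p {
            idx = substr($0, length(p) + 1)
            if (idx ~ /^[0-9]+$/) print idx
        }' "$gpufile" | paste -sd, -
}

# Only touch CUDA_VISIBLE_DEVICES when the scheduler actually gave us a list.
if [ -n "${PBS_GPUFILE:-}" ] && [ -r "$PBS_GPUFILE" ]; then
    assigned=$(gpus_for_host "$PBS_GPUFILE" "$(hostname)")
    if [ -n "$assigned" ]; then
        export CUDA_VISIBLE_DEVICES="$assigned"
        echo "Using GPU(s): $CUDA_VISIBLE_DEVICES"
    fi
fi
```

If the host names in the GPU file are short names while hostname returns a
fully qualified name (or the other way around), nothing will match and the
variable is left alone, so it is worth printing the file once from a test
job to see what it contains.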
HTH,
Jason
--
Jason M. Swails
BioMaPS,
Rutgers University
Postdoctoral Researcher
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Mon Jan 11 2016 - 06:30:03 PST