[AMBER] Run Amber on a system with multiple GPUs

From: <peter.stauffert.boehringer-ingelheim.com>
Date: Fri, 2 Sep 2011 19:55:35 +0200

Hi All,

We are running multiple pmemd.cuda jobs on a system with several NVIDIA 2050
GPUs; the GPUs are set to exclusive mode via nvidia-smi.
Of course, we can force a job to run on a specific GPU with 'pmemd -gpu n' or
by setting the environment variable CUDA_VISIBLE_DEVICES="n".
But unfortunately our scheduling system (SGE) does not know which GPU is in
use; to SGE, a GPU is simply a consumable resource like a license.
When no GPU number is specified, pmemd.cuda always uses the GPU with the
highest device number, even if that GPU is busy, and fails with the error:
cudaMemcpyToSymbol: SetSim copy to cSim failed all CUDA-capable devices are
busy or unavailable

Is there an option that forces pmemd.cuda to use only idle GPUs?
Or is there a tool to test whether a GPU is already in use by another process?

Of course, we can start pmemd.cuda and, when it fails with the error message
above, restart it with a lower GPU number, but that is obviously not a good
solution.
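One direction we have considered is a wrapper script that picks the lowest
idle device before launching. This is only a sketch: it assumes the list of
busy GPU indices can be obtained somehow (e.g. by parsing nvidia-smi output,
whose format varies between driver versions), and the pmemd.cuda invocation
at the end is illustrative.

```shell
#!/bin/sh
# Sketch: pin a job to the lowest-numbered GPU that is not in the
# "busy" list. How the busy list is produced is left open; parsing
# nvidia-smi output is one option (format is version-dependent).

# first_free_gpu NUM_GPUS "BUSY_LIST"
# Prints the lowest device index not in the space-separated busy list;
# prints nothing and returns 1 if every GPU is busy.
first_free_gpu() {
    num=$1
    busy=" $2 "
    i=0
    while [ "$i" -lt "$num" ]; do
        case "$busy" in
            *" $i "*) ;;                 # this index is busy, try the next
            *) echo "$i"; return 0 ;;    # found an idle GPU
        esac
        i=$((i + 1))
    done
    return 1                             # all GPUs busy
}

# Hypothetical usage on a 4-GPU node (busy list from nvidia-smi parsing):
# busy="2 3"
# gpu=$(first_free_gpu 4 "$busy") && \
#     CUDA_VISIBLE_DEVICES=$gpu pmemd.cuda -O -i mdin ...
```

Even so, this is racy (two jobs can pick the same GPU between the check and
the launch), which is why a scheduler-level solution would be preferable.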

Kind regards,

Peter

Dr. Peter Stauffert
Boehringer Ingelheim Pharma GmbH & Co. KG
mailto:peter.stauffert.boehringer-ingelheim.com
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Fri Sep 02 2011 - 11:00:05 PDT