Damn. I thought this bug was fixed. I'll re-open it.
For now, just run single-GPU jobs. You don't gain much from running the
current release version of AMBER across multiple GPUs (this will change in
the future, but for now the gain is minimal). So just set all of your GPUs
to compute-exclusive mode by adding the following to /etc/rc.d/rc.local:
nvidia-smi -pm 1
nvidia-smi -c 3
Then reboot. The first line turns on persistence mode and the second sets
compute mode 3 (exclusive process), so each GPU accepts only one process
at a time.
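After the reboot it is worth confirming that the setting took effect. The
following is only a sketch: check_exclusive is a hypothetical helper, and it
assumes that your driver's nvidia-smi supports --query-gpu and reports the
mode as the string Exclusive_Process.

```shell
#!/bin/sh
# check_exclusive (hypothetical helper): reads "index, compute_mode" CSV
# lines on stdin and succeeds only if every GPU reports Exclusive_Process.
check_exclusive() {
  awk -F',' '
    { mode = $2; gsub(/^[ \t]+|[ \t]+$/, "", mode) }   # trim whitespace
    mode != "Exclusive_Process" { bad = 1 }            # any other mode fails
    END { exit (bad || NR == 0) }                      # no GPUs listed also fails
  '
}

# Only query the driver if nvidia-smi is actually installed.
if command -v nvidia-smi >/dev/null 2>&1; then
  nvidia-smi --query-gpu=index,compute_mode --format=csv,noheader \
    | check_exclusive \
    && echo "all GPUs are in exclusive-process mode" \
    || echo "at least one GPU is not in exclusive-process mode"
fi
```

If the check fails, plain "nvidia-smi -q" shows the compute mode of each GPU.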
Then you can run four independent calculations at once, and AMBER will just
'Do the Right Thing' when it comes to automatically selecting available
GPUs (no need for CUDA_VISIBLE_DEVICES), and all calculations should run at
full speed.
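As a rough sketch of what that looks like in practice: the directory names
(run1 to run4), the input file names and the launch_runs helper below are
all made up for illustration, and ENGINE defaults to the serial GPU
executable pmemd.cuda. Adjust them to your own setup.

```shell
#!/bin/sh
# Hypothetical layout: each run directory holds its own mdin, prmtop, inpcrd.
ENGINE="${ENGINE:-pmemd.cuda}"   # single-GPU engine; override for testing

# launch_runs (hypothetical helper): start one job per directory in the
# background, then wait for all of them to finish.
launch_runs() {
  for dir in "$@"; do
    (
      cd "$dir" || exit 1
      # CUDA_VISIBLE_DEVICES is deliberately not set: with the GPUs in
      # exclusive mode, each job is expected to land on a free GPU.
      "$ENGINE" -O -i mdin -o mdout -p prmtop -c inpcrd -r restrt -x mdcrd
    ) &
  done
  wait
}

# Example call (commented out so that sourcing this file starts nothing):
# launch_runs run1 run2 run3 run4
```

Each job writes its output into its own directory, so the four runs do not
overwrite each other's files.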
All the best
Ross
On 8/7/13 10:07 AM, "Kyle Sutherland-Cash" <khs26.cam.ac.uk> wrote:
>Hello,
>
>I saw a similar thread around November 2012 in which running these tests
>caused crashes. However, I'm running the cuda_parallel test suite and it
>doesn't seem to be implementing the NMR restraints properly every step.
>
>Example output from the diff is here:
>
>---------------------------------------------------------------------------
>
>possible FAILURE: check mdout.dif
>/home/khs26/amber12/test/cuda/nmropt/pme/nmropt_1_torsion
>131c131
>< NMR restraints: Bond = 0. Angle = 0. Torsion = 0.016
>> NMR restraints: Bond = 0. Angle = 0. Torsion = 0.
>149c149
>< NMR restraints: Bond = 0. Angle = 0. Torsion = 0.053
>> NMR restraints: Bond = 0. Angle = 0. Torsion = 0.
>167c167
>< NMR restraints: Bond = 0. Angle = 0. Torsion = 0.176
>> NMR restraints: Bond = 0. Angle = 0. Torsion = 0.
>
>---------------------------------------------------------------------------
>
>The tests on our machine don't seem to apply the torsion restraint properly
>in every step. It does work in some of the steps and the behaviour is the
>same for bond and angle restraints (sometimes it works, sometimes it
>doesn't). Is this a known bug?
>
>The corresponding (serial) CUDA tests ran just fine.
>
>The machine has 4xK20 GPUs and I compiled with ifort 13.1, mpich2 and CUDA
>5.5. I was running the tests with 4 parallel threads.
>
>Regards,
>
>Kyle
>
>--
>Kyle Sutherland-Cash
>PhD student, Wales group
>
>Department of Chemistry
>University of Cambridge
>Cambridge
>United Kingdom
>CB2 1DQ
>_______________________________________________
>AMBER mailing list
>AMBER.ambermd.org
>http://lists.ambermd.org/mailman/listinfo/amber
Received on Wed Aug 07 2013 - 11:00:04 PDT