Re: [AMBER] GPUs parallel problem

From: Ross Walker <ross.rosswalker.co.uk>
Date: Thu, 4 May 2017 11:16:03 -0400

>
> I guess the short answer is "you don't". Apart from what Andreas sent, I have two GTX-1070 GPUs and the best parallelization performance I could achieve so far was 30% performance for the second card. Benchmarks at the Amber website suggest a similar picture:
>
> http://ambermd.org/gpus/benchmarks.htm
>
> In addition to software limitations, by running four GPUs at once you're likely to fry your machine (assuming all the cards are on a single unit) due to excessive amounts of heat generated, or surpass power supply capacity. For a membrane protein system, 2 GPUs were enough to cause our unit to shut down, although it might not be the case necessarily for everyone.

This is simply not true. You can happily run 4 GPUs in a box at once. Hell, I've even built machines with 20 GPUs in a single box and run them all flat out. Now that assumes you built or specced the system correctly. For example, the 4-, 8- and 10-way Exxact GPU systems I designed with them for AMBER (https://exxactcorp.com/index.php/solution/solu_list/65) are all specced to run the GPUs flat out, and the CPU as well if you want. I've even built clusters with hundreds of GPUs in them (both GeForce and Tesla) with no issues. Similarly, the self-build machine I spec here (http://ambermd.org/gpus/recommended_hardware.htm#diy) will happily run with 3 GPUs. If you use a different case you can even get 4 GPUs in there, no problem. This was the basis of the design used for the NVIDIA Digits Dev Box (https://exxactcorp.com/deep-learning-workstations-servers.php).

If your system is shutting down during a run when using multiple GPUs then it is just badly designed.

Granted, scaling to 4 GPUs only really works for large GB runs, and scaling to 2 GPUs, even with P2P, is not great on modern GPUs. But there is nothing to stop you running 4 independent calculations all at the same time - they will all run at full speed and, assuming you designed the system correctly or bought an AMBER certified system like those I link to above, you will have no issues. One of the great things about the way the AMBER GPU code is designed, compared with Gromacs and NAMD, which lean heavily on the CPU in addition to the GPU, is that GPU jobs barely touch the CPU. As such, one can load up nodes with lots of GPUs and save substantial money on all the ancillary components that would be needed for buying multiple 1-GPU nodes (massively so if you use GeForce GTX 1080 Ti cards at ~$699 each).
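
To make the "4 independent calculations" point concrete, here is a minimal sketch of one way to start one single-GPU job per card. It assumes pmemd.cuda is on your PATH and pins each job to a card through the CUDA_VISIBLE_DEVICES environment variable. The directory names (run0, run1, ...) and the file names (md.in, prmtop, inpcrd and so on) are made up for illustration, so substitute your own.

```python
import os
import subprocess


def build_job(gpu_id, run_dir, base_env=None):
    """Return (argv, env, cwd) for one single-GPU pmemd.cuda run.

    File names below are hypothetical placeholders, not AMBER defaults
    you must use.
    """
    env = dict(os.environ if base_env is None else base_env)
    # Expose exactly one card to this process so jobs do not share a GPU.
    env["CUDA_VISIBLE_DEVICES"] = str(gpu_id)
    argv = [
        "pmemd.cuda", "-O",
        "-i", "md.in",      # MD control input
        "-p", "prmtop",     # topology
        "-c", "inpcrd",     # starting coordinates
        "-o", "md.out",     # text output
        "-r", "md.rst",     # restart file
        "-x", "md.nc",      # trajectory
    ]
    return argv, env, run_dir


def launch_all(run_dirs):
    """Start one job per directory, GPU 0 for the first, GPU 1 for the next, etc."""
    procs = []
    for gpu_id, run_dir in enumerate(run_dirs):
        argv, env, cwd = build_job(gpu_id, run_dir)
        procs.append(subprocess.Popen(argv, env=env, cwd=cwd))
    # Wait for every job and collect the exit codes.
    return [p.wait() for p in procs]
```

Calling launch_all(["run0", "run1", "run2", "run3"]) on a 4 GPU box would start four jobs at once, each seeing only its own card. Since each job is an ordinary single-GPU run, none of them depends on P2P or multi-GPU scaling.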

All the best
Ross
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Thu May 04 2017 - 08:30:03 PDT