Re: [AMBER] Select cuda ID device in PMEMD

From: Ross Walker <ross.rosswalker.co.uk>
Date: Fri, 18 Nov 2011 09:41:13 -0800

Hi Gonzalo,

> Just a final tip following on this: running 6 replicas of the same job
> using the serial pmemd.cuda was infinitely better from the performance
> point of view: 95% of GPUs utilization and 6.6 ns/day per job [that
> means 6*6.5=39.5 ns/day on the whole node] (versus 8.5 ns/day with a
> single 2-GPU-1-node job and 10.3 ns/day with a single 3-GPU-1-node
> job). In conclusion, whatever you do by running more than one MPI job
> in the same node is a disaster.

Yes. What happens is that for serial GPU jobs there is no need to communicate between the GPUs. AMBER is designed such that all the calculation is run on the GPU, so the only time a serial job communicates over the PCI-E bus is when it does I/O. Thus as long as you set ntpr, ntwx etc. reasonably high, say ntpr=2000, ntwx=2000, ntwr=100000, ntwe=0, ntwv=0, then the performance of a single-GPU run will be largely unaffected by the actual PCI-E bus speed.
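
To make that concrete, here is a rough sketch of the six independent serial runs - the output settings are the ones above, while everything else in the &cntrl namelist, the file names, and the use of CUDA_VISIBLE_DEVICES to pin each run to a card are just placeholders you would adapt to your own setup:

# Minimal-I/O mdin: only ntpr/ntwx/ntwr/ntwe/ntwv matter for the PCI-E
# argument; the remaining settings are purely illustrative.
cat > mdin <<'EOF'
Production, infrequent output
 &cntrl
  imin=0, nstlim=250000, dt=0.002,
  ntt=3, gamma_ln=1.0, temp0=300.0,
  ntb=2, ntp=1, ntc=2, ntf=2, cut=8.0,
  ntpr=2000, ntwx=2000, ntwr=100000,
  ntwe=0, ntwv=0,
 /
EOF

# One independent serial pmemd.cuda run per GPU, each pinned to its own card
# (assumes directories run0..run5 each containing prmtop and inpcrd).
for gpu in 0 1 2 3 4 5; do
  ( cd run$gpu && CUDA_VISIBLE_DEVICES=$gpu $AMBERHOME/bin/pmemd.cuda \
      -O -i ../mdin -p prmtop -c inpcrd -o mdout -r restrt -x mdcrd ) &
done
wait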

When you run a single job in parallel across multiple GPUs, however, each GPU needs to communicate through CPU memory and across the PCI-E bus, so the PCI-E speed becomes critical. In your case the first 3-GPU run uses most of the limited PCI-E bandwidth the machine has. When you fire up a second 3-GPU run it simply saturates the PCI-E interface, both sets of jobs get starved of bandwidth, and the whole thing dies a horrible death. One can tweak performance by checking which IOH chips the various GPUs are on and how they are interconnected, but either way a machine can only drive 4 GPUs at full speed, so with 6 in a node you are never going to be able to use them all effectively for anything other than single-GPU runs.

With 4 GPUs in a node, two on each IOH chip at full x16 speed, what you should do is run the first 2-GPU job on the two GPUs on IOH0 and the second 2-GPU job on the two GPUs on IOH1. One has to do some detective work to figure this out though. Best to ask whoever runs the machine to give you the affinity of each GPU. They can open it up, look at the GPU serial numbers, figure out which slot is attached to which IOH chip, and then map this to the actual software GPU IDs. This is something that should have been done by whoever built the machine in the first place.
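
If you can get a shell on the node you can do some of that detective work yourself with standard Linux/NVIDIA tools - nothing AMBER-specific here, and the exact output format varies with the driver version:

# PCI addresses of the NVIDIA cards, and the PCI tree showing which
# root complex (IOH) each one hangs off.
lspci | grep -i nvidia
lspci -tv

# Per-GPU details as the driver sees them (look for the "Bus Id" lines).
nvidia-smi -q | grep -i "bus id"

# Note: the CUDA runtime's device numbering (what pmemd.cuda uses) is not
# guaranteed to match the PCI or nvidia-smi ordering. The deviceQuery
# sample from the CUDA SDK reports the PCI bus location for each CUDA
# device ID, which lets you tie the two numberings together.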

> Our only alternative to use effectively these 6-GPUs nodes is to run
> serial pmemd.cuda, or in the worst case to run a single 3-GPU-1-node
> job (but we will be wasting half of the node).

One alternative that 'might' work, depending on how the machine is actually wired and how it responds to multiple GPUs in use (i.e. does it 'share' the bandwidth or does it 'divide' it - a subtle but important difference), is to run a single parallel job and a bunch of serial jobs. I.e. you could probably run a 3-GPU job and 3 single-GPU runs without too much of a hit, assuming the single-GPU jobs do not write to I/O very often. You could also probably run 2 x 2-GPU jobs and 2 x 1-GPU jobs without too much of a hit, as long as each of the 2-GPU parallel jobs is confined to GPUs on the same IOH controller. See the sketch below.
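
As a sketch of the 2 x 2-GPU plus 2 x 1-GPU case, assuming you have worked out that CUDA devices 0,1 sit on IOH0 and 2,3 on IOH1 (those IDs are just an example - your mapping may well differ) and that GPUs are selected via CUDA_VISIBLE_DEVICES as in the serial example above:

# Two 2-GPU parallel jobs, each confined to one IOH, plus two serial jobs
# on the remaining cards. One MPI rank per GPU for pmemd.cuda.MPI.
( cd job1 && CUDA_VISIBLE_DEVICES=0,1 mpirun -np 2 $AMBERHOME/bin/pmemd.cuda.MPI \
    -O -i mdin -p prmtop -c inpcrd -o mdout -r restrt -x mdcrd ) &
( cd job2 && CUDA_VISIBLE_DEVICES=2,3 mpirun -np 2 $AMBERHOME/bin/pmemd.cuda.MPI \
    -O -i mdin -p prmtop -c inpcrd -o mdout -r restrt -x mdcrd ) &
( cd job3 && CUDA_VISIBLE_DEVICES=4 $AMBERHOME/bin/pmemd.cuda \
    -O -i mdin -p prmtop -c inpcrd -o mdout -r restrt -x mdcrd ) &
( cd job4 && CUDA_VISIBLE_DEVICES=5 $AMBERHOME/bin/pmemd.cuda \
    -O -i mdin -p prmtop -c inpcrd -o mdout -r restrt -x mdcrd ) &
wait

Keep the output settings in each mdin as sparse as in the serial case, since the serial jobs only touch the bus when they write.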

Good luck... Aren't these wonderful 'break-out-box stoopercomputers' fun? I really wish the marketing people who flog these were forced to actually use them.

P.S. There is nothing to stop you configuring an MD SimCluster with GTX580s in it, and you can skip the IB interconnect to save money if you only want to run jobs within nodes (i.e. 2 x GTX580s per node). Shoot me an email off list and I can put you in contact with the right people who can give you some pricing on various options.

All the best
Ross

/\
\/
|\oss Walker

---------------------------------------------------------
| Assistant Research Professor |
| San Diego Supercomputer Center |
| Adjunct Assistant Professor |
| Dept. of Chemistry and Biochemistry |
| University of California San Diego |
| NVIDIA Fellow |
| http://www.rosswalker.co.uk | http://www.wmd-lab.org/ |
| Tel: +1 858 822 0854 | EMail:- ross.rosswalker.co.uk |
---------------------------------------------------------

Note: Electronic Mail is not secure, has no guarantee of delivery, may not be read every day, and should not be used for urgent or sensitive issues.

_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Fri Nov 18 2011 - 10:00:03 PST