Re: [AMBER] Select cuda ID device in PMEMD

From: Gonzalo Jimenez <gjimenez.chem.ucla.edu>
Date: Thu, 17 Nov 2011 22:19:15 -0800

Hi Ross and Jason,

Your info was extremely useful, and I finally made it work, although the performance was crappy, as Ross noted...

Correction: The code is set up to not use any already-used GPU within the
same run. It cannot determine whether GPUs are in use by other codes or by
other instances of AMBER.

Right!

Exactly. Here you are running both calculations on GPUs 0 to 2, so each GPU
is running two calculations, hence the slowdown. What you want to do is set:

export CUDA_VISIBLE_DEVICES="0,1,2"

before running the first job and then

export CUDA_VISIBLE_DEVICES="3,4,5"

before running the second job.

That definitely worked!
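
For the record, the full launch pattern I ended up with looks roughly like
this (just a sketch: the input and output names are placeholders, and in
practice each job runs from its own directory so the restart files do not
clash):

# first job on physical GPUs 0-2
export CUDA_VISIBLE_DEVICES="0,1,2"
mpirun -np 3 pmemd.cuda.MPI -O -i md.in -p prmtop -c inpcrd -o md_gpu012.out &

# second job on physical GPUs 3-5; re-exporting here does not affect the
# job that is already running, since it got its own copy of the environment
export CUDA_VISIBLE_DEVICES="3,4,5"
mpirun -np 3 pmemd.cuda.MPI -O -i md.in -p prmtop -c inpcrd -o md_gpu345.out &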

You can check GPU utilization with the command

nvidia-smi -a

This was extremely useful!
When running one mpirun -np 3 job, the utilization of GPUs 0,1,2 was 62%, 66%, 69%, and I got 10.28 ns/day for a 68,000-atom system, which is not that bad...
But when I ran the second mpirun -np 3 job on the same node a while later, things turned very bad: the utilization of GPUs 0,1,2,3,4,5 dropped horribly to 6%, 6%, 6%, 7%, 5%, 6%, and I only got 1.30 ns/day. That's 10 times worse!!!
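
While the jobs were running I just kept that query looping in another terminal, with something like:

watch -n 10 nvidia-smi -a

(the utilization numbers above come from the Utilization section of that output, refreshed every 10 seconds).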

Note that the CUDA_VISIBLE_DEVICES setting actually rebases the hardware GPU
IDs, so in the second case here the code will still see the GPUs as IDs 0, 1
and 2 even though those physically correspond to GPUs 3, 4 and 5.

Wow, good to know! I had always been puzzled that the outputs show the same ID numbers, but now I know the reason.
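
For anyone who wants to see the rebasing directly, the deviceQuery sample that ships with the CUDA SDK shows it (a quick sketch, assuming the sample is built):

export CUDA_VISIBLE_DEVICES="3,4,5"
./deviceQuery    # only three devices are reported, enumerated as 0, 1 and 2

which is why the AMBER outputs always print the same device numbers.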

Note though that if you have 6 GPUs in a node this means, given current
PCI-E gen 2 limitations, that the GPUs are almost certainly sharing x16
channels. The GPUs will not all have dedicated x16 PCI-E bandwidth and so
parallel GPU performance is likely to be pretty terrible.

Exactly, and quite disappointing!
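
In case it helps others, one first check of the link width each card actually negotiated is lspci (you may need root for the full verbose output; 10de is NVIDIA's PCI vendor ID):

lspci -vv -d 10de: | grep -E 'LnkCap|LnkSta'

LnkCap is what the slot/card can do and LnkSta is what was actually negotiated, so a card running at less than x16 shows up there. Note that boxes which share bandwidth through a PCI-E switch can still report x16 per card, so this is only a first check.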

Unfortunately these multiple-GPU-per-node machines are terribly designed, but
marketers keep flogging them without any real regard for, or understanding of,
their actual performance.

Right, this is what our CUDA expert at Houk's lab (UCLA), Narcis, always complains about...

This is the reason we created the MD SimCluster program. See:
http://ambermd.org/news.html#simcluster and
http://exxactcorp.com/testdrive/md/ for a machine that is well designed,
balanced and optimized for best performance with AMBER (and most parallel
GPU codes for that matter). Best advice is to avoid the likes of Dell etc.
with their bargain-basement GPU breakout boxes.

Yes, this kind of project is what we really need to improve performance... We'll have a close look at it... Unfortunately, the philosophy at the supercomputer centers (we use UCLA, SDSC and NCSA clusters) is quite different, maybe for more than respectable economic reasons.

Note if you already have the physical hardware you might want to consider
pulling two of the GPUs out of each node so each node has 4 GPUs, with all
the GPUs in their own dedicated physical x16 slot. This is really the only
way at present (until Intel fixes their QPI to conform to the PCI-E spec and
allows peer-to-peer transfers) to get reasonable performance in parallel
across multiple GPUs.

Well, we do not have proprietary nodes yet, and we're planning to buy simple boxes with modest non-Tesla GPUs, which our expert (Narcis) knows pretty well and which are much cheaper. But we can talk about SimCluster and prices offline, since we are planning to buy something affordable and powerful enough for the group these days. We're not really convinced about the performance/price ratio of the Teslas, but we can discuss it.


Again, thanks a lot for all this useful and comprehensive info!


Best regards,

Gonzalo Jimenez
Postdoc Fellow, Houk’s lab
Dept. Chemistry & Biochemistry, UCLA






All the best
Ross

/\
\/
|\oss Walker

---------------------------------------------------------
| Assistant Research Professor |
| San Diego Supercomputer Center |
| Adjunct Assistant Professor |
| Dept. of Chemistry and Biochemistry |
| University of California San Diego |
| NVIDIA Fellow |
| http://www.rosswalker.co.uk | http://www.wmd-lab.org/ |
| Tel: +1 858 822 0854 | EMail:- ross.rosswalker.co.uk |
---------------------------------------------------------

Note: Electronic Mail is not secure, has no guarantee of delivery, may not
be read every day, and should not be used for urgent or sensitive issues.





_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Thu Nov 17 2011 - 22:30:04 PST