Re: [AMBER] Amber CUDA calculation on GeForce GTX 590 ?

From: Ross Walker <>
Date: Wed, 30 Mar 2011 15:48:37 -0700

Hi Filip,

Yes, I think you are getting confused about what the GTX590 is. It is essentially 2 down-clocked GTX580s, each with 1.5GB of RAM, placed on the same physical PCB. SLI is of absolutely no use for computation: it is not a data channel and there is no way to send useful data across it. Reports of the GTX590 being 1.5x faster than a GTX580 are based on simple gaming benchmarks where the two individual GPUs render alternate frames. You could effectively consider this as each chip on the GTX590 running at 75% of the speed of a GTX580 chip. In practice the chips are better than that; the lack of perfect scaling comes from splitting the card's single PCI-E x16 connection to the bus between the two chips, so each effectively gets x8.
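The arithmetic behind those two numbers can be written out as a rough sketch. The function names are illustrative only, and the even split of speed and of PCI-E lanes between the two chips is the simplifying assumption made above, not a measurement.

```python
# Back-of-the-envelope numbers for a dual-GPU card, assuming the reported
# speedup and the PCI-E lanes are shared evenly between the chips.

def per_chip_fraction(reported_speedup: float, n_chips: int = 2) -> float:
    """Speed of one chip relative to the reference GPU, given the
    speedup reported for the whole card."""
    return reported_speedup / n_chips


def lanes_per_chip(total_lanes: int = 16, n_chips: int = 2) -> int:
    """Effective PCI-E lanes available to each chip when one slot is shared."""
    return total_lanes // n_chips


if __name__ == "__main__":
    print(per_chip_fraction(1.5))  # fraction of a GTX580 per GTX590 chip
    print(lanes_per_chip())        # effective lanes per chip
```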

> GTX580 model that will be faster than GTX590. Secondly if you run one
> job, ok in parallel, we will have again totally 3GB and you can run XXX
> thousands atoms simulation.

Running in parallel does NOT (at present) combine the GPU memory. I.e. if the limit is, say, 800K atoms in serial on one GPU, running in parallel still gives you a limit of 800K atoms, as there is a lot of duplication between the memory of the two GPUs and at present there is no easy way to avoid that. One could split the storage of atoms between GPU memories, similar to what happens on the CPU in parallel, but the communication overhead would negate any speedup benefit. So what you have here is essentially 2 independent, slightly underclocked 1.5GB GTX580s in the same physical package.

> Thus my question here is probably for Ross:
> 1) How you expect the individual GPUs of GTX590 to scale in an Amber
> job?

I would expect the scaling when running in parallel to be similar to running across 2 GTX580s in the same box. Scaling is helped on one side because the individual GPUs are slightly slower than GTX580s, but it is hurt on the other side by the fact that the bus to each GTX590 chip is effectively half (x8) that of a GTX580 (x16). The overall result is that I would expect scaling across the two GPUs in a GTX590 to be worse than across 2 separate GTX580s. There is NO secret interconnect sauce inside the GTX590 package.

> I can not believe that if now two individual C2050 scales 1.4 it will
> be the same in the case of GTX590 cores. I don’t want to believe too
> that buying GTX590 we only safe one more slot when using Amber.

Yes, the scaling will not be as good, because 1) the cores are actually faster than those of a C2050 and there are more of them, so throughput on a single GPU is higher and therefore more communication is needed; 2) the interconnect between the two GTX590 cores is only PCI-E x8 instead of PCI-E x16 as in the C2050 case.

However, if you run 2 single-GPU jobs on a GTX590, one on each GPU, they should run pretty well, with minimal slowdown of either job, as long as you do not write to disk too often.
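As a rough sketch of running one independent job per GPU, the launcher below pins each serial pmemd.cuda process to a single device through CUDA_VISIBLE_DEVICES. The file-name prefixes (job_a, job_b) and the helper names are hypothetical, and the input/output file layout is an assumption; adapt them to your own runs.

```python
import os
import subprocess


def job_env(gpu_id, base_env=None):
    """Copy of the environment in which only one GPU is visible to the job."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_VISIBLE_DEVICES"] = str(gpu_id)
    return env


def pmemd_command(prefix):
    """Serial GPU command line; the <prefix>.* file names are hypothetical."""
    return [
        "pmemd.cuda", "-O",
        "-i", prefix + ".in",      # MD control input
        "-p", prefix + ".prmtop",  # topology
        "-c", prefix + ".inpcrd",  # starting coordinates
        "-o", prefix + ".out",     # output log
        "-r", prefix + ".rst",     # restart file
    ]


def launch_one_job_per_gpu(prefixes):
    """Start one independent job per GPU ID (0, 1, ...) and wait for all."""
    procs = [
        subprocess.Popen(pmemd_command(prefix), env=job_env(gpu_id))
        for gpu_id, prefix in enumerate(prefixes)
    ]
    return [p.wait() for p in procs]
```

Calling launch_one_job_per_gpu(["job_a", "job_b"]) would start the first job on GPU ID 0 and the second on GPU ID 1. Each process sees only its own device, so the two runs do not compete for the same chip.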

> 2) My second question is: how actually to run GTX590, using cuda.MPI
> or? Here I am a bit confused too because if I am not wrong I saw Amber
> tests on GTX295 before parallel Cuda version of Amber to be released.

It is exactly the same as if you had 2 GTX580s in a single box. Run cudaDeviceQuery and you will see two GPU IDs reported. When you run the MPI code with mpirun -np 2, it will place the first GPU thread on GPU ID 0 and the second on GPU ID 1. It makes no difference what the physical package looks like. You can also use CUDA_VISIBLE_DEVICES to control this. Short answer = just treat it like you had 2 individual 1.5GB GPUs in the same box.
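A minimal sketch of the parallel case, assuming the MPI build is installed as pmemd.cuda.MPI and that your MPI launcher passes the environment through to ranks on the local node (this depends on the MPI implementation). The helper name and the file-name prefix are hypothetical.

```python
def mpi_job(prefix, gpu_ids=(0, 1)):
    """Return (extra environment, command line) for one MPI run that uses
    one rank per listed GPU ID. The <prefix>.* file names are hypothetical."""
    extra_env = {
        # Restrict the run to the listed devices, in this order.
        "CUDA_VISIBLE_DEVICES": ",".join(str(g) for g in gpu_ids),
    }
    cmd = [
        "mpirun", "-np", str(len(gpu_ids)),
        "pmemd.cuda.MPI", "-O",
        "-i", prefix + ".in",
        "-p", prefix + ".prmtop",
        "-c", prefix + ".inpcrd",
        "-o", prefix + ".out",
        "-r", prefix + ".rst",
    ]
    return extra_env, cmd


if __name__ == "__main__":
    env, cmd = mpi_job("md")
    print(env)
    print(" ".join(cmd))
```

To actually start the run, merge extra_env into a copy of os.environ and pass it together with cmd to subprocess.run.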

Hope that helps.

Good luck,

|\oss Walker

| Assistant Research Professor |
| San Diego Supercomputer Center |
| Adjunct Assistant Professor |
| Dept. of Chemistry and Biochemistry |
| University of California San Diego |
| NVIDIA Fellow |
| | |
| Tel: +1 858 822 0854 | EMail:- |

Note: Electronic Mail is not secure, has no guarantee of delivery, may not be read every day, and should not be used for urgent or sensitive issues.

AMBER mailing list
Received on Wed Mar 30 2011 - 16:00:02 PDT