Re: [AMBER] Amber CUDA calculation on GeForce GTX 590?

From: filip fratev <filipfratev.yahoo.com>
Date: Wed, 30 Mar 2011 16:17:34 -0700 (PDT)

Hi Ross,
Ok, thank you for the detailed answer! Now everything is clear. Once drivers are available I will post some benchmark results here.

Regards,
Filip


--- On Thu, 3/31/11, Ross Walker <ross.rosswalker.co.uk> wrote:

> From: Ross Walker <ross.rosswalker.co.uk>
> Subject: Re: [AMBER] Amber CUDA calculation on GeForce GTX 590?
> To: "'AMBER Mailing List'" <amber.ambermd.org>
> Date: Thursday, March 31, 2011, 1:48 AM
> Hi Filip,
>
> Yes, I think you are getting confused as to what the GTX590
> is here. It is essentially two clocked-down GTX580s, each
> with 1.5 GB of RAM, placed on the same physical PCB. SLI is
> of absolutely no use for computation; it is not a data
> channel and there is no way to send useful data across it.
> Reports of it being 1.5x faster than a GTX580 are based on
> simple gaming benchmarks where the two individual GPUs
> render alternate frames. You could effectively consider this
> to be each chip on the GTX590 running at 75% the speed of a
> GTX580 chip. The chips themselves are better than that; the
> lack of perfect scaling comes from the fact that the two
> chips share a single PCI-E x16 connection to the bus, so
> each effectively gets x8.
>
> > GTX580 model that will be faster than the GTX590. Secondly,
> > if you run one job in parallel, we will again have 3 GB in
> > total and you can run a simulation of XXX thousand atoms.
>
> Running in parallel does NOT (at present) combine the GPU
> memory. I.e. if the limit is, say, 800K atoms in serial on
> one GPU, running in parallel still gives you a limit of 800K
> atoms, as there is a lot of duplication between the memory
> of the two GPUs and at present there is no easy way to avoid
> that. One could split the storage of atoms between the GPU
> memories, similar to what happens on the CPU in parallel,
> but the communication overhead would negate any speedup
> benefit. So what you have here is essentially two
> independent, slightly underclocked 1.5 GB GTX580s in the
> same physical package.
>
> > Thus my question here is probably for Ross:
> > 1) How do you expect the individual GPUs of the GTX590 to
> > scale in an Amber job?
>
> I would expect the scaling when running in parallel to be
> similar to running across two GTX580s in the same box. The
> scaling is helped on one side because the individual GPUs
> are slightly slower than GTX580s, but it is hurt on the
> other side by the fact that the bus to each GTX590 chip is
> effectively half (x8) that of a GTX580 (x16). The overall
> result is that I would expect scaling across the two GPUs in
> a GTX590 to be worse than across two separate GTX580s. There
> is NO secret interconnect sauce inside the GTX590 package.
>
> > I cannot believe that if two individual C2050s currently
> > scale at 1.4x, it will be the same for the GTX590 cores. I
> > also don't want to believe that by buying a GTX590 we only
> > save one more slot when using Amber.
>
> Yes, the scaling will not be as good because 1) the cores
> are actually faster than a C2050's, and there are more of
> them, so throughput on a single GPU is higher and therefore
> more communication is needed; and 2) the interconnect
> between the two GTX590 GPUs is only PCI-E x8 instead of
> PCI-E x16 as in the C2050 case.
>
> However, if you run two single-GPU jobs on a GTX590, one on
> each GPU, it should work pretty well, with minimal slowdown
> of either job as long as you do not write to disk too often.
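>
> For example, something along these lines (just a sketch,
> assuming AMBER 11's serial pmemd.cuda binary; the input and
> output file names here are placeholders):
>
>   # run one job on each chip of the GTX590 by restricting
>   # which device each process is allowed to see
>   CUDA_VISIBLE_DEVICES=0 $AMBERHOME/bin/pmemd.cuda -O -i md.in \
>       -p prmtop -c inpcrd -o md_gpu0.out -r md_gpu0.rst &
>   CUDA_VISIBLE_DEVICES=1 $AMBERHOME/bin/pmemd.cuda -O -i md.in \
>       -p prmtop -c inpcrd -o md_gpu1.out -r md_gpu1.rst &
>   wait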
>
> > 2) My second question is: how do you actually run on a
> > GTX590, using cuda.MPI or something else? Here I am a bit
> > confused too, because if I am not wrong I saw Amber tests
> > on a GTX295 before the parallel CUDA version of Amber was
> > released.
>
> It is exactly the same as if you had 2 GTX580s in a single
> box. Run cudaDeviceQuery and you will see two GPU IDs
> reported. When you run the MPI code with mpirun -np 2, it
> will place the first GPU thread on GPU ID 0 and the second
> on GPU ID 1. It makes no difference what the physical
> package looks like. You can also use CUDA_VISIBLE_DEVICES to
> control this. Short answer: just treat it like you had two
> individual 1.5 GB GPUs in the same box.
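>
> Concretely, a parallel run across both chips might look like
> this (again just a sketch, assuming the parallel CUDA build
> of AMBER 11, pmemd.cuda.MPI, and placeholder file names):
>
>   # expose both GTX590 chips and start one MPI rank per GPU
>   export CUDA_VISIBLE_DEVICES=0,1
>   mpirun -np 2 $AMBERHOME/bin/pmemd.cuda.MPI \
>       -O -i md.in -p prmtop -c inpcrd -o md_2gpu.out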
>
> Hope that helps.
>
> Good luck,
> Ross
>
> /\
> \/
> |\oss Walker
>
> ---------------------------------------------------------
> |             Assistant Research Professor              |
> |            San Diego Supercomputer Center             |
> |             Adjunct Assistant Professor               |
> |          Dept. of Chemistry and Biochemistry          |
> |          University of California San Diego           |
> |                     NVIDIA Fellow                     |
> | http://www.rosswalker.co.uk | http://www.wmd-lab.org/ |
> | Tel: +1 858 822 0854 | EMail:- ross.rosswalker.co.uk  |
> ---------------------------------------------------------
>
> Note: Electronic Mail is not secure, has no guarantee of
> delivery, may not be read every day, and should not be used
> for urgent or sensitive issues. 
>
>
>
>
>
> _______________________________________________
> AMBER mailing list
> AMBER.ambermd.org
> http://lists.ambermd.org/mailman/listinfo/amber
>


      

_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Wed Mar 30 2011 - 16:30:03 PDT