Hi Sasha,
> has anyone done any comparison testing to determine the optimal number
> of CPU cores per executing GPU?
I have not fully tested this. However, the code is currently single-threaded with respect to a single GPU: each GPU calculation runs on a single core, so technically you just need the same number of cores as GPUs. The same will be true when we move to multi-GPU support within a single run, which will still use 1 thread per GPU.
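As a rough sketch of the one-core-per-GPU pairing described above (this is not how the code itself does it; the helper name `plan_gpu_jobs` is made up for illustration, and I am assuming each independent run is restricted to its card via the standard CUDA environment variable `CUDA_VISIBLE_DEVICES`):

```python
import os


def plan_gpu_jobs(num_gpus, num_cores=None):
    """Pair each GPU with one dedicated CPU core (one thread per GPU).

    Hypothetical helper: returns one entry per independent GPU run,
    with the environment that would restrict the run to that card.
    """
    if num_cores is None:
        num_cores = os.cpu_count() or 1
    if num_gpus > num_cores:
        raise ValueError("need at least one CPU core per GPU run")
    plan = []
    for gpu in range(num_gpus):
        plan.append({
            "gpu": gpu,
            "core": gpu,  # assumption: core i drives GPU i
            "env": {"CUDA_VISIBLE_DEVICES": str(gpu)},
        })
    return plan


if __name__ == "__main__":
    for job in plan_gpu_jobs(4, num_cores=4):
        print(job)
```

With 4 GPUs and 4 cores this gives four entries and no core left over, which is where the caveats about system overhead come in.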
> I'm thinking of putting together a GPU system to run Amber. In this
> case, would a single quad core Xeon be enough for four (4) Fermi cards
> (GTX480, for example)? One core per card, in this case. Or should I
> really go for a 2:1 ratio and a dual quad-core cpu with fewer GPUs?
> Are there any other considerations for a multi-GPU hardware
> configuration?
Given the above, this would seem reasonable, with a couple of caveats. Firstly, if you run 4 GPU runs you will peg all 4 cores. If you only have 4 cores in the machine, this leaves nothing for managing system functions, network connections, various daemons, I/O etc. Thus you may see some degradation when the OS occasionally has to swap things in and out. I do not have precise numbers for what this effect would be, however.
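To make the headroom point concrete, here is a back-of-the-envelope comparison of the configurations under discussion. It only counts cores; it says nothing about how much slowdown a fully pegged machine actually shows, and the function name `spare_cores` is just illustrative:

```python
def spare_cores(num_cores, num_gpu_runs):
    """Cores left for the OS, daemons and I/O if each GPU run pegs one core."""
    busy = min(num_gpu_runs, num_cores)
    return num_cores - busy


if __name__ == "__main__":
    configs = {
        "single quad-core": 4,
        "single 6-core": 6,
        "dual quad-core": 8,
    }
    for name, cores in configs.items():
        print(f"{name}: {spare_cores(cores, 4)} spare core(s) with 4 GPU runs")
```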
A bigger caveat, though, is to look carefully at what the motherboard actually supports. A lot of single-socket motherboards probably do not have independent PCI-E 2.0 links. Hence I bet that if you put 4 cards in there it downgrades them all to x8 or maybe even x4 speed. This will cause BIG performance issues. A lot of the consumer gaming boards are like this. In that scenario it looks 'cool' to be able to stick 4 GPUs in, and most games upload a bunch of textures and then rarely change them, so they don't max out the PCI Express links. People therefore don't notice if the board doesn't really support x16 across all slots simultaneously.
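For a sense of scale, the theoretical numbers work out as below. This uses the standard PCI Express 2.0 figure of 500 MB/s per lane per direction (5 GT/s with 8b/10b encoding); real transfer rates will be lower, and how much a narrower link hurts a given run is something I have not measured:

```python
# PCIe 2.0: 5 GT/s per lane with 8b/10b encoding = 500 MB/s per lane, per direction
PCIE2_MB_PER_LANE = 500


def pcie2_bandwidth_gb(lanes):
    """Theoretical one-way PCIe 2.0 bandwidth in GB/s for a given link width."""
    return lanes * PCIE2_MB_PER_LANE / 1000


if __name__ == "__main__":
    for lanes in (16, 8, 4):
        print(f"x{lanes}: {pcie2_bandwidth_gb(lanes):.1f} GB/s per direction")
```

So a card that drops from x16 to x4 has a quarter of the host-to-GPU bandwidth. On Linux, the negotiated width of each slot can usually be read from the `LnkSta` field in the output of `lspci -vv`.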
Most dual-socket boards are 'server / workstation' class boards and are specifically designed with this in mind: the x16 channels are actually independent, so you don't get a slowdown when you add additional cards. This is probably more of a consideration than the number of cores, since you could address the core count by getting a single-socket board and sticking a 6-core chip in it.
If you send a message to me directly I can send you a specific design and quote for a 4 x C2050 machine that does not have the issues above. I plan to benchmark things on this and then put it up as an example on the GPU page in a week or so.
All the best
Ross
Ross Walker
Assistant Research Professor
San Diego Supercomputer Center
Tel: +1 858 822 0854 | EMail: ross.rosswalker.co.uk
http://www.rosswalker.co.uk | http://www.wmd-lab.org/
Note: Electronic Mail is not secure, has no guarantee of delivery, may not be read every day, and should not be used for urgent or sensitive issues.
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Wed May 26 2010 - 16:00:03 PDT