Ross, Scott,
thank you for the feedback. Clearly, any future MPI version should go
over QDR IB.
Can you help with the issue of local bandwidth? Nvidia sells a dual-host
PCI-E adapter card that effectively connects the 4 GPUs in an S2050 to a
single x16 slot. When pmemd.cuda is run locally (in parallel, or as four
serial processes), how much would this impact performance?
In other words, should we even consider these cards for a host system
intended to run pmemd.cuda?
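For rough context, here is a back-of-the-envelope sketch only (the ~8 GB/s
figure assumes a PCIe 2.0 x16 link, theoretical peak per direction, ignoring
protocol overhead; these numbers are my own assumptions, not from the thread).
Sharing one slot among four GPUs leaves each GPU with roughly x4-link bandwidth:

    # Per-GPU host bandwidth if four S2050 GPUs share one PCIe 2.0 x16 slot
    # (theoretical peak, one direction, no protocol overhead).
    PCIE2_X16_GBS = 8.0      # 16 lanes * 500 MB/s per lane at 5.0 GT/s (8b/10b)
    gpus_per_slot = 4
    per_gpu = PCIE2_X16_GBS / gpus_per_slot
    print(f"~{per_gpu:.1f} GB/s per GPU, i.e. roughly an x4 link each")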
Thanks again
Sasha
Thomas Zeiser wrote:
> On Sat, Jul 31, 2010 at 10:12:29AM -0700, Scott Le Grand wrote:
>
>> I'd definitely go for QDR between nodes.
>>
>> What's up in the air ATM is whether it's best to spread the
>> C2050s across as many nodes as possible or whether 2 or 4 C2050s
>> per node is the optimum configuration.
>>
>
> for QDR to work at full speed you need PCIe 2.0 x8 (not only
> mechanically but also electrically). Are there any boards that have
> four x16 slots and, in addition, more than just x4 electrically?
>
> At least on the typical Intel-based boards you get two x16 slots and
> one x4 slot when there is one chipset on the mainboard, or four x16
> and (several) x4 slots (x4 electrically, although their mechanical
> width is x8) when there are two chipsets.
>
> Thus, with these boards, the maximum feasible is full DDR speed
> using a DDR IB card that supports 5.0 GT/s (slightly cheaper DDR
> cards with 2.5 GT/s only operate at an effective speed similar to
> SDR ...)
>
>
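For what it's worth, a quick back-of-the-envelope on the PCIe point above
(theoretical peak rates, one direction, ignoring protocol overhead; the
figures are my own assumptions, not taken from the thread):

    # Why a QDR IB HCA wants PCIe 2.0 x8 electrically.
    qdr_4x_data_gbs = 40 * 0.8 / 8   # 40 Gbit/s signalling, 8b/10b -> 4 GB/s data
    pcie2_lane_gbs = 0.5             # 5.0 GT/s per lane, 8b/10b -> 500 MB/s
    for lanes in (4, 8):
        slot_gbs = lanes * pcie2_lane_gbs
        verdict = "enough" if slot_gbs >= qdr_4x_data_gbs else "not enough"
        print(f"PCIe 2.0 x{lanes}: {slot_gbs:.0f} GB/s -> {verdict} for QDR "
              f"(~{qdr_4x_data_gbs:.0f} GB/s)")

So an x4-electrical slot tops out around half of what a 4x QDR link can carry,
which matches the point above about needing x8 electrically.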