Hello All,
We have been testing PMEMD 3.1 on a 32-CPU cluster (16 dual-Athlon nodes)
connected by a gigabit switch. The performance we have been seeing (in
terms of scaling to larger numbers of CPUs) is a bit disappointing
compared to the figures released for PMEMD. For example, comparing
ps/day rates for the JAC benchmark (with the specified cutoff changes,
etc.) on our cluster (left column) against those presented for a 2.4GHz
Xeon cluster, also with a gigabit switch (right column), gives:
         athlon   xeon
 1 cpu:    108
 2 cpu:    172     234
 4 cpu:    239     408
 8 cpu:    360     771
16 cpu:    419    1005
32 cpu:    417
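The speedup and parallel efficiency implied by the ps/day figures above
are easy to derive; a quick sketch of that arithmetic (my own check in
Python, not part of any PMEMD or benchmark script, assuming ps/day is
directly proportional to wall-clock rate):

```python
# JAC benchmark ps/day rates on the Athlon cluster, as listed above.
rates = {1: 108, 2: 172, 4: 239, 8: 360, 16: 419, 32: 417}
base = rates[1]  # single-CPU rate used as the speedup baseline

for ncpu in sorted(rates):
    speedup = rates[ncpu] / base          # relative to 1 cpu
    efficiency = speedup / ncpu           # fraction of ideal linear scaling
    print(f"{ncpu:2d} cpu: speedup {speedup:.2f}, efficiency {efficiency:.0%}")
```

This gives a speedup of about 3.33 at 8 CPUs and 3.88 at 16, i.e. under
25% parallel efficiency at 16 CPUs, which is where our concern comes from.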
In general, in terms of wall-clock time, we only see a parallel speedup
(cf. 1 cpu) of about 3.3 at 8 CPUs and struggle to get much past 3.9
at higher CPU counts. The parallel scaling presented for other cluster
machines appears to be much better. Has anyone else achieved good
parallel speedup on Beowulf systems?
Also, we are using the Portland f90 compiler and LAM/MPI in our setup -
has anyone experienced problems with this compiler or MPI library in
combination with PMEMD?
Thanks in advance,
Stephen Titmuss
CSIRO Health Sciences and Nutrition
343 Royal Parade
Parkville, Vic. 3052
AUSTRALIA
Tel: +61 3 9662 7289
Fax: +61 3 9662 7347
Email: stephen.titmuss.csiro.au
www.csiro.au www.hsn.csiro.au
-----------------------------------------------------------------------
The AMBER Mail Reflector
To post, send mail to amber.scripps.edu
To unsubscribe, send "unsubscribe amber" to majordomo.scripps.edu
Received on Wed Jan 14 2004 - 15:53:11 PST