I just wanted to note, in light of other comments about performance with
eight single-threaded processes running in parallel on a 2x dual-quad box,
that I have a bare-bones MD program, written for water simulations, that
loses no efficiency when eight independent processes run on such a box
compared to a single process (I didn't write the code to parallelize). In
the real-space nonbonded calculation it was about 60%-80% faster than
PMEMD (PMEMD works more efficiently with smaller time steps, but my code's
performance is the same across all time steps). Once the cost of PME
electrostatics is included, the speedup was only about 30%.
My code does not use neighbor lists. Instead it relies on a quick
distance^2 calculation between water oxygens to determine what to compute,
so it is very light on the RAM bus. I have recommended to the AMBER
developers that they consider forgoing the neighbor lists for water
molecules to conserve cache (neighbor lists of protein atoms would be much
longer-lived anyway). With other specialized routines for 3-, 4-, or
5-point water:water interactions, I would still expect a 20% speedup
during typical simulations.
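
To make the idea concrete, here is a minimal sketch of a squared-distance
test between water oxygens. It is not my actual code and not anything from
PMEMD: the function names, the cubic periodic box, and the plain all-pairs
loop are assumptions made for illustration only. The point is that pairs
are screened on the fly with r^2 (no square root) and nothing is stored
between steps.

```python
def min_image(d, box):
    """Wrap a displacement component into the nearest periodic image.

    Assumes a cubic box of edge length `box` (illustrative assumption).
    """
    return d - box * round(d / box)


def close_oxygen_pairs(oxygens, box, cutoff):
    """Lazily yield (i, j, r2) for water oxygens closer than `cutoff`.

    `oxygens` is a sequence of (x, y, z) tuples, one per water molecule.
    The comparison uses squared distances, so no sqrt is needed, and the
    generator keeps no neighbor list in memory.
    """
    cutoff2 = cutoff * cutoff
    n = len(oxygens)
    for i in range(n - 1):
        xi, yi, zi = oxygens[i]
        for j in range(i + 1, n):
            xj, yj, zj = oxygens[j]
            dx = min_image(xi - xj, box)
            dy = min_image(yi - yj, box)
            dz = min_image(zi - zj, box)
            r2 = dx * dx + dy * dy + dz * dz
            if r2 < cutoff2:
                # A real code would evaluate the full water:water
                # site-site interactions here instead of yielding.
                yield i, j, r2
```

A caller would loop over `close_oxygen_pairs(...)` once per step and compute
the water:water interactions for each pair it yields. In practice the
cutoff on the oxygen distance would need some padding to cover the
hydrogen (or extra-point) sites, which this sketch leaves out.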
There may be other improvements that can be made in PMEMD based on these
midpoint methods and other parallel efficiency algorithms that will help
not only cluster scaling but also parallel efficiency on multi-core
processors.
Dave
-----------------------------------------------------------------------
Received on Wed Aug 15 2007 - 06:07:51 PDT