AMBER: Sander slower on 16 processors than 8

From: Sontum, Steve
Date: Thu, 22 Feb 2007 15:32:42 -0500

I have been trying to get decent scaling for AMBER calculations on our
cluster and keep running into bottlenecks. Any suggestions would be
appreciated. The following are benchmarks for factor_ix and jac on
1-16 processors using amber8 compiled with pgi 6.0, except for the lam
runs, which used pgi 6.2.



MPI             benchmark   processors:   1     2     4     8    16
mpich1 (1.2.7)  factor_ix               928   518   318   240   442
mpich2 (1.0.5)  factor_ix               938   506   262     *
mpich1 (1.2.7)  jac                     560   302   161   121   193
mpich2 (1.0.5)  jac                     554   294   151   111   181
lam (7.1.2)     jac                     516   264   142   118


* timed out after 3 hours
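
To make the scaling easier to compare, the timings above can be converted to speedup and parallel efficiency relative to the 1-process run. The following is only a minimal sketch in Python; it assumes every number is a wall-clock time in the same units, and the names TIMINGS and scaling are made up for illustration.

```python
# Timings copied from the benchmark list above, keyed by process count.
# Assumption: each number is a wall-clock time in the same (unstated) units.
TIMINGS = {
    ("mpich1 1.2.7", "factor_ix"): {1: 928, 2: 518, 4: 318, 8: 240, 16: 442},
    ("mpich2 1.0.5", "factor_ix"): {1: 938, 2: 506, 4: 262},  # 8-process run timed out
    ("mpich1 1.2.7", "jac"): {1: 560, 2: 302, 4: 161, 8: 121, 16: 193},
    ("mpich2 1.0.5", "jac"): {1: 554, 2: 294, 4: 151, 8: 111, 16: 181},
    ("lam 7.1.2", "jac"): {1: 516, 2: 264, 4: 142, 8: 118},
}


def scaling(times):
    """Return {nproc: (speedup, efficiency)} relative to the 1-process run."""
    t1 = times[1]
    return {n: (t1 / t, t1 / (t * n)) for n, t in sorted(times.items())}


if __name__ == "__main__":
    for (mpi, bench), times in TIMINGS.items():
        print(f"{mpi} {bench}")
        for n, (speedup, eff) in scaling(times).items():
            print(f"  {n:2d} procs: speedup {speedup:5.2f}, efficiency {eff:4.0%}")
```

For mpich1 factor_ix this works out to a speedup of about 3.9 on 8 processes but only about 2.1 on 16, which is the drop I am asking about.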


First off, is it unusual for the calculation to get slower with an
increased number of processes?

Does anyone have benchmarks for a similar cluster, so I can tell if
there is a problem with the configuration of our cluster? I would like
to be able to run on more than one or two nodes.



The 10 compute nodes use 2.0 GHz dual-core Opteron 270 chips with 4 GB
of memory and 1 MB of cache, Tyan 2881 motherboards, an HP ProCurve 2848
switch, and a single 1 Gb/sec Ethernet connection to each motherboard. The
master node is configured similarly but also has 2 TB of RAID storage
that is automounted by the compute nodes. We are running SuSE with the
2.6.5-7-276-smp kernel. Amber8 and mpich were compiled with pgi 6.0.


I have used ganglia to look at the nodes while a 16-process job is
running. The nodes are fully consumed by system CPU time. The user CPU
time is only 5%, and this node is only pushing 1.4 kBytes/sec out over
the network.
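
The user/system split can also be cross-checked directly on a compute node, independently of ganglia, by sampling the aggregate "cpu" line of /proc/stat twice and comparing the counters. This is only a rough sketch under the assumption that the node exposes the usual Linux /proc/stat layout; the function names are hypothetical and this is not how ganglia itself collects its numbers.

```python
import time

# Column order of the aggregate "cpu" line in /proc/stat on Linux.
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq")


def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of /proc/stat into a dict of tick counts."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    values = [int(v) for v in parts[1:]]
    # zip stops at the shorter sequence, so kernels that report fewer
    # columns still parse, and extra columns on newer kernels are ignored.
    return dict(zip(FIELDS, values))


def cpu_shares(before, after):
    """Return the fraction of elapsed CPU time spent in each state."""
    deltas = {k: after[k] - before[k] for k in after if k in before}
    total = sum(deltas.values())
    if total == 0:
        return {k: 0.0 for k in deltas}
    return {k: d / total for k, d in deltas.items()}


def sample(interval=5.0, path="/proc/stat"):
    """Read /proc/stat twice, `interval` seconds apart, and return the shares."""
    with open(path) as f:
        first = parse_cpu_line(f.readline())
    time.sleep(interval)
    with open(path) as f:
        second = parse_cpu_line(f.readline())
    return cpu_shares(first, second)
```

Calling sample() on a node while the 16-process job is running and looking at the "user" and "system" entries should show whether the time really is going into the kernel rather than into sander.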




Stephen F. Sontum
Professor of Chemistry and Biochemistry
phone: 802-443-5445

Received on Sun Feb 25 2007 - 06:07:23 PST