> 	I am attaching some unbelievable results to this mail. These are
> the benchmark results (mdout) from amber6/test/dhfr on an Alpha ES40. Each of
> the four processors takes much more time than a single 700 MHz Linux
> processor! The numbers also differ slightly. I have also attached the
> MACHINE file I used and the logfile. I just cannot believe these results!
> Can anyone let me know how I can increase the efficiency, please?
Are you sure the machine was dedicated, i.e. that no other jobs were
running on the nodes at the time you performed the benchmarks?  The time
reported by sander is, I believe, now wallclock time, so any competing
load shows up directly in the timings.
I used a similar MACHINE file on our local ES40 (+ Quadrics), which
differed only in adding the -lfmpi -lelan libraries to the LOADLIB and
changing the CPP to
setenv CPP "/lib/cpp -C -DLANGUAGE_FORTRAN "
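In other words, the only Quadrics-specific change to the libraries is
roughly this (a sketch; the "..." stands for whatever your LOADLIB
already contains):

setenv LOADLIB "... -lfmpi -lelan"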
However, I don't think this would or could cause the slowdown.  Are you
sure it is an ES40?  Are you sure it ran on the real compute nodes and not
the front-end?  (On our machine we start the job with prun, e.g. "prun -n 4
./sander ...", and check its status with rinfo; see the sketch below.)
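A minimal launch sketch, assuming the usual sander arguments (substitute
your own input and output file names):

  prun -n 4 ./sander -O -i mdin -o mdout -p prmtop -c inpcrd
  rinfo    # confirm the job is on the compute nodes, not the front-end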
The times I see are appended (along with timings from some other machines).
Thomas E. Cheatham, III                
Department of Medicinal Chemistry  &  Center for High Performance Computing
University of Utah                    INSCC 418
30 South 2000 East, Room 201          155 South 1452 East
Salt Lake City, Utah 84112-5820       Salt Lake City, Utah 84112-0190
http://www.chpc.utah.edu/~cheatham
phone: (801) 587-9652                 FAX: (801) 585-5366		  
-----------
(Note: your mileage may vary; timings are with various versions of AMBER
6.0 and/or development versions)
                                  nonsetup (s)  speedup  efficiency
Compaq ES40 + Quadrics
(sierra)
                  32 nodes              16.66    7.08       22.1%
                  16 nodes              18.48    6.39       39.9%
                   8 nodes              24.18    4.88       61.0%
                   4 nodes              36.02    3.28       82.0%
                   2 nodes              64.92    1.82       91.0%
                   1 node              118.02
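(Speedup here is the 1-node nonsetup time divided by the N-node time, and
efficiency is that speedup divided by N; e.g. for the ES40 on 4 nodes,
118.02 / 36.02 = 3.28, and 3.28 / 4 = 82%.)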
SP-2
IBM SP-2 at CHPC, 120 MHz
                   8 nodes             109.4     5.81       72.6%
                   4 nodes             177.9     3.57       89.0%
                   2 nodes             331.7     1.92       96.0%
                   1 node              635.6
IBM SP-2 at CHPC, 160 MHz
                   8 nodes              77.4     6.27       78.3%
                   4 nodes             136.7     3.55       88.8%
                   2 nodes             254.1     1.91       95.5%
                   1 node              485.5
icebox.chpc.utah.edu
Athlon 950 MHz, 100BaseT interconnect
                  16 nodes              88.4     2.55       15.9%
                   8 nodes              95.5     2.36       29.5%
                   4 nodes             125.3     1.80       45.0%
                   2 nodes             166.8     1.35       67.5%
                   1 node              225.3
Athlon 950 MHz, Giganet interconnect
                  16 nodes              35.9     6.26       39.1%
                   8 nodes              46.9     4.79       59.9%
                   4 nodes              73.8     3.05       76.2%
                   2 nodes             125.8     1.79       89.5%
                   1 node              225.0
1 GHz Athlon       1 node              227.1
1.2 GHz Athlon     1 node              178.0
1.33 GHz Athlon    1 node              165.2
s700 nodes, old a7 (700 MHz Athlon, mpich)
                   8 nodes             108.86
                   4 nodes             152.82
                   2 nodes             203.20
                   1 node              288.15
s700 nodes, old SOFTWARE VIA (700 MHz Athlon, mpich + software VIA)
                   8 nodes              91.00
                   4 nodes             128.25
                   2 nodes             183.31
                   1 node              290.65