Re: amber on AMDs cluster from Cezary Czaplewski on 2000-06-14 (Amber Archive Jun 2000)

From: Cezary Czaplewski <czarek_at_scheraga2.chem.cornell.edu>
Date: Wed 14 Jun 2000 12:33:39 -0400 (EDT)

On Wed, 14 Jun 2000, Xavier Deupi wrote:

> So our questions are:
> 1) Do people usually use PVM or MPI?

I am using MPI on all parallel platforms: SGI, Cray T3E, IBM SP2, and
recently Linux Beowulf clusters.

> 2) Using MPI, what are the usual scaling rates, going from 1 to 2 nodes?

See the benchmark tables below.

> 3) Anyone has tested this kind of computation on AMD K6 ?

I think you mean AMD K7? I haven't tried it in parallel, but a serial job is
faster than on Intel: PROWAT takes 103 s on a K7 650 MHz and 137 s on a
PIII Coppermine 650 MHz.

> 4) Anything against the following command line of mpi :
>        "mpirun -v -c2c -c 2 -w -O sander input_file..." ??

I am using MPICH, not LAM-MPI, so I cannot help you with this.

> 5) Any suggestions ???

I think you should try NetPIPE or a similar benchmark to test your network.
On our Fast Ethernet I get up to 90 Mbit/s for TCP and up to 70 Mbit/s with
MPICH on top of TCP for message sizes larger than 16 kbytes. The latency is
around 150 us (microseconds).
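
To see why both numbers matter, here is a rough sketch of the usual
"latency plus size over bandwidth" estimate for a single message. This is
a simplified model, not what NetPIPE itself computes; the function names
are made up for illustration, and the latency in the example loop is only
a placeholder that you should replace with the value you measure.

```python
def transfer_time(message_bytes, bandwidth_mbit_s, latency_s):
    """Estimated one-way time in seconds: fixed latency plus size / bandwidth."""
    bits = message_bytes * 8
    return latency_s + bits / (bandwidth_mbit_s * 1e6)


def effective_bandwidth(message_bytes, bandwidth_mbit_s, latency_s):
    """Effective throughput in Mbit/s once the latency is included."""
    seconds = transfer_time(message_bytes, bandwidth_mbit_s, latency_s)
    return message_bytes * 8 / seconds / 1e6


if __name__ == "__main__":
    latency = 1e-4  # hypothetical placeholder, substitute your measured latency
    peak = 70.0     # Mbit/s, the MPICH-over-TCP figure quoted above
    for size in (64, 1024, 16 * 1024, 256 * 1024):
        rate = effective_bandwidth(size, peak, latency)
        print(f"{size:8d} bytes  {rate:6.1f} Mbit/s")
```

Small messages are dominated by the latency term, so the effective rate
only approaches the peak for large messages.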

Some time ago I ran the PROWAT benchmark on three PII 350 MHz machines
connected by Fast Ethernet:

PII 350 MHz, 512 KB L2 cache (100 MHz FSB)

        nonsetup   %cpu on   commun.
#proc   time [s]   master    time [s]

    1      352        99       0
    2      177        95       2.1
    3      125        92       2.8

I also have data for a Celeron 433 MHz cluster:

Celeron 433 MHz, 128 KB L2 cache (66 MHz FSB)

        nonsetup   %cpu on   commun.
#proc   time [s]   master    time [s]

    1      263       100       0.0
    2      163        87       6.7
    4       84        89       3.1
    6       66        81       3.3
    8       54        75       3.3
   10       51        72       3.3
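
One way to put a single number on where the Celeron scaling levels off is
the Karp-Flatt metric, which estimates the non-parallel fraction from the
measured speedup. This is my own choice of analysis, not something the
benchmark reports; the sketch below just applies the standard formula
e = (1/s - 1/p) / (1 - 1/p) to the times in the table, and the names are
made up for illustration.

```python
# (processes, nonsetup time in seconds) copied from the Celeron table above
CELERON_RUNS = [(1, 263), (2, 163), (4, 84), (6, 66), (8, 54), (10, 51)]


def karp_flatt(serial_time, parallel_time, procs):
    """Experimentally determined serial fraction for a run on procs > 1."""
    speedup = serial_time / parallel_time
    return (1.0 / speedup - 1.0 / procs) / (1.0 - 1.0 / procs)


def serial_fractions(runs):
    """Return (procs, serial fraction) for every multi-process run."""
    serial_time = next(t for p, t in runs if p == 1)
    return [(p, karp_flatt(serial_time, t, p)) for p, t in runs if p > 1]


if __name__ == "__main__":
    for procs, fraction in serial_fractions(CELERON_RUNS):
        print(f"{procs:3d} procs  serial fraction {fraction:.3f}")
```

A fraction that grows with the number of processes suggests the overhead
(communication, load imbalance) is increasing rather than staying fixed.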

Recently I repeated the PROWAT benchmark on our new dual PIII Coppermine
650 MHz Beowulf cluster, also connected by Fast Ethernet:

PIII Coppermine 650 MHz, 256 KB L2 cache (100 MHz FSB)

                 nonsetup   %cpu on   commun.
#proc   #nodes   time [s]   master    time [s]

    1       1       137       100       0.0
    2       2        98        95       1.8
    2       1        92        98       1.7
    4       4        59        85       3.3
    4       2        56        91       2.4
    6       6        48        74       3.7
    6       3        44        79       3.5
    8       8        53        56       5.7
    8       4        37        74       3.2
   10      10        38        69       4.8
   10       5        36        67       4.3
   12      12        35        66       4.7
   12       6        34        62       3.9
   14      14        33        57       4.5
   14       7        31        60       4.0
   16       8        29        56       3.9

As you can see, the scaling is not perfect, but it is not so bad,
especially up to 8 processors on 4 nodes.
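
If you want the scaling as numbers rather than raw times, a small script
like the following converts the PIII table into speedup and parallel
efficiency relative to the 1-process run. It is only a sketch; the data
are copied from the table above and the names are made up for
illustration.

```python
# (processes, nodes, nonsetup time in seconds) from the PIII Coppermine table
PIII_RUNS = [
    (1, 1, 137), (2, 2, 98), (2, 1, 92), (4, 4, 59), (4, 2, 56),
    (6, 6, 48), (6, 3, 44), (8, 8, 53), (8, 4, 37), (10, 10, 38),
    (10, 5, 36), (12, 12, 35), (12, 6, 34), (14, 14, 33), (14, 7, 31),
    (16, 8, 29),
]


def scaling(runs):
    """Return (procs, nodes, speedup, efficiency) relative to the 1-process run."""
    serial_time = next(t for p, n, t in runs if p == 1)
    rows = []
    for procs, nodes, seconds in runs:
        speedup = serial_time / seconds
        rows.append((procs, nodes, speedup, speedup / procs))
    return rows


if __name__ == "__main__":
    print("#proc #nodes  speedup  efficiency")
    for procs, nodes, speedup, efficiency in scaling(PIII_RUNS):
        print(f"{procs:5d} {nodes:6d} {speedup:8.2f} {efficiency:11.2f}")
```

For example, 8 processes on 4 nodes give a speedup of 137/37, about 3.7,
over the serial run.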

A few more details: PROWAT is an Amber 4.1 benchmark: plastocyanin in water
(11585 atoms) with cut=12.0. Sander was compiled with g77 2.95.2 and MPICH
1.2.0, with the following options:

-fomit-frame-pointer -malign-double -fcaller-saves -fno-f2c
-march=pentiumpro -funix-intrinsics-hide -O6

                                czarek
Received on Wed Jun 14 2000 - 09:33:39 PDT