Re: AMD opteron

From: Tru Huynh <tru.pasteur.fr>
Date: Wed, 28 May 2003 18:52:18 +0200

Hi,

Bad practice, but here is a follow-up with dual-CPU performance,
using LAM-MPI version 6.5.9, configured with:
../configure --prefix=/home/tru/lam-6.5.9_gnu-rh --with-rsh=ssh --without-romio

MACHINE file
<cut>
setenv LAM_HOME /home/tru/lam-6.5.9_gnu-rh
setenv LAM_INCLUDE $LAM_HOME/include
setenv LAM_LIBDIR $LAM_HOME/lib
setenv MACHINE "linux/FreeBSD/Windows PC/Mac OSX"
setenv MACH Linux
setenv MACHINEFLAGS " -DMPI "
setenv CPP "cpp -traditional "
setenv CPP "/lib/cpp -traditional -I$LAM_INCLUDE"
setenv CC "hcc -m64 -O2"
setenv SYSDIR Machines/standard
setenv LOAD "hf77 -O2 -m64"
setenv LOADLIB "-lm"
setenv G77_COMPAT "-Wno-globals -fno-globals -ff90 -funix-intrinsics-hide"
setenv G77_OPT "-O2 -m64"
setenv L0 "hf77 -c -g $G77_COMPAT"
setenv L1 "hf77 -c $G77_OPT $G77_COMPAT"
setenv L2 "hf77 -c $G77_OPT $G77_COMPAT"
setenv L3 "hf77 -c $G77_OPT $G77_COMPAT"
setenv RANLIB ranlib
<cut>

> Joint CHARMM/AMBER benchmarks:

....
>
> with system gcc-3.2.2
> /home/tru/bin/sander_amber7-gnu_rh
>
> real 17m32.467s
> user 17m32.050s
> sys 0m0.140s
>
....
> ==> jac-amber7.sander_amber7-gnu_rh.20030528_11h46.opteron242.out <==
>
> | Setup wallclock 1 seconds
> | Nonsetup wallclock 1051 seconds
>
>

With 2 CPUs:

real 9m49.572s
user 0m0.000s
sys 0m0.010s

| Setup wallclock 1 seconds
| Nonsetup wallclock 588 seconds
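For anyone comparing the two runs, here is a minimal sketch of the usual speedup and parallel-efficiency arithmetic, using only the two nonsetup wallclock figures quoted in this thread (1051 s on one CPU, 588 s on two). The function and variable names are my own, and this assumes both runs did the same amount of work:

```python
# Nonsetup wallclock times reported by sander in this thread (seconds).
t_serial = 1051.0    # 1 CPU run (quoted above)
t_parallel = 588.0   # 2 CPU run with LAM-MPI
n_cpus = 2

def speedup(t1, tn):
    """Ratio of single-CPU time to parallel time."""
    return t1 / tn

def efficiency(t1, tn, n):
    """Speedup divided by the number of CPUs (1.0 would be ideal scaling)."""
    return speedup(t1, tn) / n

s = speedup(t_serial, t_parallel)
e = efficiency(t_serial, t_parallel, n_cpus)
print(f"speedup    = {s:.2f}x")
print(f"efficiency = {e:.1%}")
```

With these numbers the speedup comes out at roughly 1.79x, i.e. a parallel efficiency of about 89% on the two CPUs.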

And the profile_mpi:

|>>>>>>>>Printed as average time (min,max,sd) >>>>>>>>>
| Read coords time        0.14 (  0.00   0.28 0.14)
| Ewald setup time        0.06 (  0.06   0.06 0.00)
| Check list validity     3.25 (  3.24   3.25 0.00)
| Map frac coords         0.22 (  0.22   0.22 0.00)
| Grid unit cell          0.06 (  0.06   0.06 0.00)
| Grid image cell         0.07 (  0.06   0.07 0.00)
| Build the list         25.66 ( 25.56  25.75 0.10)
| Other                   0.12 (  0.02   0.22 0.10)
| List time              29.43 ( 29.43  29.43 0.00)
| Direct Ewald time     288.02 (286.73 289.30 1.28)
| Adjust Ewald time       2.71 (  2.67   2.75 0.04)
| Finish NB virial        1.56 (  1.48   1.64 0.08)
| Fill Bspline coeffs     5.32 (  5.27   5.37 0.05)
| Fill charge grid       58.19 ( 58.18  58.21 0.01)
| Scalar sum             26.63 ( 26.62  26.64 0.01)
| Grad sum              102.74 (102.63 102.84 0.11)
| FFT communication ti   21.05 ( 21.04  21.05 0.01)
| Other                  29.70 ( 29.63  29.77 0.07)
| FFT time               50.75 ( 50.67  50.83 0.08)
| Other                   0.38 (  0.27   0.49 0.11)
| Recip Ewald time      244.01 (244.00 244.01 0.01)
| Ewald MPI wait          0.11 (  0.04   0.19 0.08)
| Other                   2.99 (  1.66   4.32 1.33)
| Ewald time            539.40 (539.39 539.41 0.01)
| Nonbond force         568.83 (568.83 568.84 0.01)
| Bond energy             0.11 (  0.11   0.11 0.00)
| Angle energy            0.76 (  0.76   0.76 0.00)
| Dihedral energy         3.66 (  3.63   3.70 0.03)
| FRC Collect time        3.75 (  3.74   3.75 0.00)
| Other                   1.05 (  1.01   1.08 0.04)
| Force time            578.17 (578.15 578.18 0.01)
| Shake time              2.60 (  2.49   2.71 0.11)
| Verlet update time      2.79 (  2.68   2.90 0.11)
| CRD distribute time     3.36 (  3.36   3.37 0.01)
| Other                   0.25 (  0.11   0.39 0.14)
| Runmd Time            587.17 (587.03 587.31 0.14)
| Other                   1.62 (  1.48   1.76 0.14)
| Total time            588.93 (588.80 589.07 0.14)
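
To see where the time goes in a profile like the one above, here is a rough sketch that parses a few of the profile lines and prints each entry as a share of "Total time". The regular expression, the function name and the SAMPLE excerpt are my own assumptions about the layout (label, average, then min/max/sd in parentheses); this is not part of sander or LAM:

```python
import re

# Matches lines such as "| Direct Ewald time 288.02 ( 286.73 289.30 1.28)".
# The layout is assumed from the profile_mpi excerpt in this post.
LINE_RE = re.compile(
    r"^\|\s*(?P<label>.+?)\s+(?P<avg>\d+\.\d+)\s*"
    r"\(\s*(?P<min>\d+\.\d+)\s+(?P<max>\d+\.\d+)\s+(?P<sd>\d+\.\d+)\s*\)\s*$"
)

def parse_profile(text):
    """Return a list of (label, avg, min, max, sd) tuples, skipping other lines."""
    rows = []
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if m:
            rows.append((m["label"], float(m["avg"]), float(m["min"]),
                         float(m["max"]), float(m["sd"])))
    return rows

# A few lines copied from the profile above.
SAMPLE = """\
|>>>>>>>>Printed as average time (min,max,sd) >>>>>>>>>
| Direct Ewald time 288.02 ( 286.73 289.30 1.28)
| Recip Ewald time 244.01 ( 244.00 244.01 0.01)
| FFT communication ti 21.05 ( 21.04 21.05 0.01)
| Total time 588.93 ( 588.80 589.07 0.14)
"""

rows = parse_profile(SAMPLE)
total = next(avg for label, avg, *_ in rows if label == "Total time")
# Kept as a list because labels such as "Other" repeat in the full profile.
shares = [(label, avg / total) for label, avg, *_ in rows]
for label, share in shares:
    print(f"{label:<22s} {share:6.1%}")
```

On this excerpt, the direct and reciprocal Ewald parts come out at roughly 49% and 41% of the total time, with FFT communication under 4%.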

-- 
Dr Tru Huynh          | http://www.pasteur.fr/recherche/unites/Binfs/
mailto:tru.pasteur.fr | tel/fax +33 1 45 68 87 37/19
Institut Pasteur, 25-28 rue du Docteur Roux, 75724 Paris CEDEX 15 France  
Received on Wed May 28 2003 - 18:53:01 PDT