Re: AMBER: PMEMD Performance on Beowulf systems

From: Viktor Hornak <hornak.csb.sunysb.edu>
Date: Mon, 22 Dec 2003 08:47:42 -0500

A7M266-D dual Socket A motherboards (AMD 762 chipset). Each board has three
32-bit/33 MHz PCI slots and two 64/32-bit, 66/33 MHz PCI slots. To get a
noticeable speedup in networking, the gigabit card (Intel PRO/1000) needs to
be placed in one of the 64-bit/66 MHz PCI slots.
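
If it is not obvious which slot a card ended up in, something like the sketch
below can help. It is only a rough check, assuming the pciutils `lspci -vv`
output I am used to, where conventional-PCI devices flag 66 MHz capability as
"66MHz+" in their Status line; there is no equally simple flag for 64-bit
width, so for that you still need the motherboard manual.

#!/usr/bin/env python3
"""Rough check of which PCI device each Ethernet controller is and whether
lspci reports the conventional-PCI 66MHz capability bit for it.  This only
parses the text output of `lspci -vv`, so treat it as a hint, not gospel."""
import re
import subprocess

def ethernet_devices():
    """Yield (slot, description, reports_66mhz) for each Ethernet controller."""
    out = subprocess.run(["lspci", "-vv"], capture_output=True,
                         text=True, check=True).stdout
    # lspci -vv separates devices with blank lines; the first line of each
    # block looks like "02:04.0 Ethernet controller: Intel Corporation ..."
    for block in out.split("\n\n"):
        lines = block.splitlines()
        if not lines:
            continue
        m = re.match(r"([0-9a-f:.]+)\s+Ethernet controller:\s*(.+)", lines[0])
        if m:
            # Conventional PCI devices show "66MHz+" (capable) or "66MHz-"
            # in their Status line.
            yield m.group(1), m.group(2), ("66MHz+" in block)

if __name__ == "__main__":
    for slot, desc, ok66 in ethernet_devices():
        note = "66MHz-capable" if ok66 else "33MHz-only (or not reported)"
        print("{}  {}  ->  {}".format(slot, desc, note))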

Hope this helps,
-Viktor Hornak
Stony Brook University


Aldo Jongejan wrote:

>Hi,
>
> What kind of motherboards are we talking about?!
>
> aldo
>
> Carlos Simmerling wrote:
> >
> > We had gigabit networking on both our dual Athlons (1.6 GHz)
> > and our dual Xeons. Scaling was much worse on the Athlons
> > until we found that moving the network cards (Intel) to a
> > different slot made a huge difference on the Athlon motherboards.
> > You should check what the PCI bandwidth is on each slot;
> > for us the slots were not the same.
> > Carlos
> >
> > ----- Original Message -----
> > From: "Robert Duke" <rduke.email.unc.edu>
> > To: <amber.scripps.edu>
> > Sent: Thursday, December 18, 2003 11:35 PM
> > Subject: Re: AMBER: PMEMD Performance on Beowulf systems
> >
> > > Stephen -
> > > Several points -
> > >
> > > 1) Gigabit ethernet is not particularly good for scaling. The numbers
> > > I published were on IBM blade clusters that had no other load on them,
> > > and the gigabit interconnect was isolated from other net traffic. If
> > > you split across switches or have other things going on (i.e., other
> > > jobs running anywhere on machines on the interconnect), performance
> > > tends to really drop. This is all you can expect from such a slow
> > > interconnect. A real killer for dual Athlons is not taking advantage
> > > of the dual processors; typically, if you have gigabit ethernet you
> > > will get better performance through shared memory, and if one of the
> > > CPUs is being used for something else, you can't do this.
> > >
> > > 2) LAM MPI in my hands is slower than MPICH, by around 10% if I
> > > recollect, though without extensive testing (i.e., I probably only did
> > > the check on some Athlons with a slow interconnect, but inferred that
> > > LAM was not necessarily an improvement). Taking this into account,
> > > your Xeon numbers are really not very different from mine (you are
> > > roughly 10% better at 8 CPUs and 20% worse at 16 CPUs).
> > >
> > > 3) Our 1.6 GHz Athlons are slower than our 2.4 GHz Xeons. I like the
> > > Athlons, but the Xeons can take advantage of vectorizing SSE2
> > > instructions. I don't know what your Athlons are, but I am not
> > > surprised they are slower. Why they are scaling so badly, I would
> > > suspect to be loading, configuration, net cards, motherboards, or
> > > heaven only knows. Lots of things can be slow (back to item 1).
> > >
> > > 4) I don't use the Portland Group compilers at all because I had
> > > problems with them a couple of years ago, and the company did
> > > absolutely nothing to help. It looked like floating point register
> > > issues. That is probably no longer the case, but the point is that I
> > > don't know what performance one would expect. My numbers are from the
> > > Intel Fortran compiler. There could also be issues with how LAM was
> > > built, or with MPICH if you change to that.
> > >
> > > You really have to bear in mind that with gigabit ethernet you are at
> > > the absolute bottom of reasonable interconnects for this type of
> > > system, and it does not take much at all for numbers to be twofold
> > > worse than the ones I published. My numbers are for isolated systems,
> > > good hardware, with the MPI build carefully checked out, and with
> > > PMEMD built with ifc, which is also well checked out.
> > >
> > > Regards - Bob Duke
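
Regarding Bob's point 1: a concrete way to use both CPUs of each dual Athlon
node (so the on-node pair of MPI processes can talk through shared memory
rather than over the gigabit card) is to list every host twice in the MPI
machinefile. Below is a minimal sketch. The "host:2" syntax is MPICH (ch_p4)
style, the node names are made up, and whether the on-node pair really uses
shared memory depends on how your MPICH (or LAM) was built, so adjust for
your own setup.

#!/usr/bin/env python3
# Minimal sketch: start two PMEMD processes per dual-CPU node so that the
# on-node pair can communicate through shared memory (assumes an MPICH build
# configured for shared-memory communication; host names are hypothetical).
import subprocess

nodes = ["node01", "node02", "node03", "node04"]   # made-up host names
procs_per_node = 2                                  # both CPUs of a dual Athlon

# MPICH (ch_p4) machinefile syntax: one "hostname:nprocs" entry per line.
with open("machines", "w") as f:
    for node in nodes:
        f.write("%s:%d\n" % (node, procs_per_node))

nproc = len(nodes) * procs_per_node
# Adjust the mpirun path/flags and the pmemd command line for your site.
subprocess.run(["mpirun", "-np", str(nproc), "-machinefile", "machines",
                "pmemd", "-O", "-i", "mdin", "-o", "mdout",
                "-p", "prmtop", "-c", "inpcrd"], check=True)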
> > >
> > > ----- Original Message -----
> > > From: <Stephen.Titmuss.csiro.au>
> > > To: <amber.scripps.edu>
> > > Sent: Thursday, December 18, 2003 10:19 PM
> > > Subject: AMBER: PMEMD Performance on Beowulf systems
> > >
> > >
> > > > Hello All,
> > > >
> > > > We have been testing PMEMD 3.1 on a 32-CPU cluster (16 dual Athlon
> > > > nodes) with a gigabit switch. The performance we have been seeing
> > > > (in terms of scaling to larger numbers of CPUs) is a bit
> > > > disappointing when compared to the figures released for PMEMD. For
> > > > example, comparing ps/day rates for the JAC benchmark (with the
> > > > specified cutoff changes, etc.) on our cluster (left column) and
> > > > those presented for a 2.4 GHz Xeon cluster, also with a gigabit
> > > > switch (right column), gives:
> > > >
> > > >           Athlon    Xeon
> > > >  1 cpu:      108
> > > >  2 cpu:      172     234
> > > >  4 cpu:      239     408
> > > >  8 cpu:      360     771
> > > > 16 cpu:      419    1005
> > > > 32 cpu:      417
> > > >
> > > > In general, in terms of wall clock time, we only see a parallel
> > > > speedup (relative to 1 CPU) of about 3.3 at 8 CPUs and struggle to
> > > > get much past 3.9 going to higher numbers of CPUs. The parallel
> > > > scaling presented for other cluster machines appears to be much
> > > > better. Has anyone else achieved good parallel speedup on Beowulf
> > > > systems?
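
Just to spell out where those figures come from: speedup at n CPUs is the
ps/day rate at n CPUs divided by the 1-CPU rate, and parallel efficiency is
that speedup divided by n. A quick sketch using the Athlon column from the
table above:

# Parallel speedup and efficiency from the Athlon ps/day rates quoted above.
# speedup(n) = rate(n) / rate(1); efficiency(n) = speedup(n) / n.
athlon_ps_per_day = {1: 108, 2: 172, 4: 239, 8: 360, 16: 419, 32: 417}

baseline = athlon_ps_per_day[1]
for ncpu in sorted(athlon_ps_per_day):
    rate = athlon_ps_per_day[ncpu]
    speedup = rate / float(baseline)
    efficiency = speedup / ncpu
    print("%2d cpu: %4d ps/day  speedup %.2f  efficiency %3.0f%%"
          % (ncpu, rate, speedup, efficiency * 100))

This reproduces the roughly 3.3x speedup at 8 CPUs and the ceiling just
under 3.9x at 16 and 32 CPUs mentioned above.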
> > > >
> > > > Also, we are using the Portland f90 compiler and LAM in our setup;
> > > > has anyone experienced problems with this compiler or MPI library
> > > > with PMEMD?
> > > >
> > > > Thanks in advance,
> > > >
> > > > Stephen Titmuss
> > > >
> > > > CSIRO Health Sciences and Nutrition
> > > > 343 Royal Parade
> > > > Parkville, Vic. 3052
> > > > AUSTRALIA
> > > >
> > > > Tel: +61 3 9662 7289
> > > > Fax: +61 3 9662 7347
> > > > Email: stephen.titmuss.csiro.au
> > > > www.csiro.au www.hsn.csiro.au
> > > >
> > > >
>
>###########################################
>
>Aldo Jongejan
>Molecular Modeling Group
>Dept. of Pharmacochemistry
>Free University of Amsterdam
>De Boelelaan 1083
>1081 HV Amsterdam
>The Netherlands
>
>e-mail: jongejan.few.vu.nl
>tlf: +31 (0)20 4447612
>fax: +31 (0)20 4447610
>
>###########################################
>
>



-----------------------------------------------------------------------
The AMBER Mail Reflector
To post, send mail to amber.scripps.edu
To unsubscribe, send "unsubscribe amber" to majordomo.scripps.edu
Received on Wed Jan 14 2004 - 15:53:11 PST