Mu -
A lot depends on the nature of the interconnect and its load. If it is
something like gigabit ethernet, it could be slow, but at 8 nodes without
other significant load on the interconnect I see something like 85% scaling
for a larger problem (91K atoms) and something like 82% scaling for a
smaller problem (23.5K atoms), both constant pressure, which is the worst
case. Almost everything else scales in this range or better. Lemieux at
PSC scaled at 83% on 8 processors for the large problem (Quadrics
interconnect, but under load from the other 2,992 processors I wasn't using;
the blade gigabit ethernet results were done standalone; otherwise they are
not particularly reproducible).

Does the mdout indicate that the CIT code paths are being used (vs. the
slower sander code paths)? Perhaps there are mpich config problems?
Significant changes to the machine file? Perhaps there is a huge load on
the interconnect from other sources? What do sander 6/7 do? What about
other parallel software? Given the info at hand, all I can tell you is
that that kind of scaling is far worse than what you should get from a
reasonably configured system, correctly built software, and an
interconnect under reasonable load.
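
For reference, those scaling numbers are just parallel efficiency computed
from wall-clock times; here is a minimal sketch (Python), with hypothetical
timings standing in for the numbers reported in the timing summary at the
end of mdout:

    # Parallel efficiency relative to a single-process run:
    #     efficiency = T(1) / (N * T(N))
    # The times below are hypothetical placeholders; substitute the
    # wall-clock times from your own mdout timing summaries.
    t_serial = 1000.0      # seconds on 1 processor
    t_parallel = 147.0     # seconds on 8 processors
    nproc = 8
    efficiency = t_serial / (nproc * t_parallel)
    print("scaling on %d processors: %.0f%%" % (nproc, 100.0 * efficiency))

By that measure, dropping to 25% at 8 processors is far below the 82-85%
range above.
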
Regards - Bob
----- Original Message -----
From: "Mu Yuguang (Dr)" <YGMu.ntu.edu.sg>
To: <amber.scripps.edu>
Sent: Thursday, October 16, 2003 8:40 PM
Subject: RE: AMBER: tru64 alpha
> Thanks David, Bill and Rob for your helpful reply.
> Now I have tried compiling PMEMD with a slightly changed machine file,
> using mpif90 and mpicc, and then submitting it with the corresponding
> mpirun.
> It works well on one node with 4 CPUs, with scaling up to 92%, but the
> scaling drops to 25% using 2 nodes with 8 CPUs.
> My system is an 18-mer duplex DNA with a total of 56,999 atoms, using PME.
>
> The inter-node connections should be a little better than Myrinet, and
> the MPI here is mpich-1.2.5.
> I am not sure whether the scaling failure is due to mpich or to something
> else.
>
>
> -----Original Message-----
> From: Bill Ross [mailto:ross.cgl.ucsf.edu]
> Sent: Wednesday, October 15, 2003 10:39 PM
> To: amber.scripps.edu
> Subject: RE: AMBER: tru64 alpha
>
> > FATAL dynamic memory allocation error in subroutine alloc_ew_dat_mem
> > Could not allocate ipairs array!
>
> In unix,
>
> % man ulimit
>
> Bill Ross
>
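
On the earlier ulimit point: that allocation failure generally means a
per-process resource limit is too small for the run. A minimal, purely
illustrative sketch (Python on Linux/most Unixes, not part of pmemd) for
checking the limits that "man ulimit" describes:

    # Illustrative only.  Prints the per-process limits that most often
    # cause large allocations (like the ipairs array) to fail when their
    # soft limits are set low.
    import resource

    limits = [("data seg size", resource.RLIMIT_DATA),
              ("stack size", resource.RLIMIT_STACK),
              ("address space", resource.RLIMIT_AS)]

    def show(value):
        return "unlimited" if value == resource.RLIM_INFINITY else str(value)

    for name, which in limits:
        soft, hard = resource.getrlimit(which)
        print("%-14s soft=%s  hard=%s" % (name, show(soft), show(hard)))
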
-----------------------------------------------------------------------
The AMBER Mail Reflector
To post, send mail to amber.scripps.edu
To unsubscribe, send "unsubscribe amber" to majordomo.scripps.edu