Hi Jason,
Thanks for the reply. What is strange to me is that the cluster is a brand-new Tesla cluster with an InfiniBand interconnect, and the compilation was carried out using the new mvapich2 library. I'm aware of all these issues, like NVT vs. NPT and PMEMD vs. sander. Although the previous cluster was old and its interconnect was Ethernet, the scaling was much better. If you have any clues about the most likely source of the problem in such cases, I would appreciate your help.
Cheers
________________________________
 From: Jason Swails <jason.swails.gmail.com>
To: AMBER Mailing List <amber.ambermd.org> 
Sent: Thursday, 20 September 2012 1:59 PM
Subject: Re: [AMBER] performance
 
On Wed, Sep 19, 2012 at 11:43 PM, marawan hussain
<marawanhussain.yahoo.com> wrote:
> Dear AMBER users,
> I tried to monitor the performance with different libraries and numbers of
> processors; the results are in the following table:
>
> Number of processors    Performance (ns/day)    MPI library
> 2                       0.15                    mvapich2
> 4                       0.21                    mvapich2
> 8                       0.28                    mvapich2
> 16                      0.20                    mvapich2
> 32                      0.03                    mvapich2
>
>
>
> Could someone comment on this?
>
/* 
http://ambermd.org/amber10.bench1.html (similar performance for pmemd
12) */
Joking aside, many things affect performance -- interconnect, compilers,
your computer system, your simulated system, sander vs. pmemd (pmemd is
much better), PME vs. GB (GB scales much better), NTP vs. NVT vs. NVE (NVE
is fastest), ig=-1 for Langevin dynamics vs. ig>0 (ig=-1 scales better for
Langevin dynamics), and so on (there are a lot more).
If you can't get your calculations to scale similarly to what is found on the
benchmark site, I would probably blame poor interconnect first (note it's
not just the speed of the interconnect, but also the interconnect topology
that plays a role in modulating parallel scaling).  Unfortunately, if your
cluster has a bottleneck that isn't present in the benchmark test systems,
there's no getting around that unless you're willing to replace that
hardware (and there are way too many possibilities for us to benchmark even
a representative fraction of different cluster configurations).
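
One way to put numbers on "scaling" is to convert the ns/day figures from the
quoted table into speedup and parallel efficiency relative to the smallest
run. The following is only a minimal sketch: the data are copied from the
table above, and the function name scaling_table is made up for illustration
(it is not part of AMBER or any benchmark tooling).

```python
# ns/day per processor count, copied from the quoted table (mvapich2 runs).
ns_per_day = {2: 0.15, 4: 0.21, 8: 0.28, 16: 0.20, 32: 0.03}


def scaling_table(perf, baseline=None):
    """Return (procs, ns/day, speedup, efficiency) rows.

    Speedup is measured against the baseline processor count (the smallest
    count by default); efficiency is speedup divided by the ideal speedup.
    """
    base = min(perf) if baseline is None else baseline
    rows = []
    for n in sorted(perf):
        speedup = perf[n] / perf[base]
        efficiency = speedup / (n / base)  # ideal speedup is n / base
        rows.append((n, perf[n], speedup, efficiency))
    return rows


for n, rate, speedup, eff in scaling_table(ns_per_day):
    print(f"{n:>3} procs  {rate:.2f} ns/day  "
          f"speedup {speedup:.2f}x  efficiency {eff:.0%}")
```

For these numbers, throughput peaks at 8 processors at roughly 47% efficiency
relative to the 2-processor run, and the 32-processor run is slower than the
2-processor one. Running the same calculation on the figures from the
benchmark page gives a direct comparison of how far off a given cluster is.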
HTH,
Jason
-- 
Jason M. Swails
Quantum Theory Project,
University of Florida
Ph.D. Candidate
352-392-4032
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Wed Sep 19 2012 - 23:00:02 PDT