Re: [AMBER] Slow performance of Amber12 on cluster

From: Ross Walker <ross.rosswalker.co.uk>
Date: Fri, 1 Jul 2016 08:03:00 -0700

Hi BM,

I saw some replies yesterday - you should check your spam folder or the mailing list archive: http://archive.ambermd.org/

Just because you have a cluster with 6 nodes doesn't mean calculations will magically scale to it. How well a calculation scales depends on the size of the simulation, various settings, and your underlying hardware, interconnect, etc. For starters, you are using sander, which at best scales to about 32 MPI tasks; I wouldn't expect it to scale across multiple nodes. At a minimum you should switch to pmemd.MPI. You should also check which MPI you are using (I suggest mvapich for InfiniBand) and have your cluster admin do some bandwidth measurements to test that you are getting proper performance out of your interconnect. I am assuming this is at least QDR InfiniBand (much below that just won't cut it anymore with modern processors).

I always suggest doing a benchmark before you start as well. Adding more nodes doesn't always give better performance. Set up a 2 or 3 minute run, run it on 1, 2, 3, 4 and 6 nodes, check the mdinfo file from each run for the performance, and then plot the performance in ns/day vs. number of nodes. You will then be able to find the sweet spot for your specific simulation parameters.
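
For what it's worth, here is a minimal Python sketch of the bookkeeping for such a benchmark. It is only an illustration, not part of Amber: the function names are hypothetical, and it assumes the timing summary contains an "ns/day = <number>" field like the one in the output quoted below, with the last such field being the average over all steps.

```python
import re

# Matches the "ns/day = 3.65" field of the timing summary.
NS_PER_DAY = re.compile(r"ns/day\s*=\s*([0-9]+(?:\.[0-9]+)?)")


def parse_ns_per_day(text):
    """Return the last ns/day value found in the text, or None if absent.

    Assumption: the last occurrence is the average over all steps.
    """
    matches = NS_PER_DAY.findall(text)
    return float(matches[-1]) if matches else None


def scaling_table(perf_by_nodes):
    """Turn {node_count: ns_per_day} into (nodes, ns/day, speedup, efficiency) rows.

    Speedup and efficiency are relative to the smallest node count measured.
    """
    base_nodes = min(perf_by_nodes)
    base_perf = perf_by_nodes[base_nodes]
    rows = []
    for nodes in sorted(perf_by_nodes):
        speedup = perf_by_nodes[nodes] / base_perf
        efficiency = speedup / (nodes / base_nodes)
        rows.append((nodes, perf_by_nodes[nodes], speedup, efficiency))
    return rows


# The timing line from the run reported in this thread.
sample = "|     ns/day =       3.65   seconds/ns =   23677.11"
print(parse_ns_per_day(sample))  # 3.65
```

You would read the mdinfo file from each benchmark run, collect the values into a dict keyed by node count, and pass it to scaling_table. The node count where efficiency starts to fall off sharply is roughly the sweet spot.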

All the best
Ross


> On Jun 30, 2016, at 22:03, bharat gupta <bharat.85.monu.gmail.com> wrote:
>
> Hello Amber Users,
>
> I have already asked this question previously, but nobody has responded.
> So, I am posting the question again. I am running my simulations on a
> cluster with six 16-processor nodes and I am running a total of 96
> processes with -npernode 16. My system contains 76441 atoms including
> protein+ligand+water. Completing a 5 ns simulation took more than
> 1 day.
>
> Sample command which I use for running:
>
> $]mpirun -np 96 -machinefile machinefile sander.MPI ...
>
> This is the information for one of the nodes:
>
> ===== Processor composition =====
> Processor name : Intel(R) Xeon(R) E5-2650 v2
> Packages(sockets) : 2
> Cores : 16
> Processors(CPUs) : 16
> Cores per package : 8
> Threads per core : 1
>
> Memory info:
> [sandia.node06 ~]$ free -m
>         total    used    free  shared buffers  cached
> Mem:    64454    2777   61676      95     154     265
>
>
>
> Here's the output from the run file:
>
> | Final Performance Info:
> | -----------------------------------------------------
> | Average timings for all steps:
> | Elapsed(s) = 118385.53 Per Step(ms) = 47.35
> | ns/day = 3.65 seconds/ns = 23677.11
> | -----------------------------------------------------
>
> | Job began at 10:31:31.193 on 06/27/2016
> | Setup done at 10:31:32.150 on 06/27/2016
> | Run done at 19:24:37.680 on 06/28/2016
> | wallclock() was called******** times
>
>
> I feel that the simulation is running at a very slow speed. What could be
> the reason for this? Is this normal, or might my settings for the
> parallel run be wrong?
>
> Thanks in advance for your suggestions.
>
> *Best Regards*
> BM
> _______________________________________________
> AMBER mailing list
> AMBER.ambermd.org
> http://lists.ambermd.org/mailman/listinfo/amber


Received on Fri Jul 01 2016 - 08:30:02 PDT