Re: [AMBER] rocks cluster: MD is much slower when using more than one node

From: Matthew Zwier <mczwier.gmail.com>
Date: Sun, 3 Jun 2012 09:08:09 -0400

Hi Junmei,

This variance in throughput is also expected, as your throughput is
going to be determined by which CPUs calculate which atoms' forces;
roughly speaking, if atoms within the cut-off wind up on different
nodes, much more communication will be required, lowering your
throughput.
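
To put the figures quoted below in perspective, here is a small Python sketch of the arithmetic. The rates are the ones reported in this thread; the function names are just mine for illustration, and "efficiency" here means speedup divided by the factor by which the node count grew.

```python
# Throughput figures reported in this thread, in ns/day.
ONE_NODE_RATE = 6.5              # one node, 12 cores
TWO_NODE_RATE = 3.5              # two nodes, 24 cores (original report)
TWO_NODE_SPREAD = (0.5, 6.8)     # slowest and fastest of the five 2-node jobs


def speedup(rate, baseline_rate):
    """Throughput relative to the single-node baseline."""
    return rate / baseline_rate


def parallel_efficiency(rate, baseline_rate, resource_factor):
    """Speedup divided by the factor by which resources were increased."""
    return speedup(rate, baseline_rate) / resource_factor


def aggregate_throughput(n_jobs, rate_per_job):
    """Total sampling per day when n_jobs run at the same rate."""
    return n_jobs * rate_per_job


if __name__ == "__main__":
    s = speedup(TWO_NODE_RATE, ONE_NODE_RATE)
    e = parallel_efficiency(TWO_NODE_RATE, ONE_NODE_RATE, 2)
    print(f"2-node speedup {s:.2f}x, efficiency {e:.0%}")

    # Ten independent single-node jobs versus five 2-node jobs,
    # the latter bounded by the slowest and fastest rates observed.
    print("10 x 1-node:", aggregate_throughput(10, ONE_NODE_RATE), "ns/day")
    low, high = (aggregate_throughput(5, r) for r in TWO_NODE_SPREAD)
    print(f"5 x 2-node: between {low} and {high} ns/day")
```

Even if every two-node job ran at the best rate you observed, the cluster as a whole would sample less per day than it does with ten independent single-node jobs.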

I suspect that a better ethernet switch will not solve this issue to
your satisfaction. You may want to consider Infiniband, which I
understand has come to a price point that makes it more palatable for
smaller-scale installations.
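
As a rough illustration of why the interconnect matters so much, here is a toy cost model in Python. It is only a sketch: it assumes the force computation divides evenly across nodes and that each step pays a fixed number of message latencies plus a bandwidth cost. Every number in it (time per step, message count, bytes per step, latencies, the faster link's bandwidth) is a placeholder I picked for illustration, not a measurement of your cluster or of pmemd; only the 125 MB/s figure is simply 1 Gbit/s expressed in bytes.

```python
def step_time(compute_s, nodes, messages, latency_s, bytes_sent, bandwidth_bytes_per_s):
    """Toy estimate of seconds per MD step.

    Compute is assumed to divide evenly across nodes; communication is
    charged only when more than one node is involved.
    """
    if nodes == 1:
        return compute_s
    communication_s = messages * latency_s + bytes_sent / bandwidth_bytes_per_s
    return compute_s / nodes + communication_s


# Hypothetical workload: placeholder values, not measurements.
COMPUTE_S = 0.026        # seconds per step on one node
MESSAGES = 200           # inter-node messages per step
BYTES_SENT = 2e6         # bytes exchanged per step

one_node = step_time(COMPUTE_S, 1, MESSAGES, 0.0, BYTES_SENT, 1.0)

# Gigabit ethernet: 1 Gbit/s = 125e6 bytes/s; the 50 us latency is assumed.
two_nodes_gige = step_time(COMPUTE_S, 2, MESSAGES, 50e-6, BYTES_SENT, 125e6)

# A low-latency interconnect: both values are assumed, order-of-magnitude only.
two_nodes_fast = step_time(COMPUTE_S, 2, MESSAGES, 2e-6, BYTES_SENT, 4e9)

if __name__ == "__main__":
    for label, t in [("1 node", one_node),
                     ("2 nodes, gigabit", two_nodes_gige),
                     ("2 nodes, low-latency link", two_nodes_fast)]:
        print(f"{label}: {t * 1e3:.1f} ms/step")
```

With these placeholder values the two-node gigabit run comes out slower per step than the single node, while the low-latency link comes out faster. The point is only the shape of the trade-off: once communication per step is comparable to the compute time you save, adding nodes hurts.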

Cheers,
Matt Z.

On Sat, Jun 2, 2012 at 10:11 AM, Junmei Wang <junmwang.gmail.com> wrote:
> Thanks all for the reply. We will try to get a better switch. Here are more
> details for my test:
>
> 1821 protein atoms, 18284 total atoms (protein + water), PMEMD.
>
> For the same MD system, when I submitted 10 jobs (each job using one node/12
> cores), all the MD sampling rates were roughly the same (about 6.5 ns/day).
> However, when I submitted 5 jobs using 2 nodes/24 cores, the sampling rates
> varied from 6.8 ns/day to 0.5 ns/day. Surprisingly, when all other jobs
> were done, the last job was still very slow.
>
> Best
>
> Junmei
>
> On Sat, Jun 2, 2012 at 8:06 AM, case <case.biomaps.rutgers.edu> wrote:
>
>> On Fri, Jun 01, 2012, Junmei Wang wrote:
>> >
>> > We recently installed a Rocks cluster (CentOS) and compiled AMBER12 with
>> > mpich2. The cluster has 10 compute nodes and each has 12 CPU cores.
>> > Interestingly, the scaling is reasonable when I use no more than 12 CPU
>> > cores. However, when two or more nodes are involved, the compute times are
>> > all longer than when using only a single node. For example, using 12 cores
>> > (one node), I can sample 6.5 ns/day, and using 24 cores (2 nodes), I can
>> > only sample 3.5 ns/day. Although we did not use Infiniband (we used a
>> > gigabit switch), this parallel performance is really a surprise to me.
>>
>> As noted, this is expected.  Be sure you are using pmemd and not sander, if
>> possible.
>>
>> ....dac
>>
>>
>> _______________________________________________
>> AMBER mailing list
>> AMBER.ambermd.org
>> http://lists.ambermd.org/mailman/listinfo/amber
>>
Received on Sun Jun 03 2012 - 06:30:03 PDT