Re: [AMBER] AMBER: MPI run issue

From: Jason Swails <jason.swails.gmail.com>
Date: Mon, 16 Feb 2015 09:15:05 -0500

> On Feb 16, 2015, at 8:38 AM, Bajarang Kumbhar <kumbharbajarang.gmail.com> wrote:
>
> Dear Sir,
> I was going to run an MD simulation of a protein in explicit mode on a
> parallel machine with 1024 cores (64 nodes x 16 cores = 1024), but it shows the
> error
>
> Error: the number of processors must not be greater than 256, but is 1024
> application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0 recv desc
> error, 128, 0x8f43a80
> [32] Abort: Got completion with error 12, vendor code=81, dest rank=
> at line 870 in file ../../ofa_poll.c
> recv desc error, 128, 0xa081380
> [512] Abort: Got completion with error 12, vendor code=81, dest rank=
> at line 870 in file ../../ofa_poll.c
> recv desc error, 128, 0x99b0c80
> [64] Abort: Got completion with error 12, vendor code=81, dest rank=
> at line 870 in file ../../ofa_poll.c
> recv desc error, 128, 0xaa95180
> [256] Abort: Got completion with error 12, vendor code=81, dest rank=
> at line 870 in file ../../ofa_poll.c
> recv desc error, 128, 0x9654f40
> [128] Abort: Got completion with error 12, vendor code=81, dest rank=
> at line 870 in file ../../ofa_poll.c
> ...................................................................................
>
> It works fine with 256 processors and is running.
>
> I don't know why the issue arises when using 1024 processors. Any
> suggestions, please?

Because the sander code does not support running on more than 256 processors. (The pmemd code does, but I would NOT recommend running it on that many.)

Please run some basic benchmarks before you waste your resources. Parallel scaling depends strongly on the interconnect technology in your cluster as well as the algorithms used in the MD code to split up the workload. I can assure you that no Amber code will scale to all 1024 CPUs in your 64-node cluster. With good connectivity, I would be surprised if you continued to see improvements in *pmemd* past 128 CPUs. I would be stunned if sander scaled that well.
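
To make the benchmarking concrete, here is a minimal sketch of how you might tabulate the results once you have timings at a few core counts. The throughput numbers are invented purely for illustration (they are not measurements from any cluster or Amber version), and the function names are my own; substitute the ns/day values you actually measure.

```python
# Hypothetical benchmark results: core count -> throughput in ns/day.
# These numbers are made up for illustration; replace them with your own timings.
benchmarks = {16: 4.0, 32: 7.4, 64: 12.8, 128: 18.0, 256: 17.1}


def scaling_table(results):
    """Return (cores, ns_per_day, speedup, efficiency) rows.

    Speedup and efficiency are measured relative to the smallest run,
    so the first row always has speedup 1.0 and efficiency 1.0.
    """
    base_cores = min(results)
    base_rate = results[base_cores]
    rows = []
    for cores in sorted(results):
        speedup = results[cores] / base_rate
        ideal_speedup = cores / base_cores
        rows.append((cores, results[cores], speedup, speedup / ideal_speedup))
    return rows


def best_core_count(results):
    """Return the core count with the highest measured throughput."""
    return max(results, key=results.get)


for cores, rate, speedup, efficiency in scaling_table(benchmarks):
    print(f"{cores:5d} cores  {rate:6.1f} ns/day  "
          f"speedup {speedup:5.2f}  efficiency {efficiency:6.1%}")
print("fastest run:", best_core_count(benchmarks), "cores")
```

In this made-up data set the 256-core run is slower than the 128-core run, which is the kind of turnover you should be looking for. In practice you would probably stop well before the fastest run, at the point where efficiency drops below whatever you consider acceptable.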

At some point, your CPUs will spend more time communicating with each other than calculating, and your simulation will actually run slower than it did with fewer processors. It is worth finding out when this happens on your cluster and making sure you make optimal use of your resources.
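
A toy model shows why this happens. Assume, purely for illustration, that the time per MD step is a compute term that shrinks as 1/p plus a communication term that grows linearly with the number of processors p. Both the linear form and the constants below are assumptions chosen to make the arithmetic easy; they do not describe sander, pmemd, or any particular interconnect.

```python
# Toy cost model (an assumption for illustration, not how any Amber code behaves):
#   time_per_step(p) = COMPUTE_TIME / p + COMM_COST * p
COMPUTE_TIME = 1.0   # hypothetical serial compute time per step, in seconds
COMM_COST = 1e-4     # hypothetical communication cost per processor, in seconds


def step_time(procs, compute_time=COMPUTE_TIME, comm_cost=COMM_COST):
    """Modeled wall-clock time per step on `procs` processors."""
    return compute_time / procs + comm_cost * procs


def fastest(candidates):
    """Processor count among `candidates` with the smallest modeled step time."""
    return min(candidates, key=step_time)


candidates = [16, 32, 64, 128, 256, 512, 1024]
for procs in candidates:
    print(f"{procs:5d} procs  {step_time(procs) * 1000:8.2f} ms/step")
print("fastest in this model:", fastest(candidates), "procs")
```

With these constants the modeled step time bottoms out at 128 processors and then rises, so that 1024 processors end up slower than 16. The real turnover point on your cluster can only be found by measuring it.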

Here are some basic CPU benchmarks on somewhat older hardware that should hopefully give you some idea of what to expect: http://ambermd.org/amber10.bench1.html

HTH,
Jason

--
Jason M. Swails
BioMaPS,
Rutgers University
Postdoctoral Researcher
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Mon Feb 16 2015 - 06:30:02 PST