Hi Ross,
thank you for the info!
You can find all the requested information in my latest mail to Bob.
Best,
Marek
On Fri, 08 May 2009 20:20:29 +0200, Ross Walker <ross.rosswalker.co.uk> wrote:
>
> Hi Marek,
>
> I don't think I've seen anywhere what the actual simulation you are
> running is. This will have a huge effect on parallel scalability. With
> InfiniBand and a 'reasonable' system size you should easily be able to
> get beyond 2 nodes. Here are some numbers for the JAC NVE benchmark from
> the suite provided on http://ambermd.org/amber10.bench1.html
>
> This is for NCSA Abe, which has dual quad-core Clovertown nodes (E5345,
> 2.33 GHz, so very similar to your setup) and uses SDR InfiniBand.
>
> Using all 8 processors per node (time for benchmark in seconds):
> 8 ppn 8 cpu 364.09
> 8 ppn 16 cpu 202.65
> 8 ppn 24 cpu 155.12
> 8 ppn 32 cpu 123.63
> 8 ppn 64 cpu 111.82
> 8 ppn 96 cpu 91.87
>
> Using 4 processors per node (2 per socket):
> 4 ppn 8 cpu 317.07
> 4 ppn 16 cpu 178.95
> 4 ppn 24 cpu 134.10
> 4 ppn 32 cpu 105.25
> 4 ppn 64 cpu 83.28
> 4 ppn 96 cpu 67.73
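>
> For reference, here is a quick way to turn timings like these into
> speedup and parallel efficiency (a minimal Python sketch, not part of
> AMBER; the numbers are just the 4-ppn timings above):
>
>     # JAC NVE benchmark, 4 ppn, wall-clock seconds (table above)
>     timings = {8: 317.07, 16: 178.95, 24: 134.10,
>                32: 105.25, 64: 83.28, 96: 67.73}
>     base_cpus = 8
>     base_time = timings[base_cpus]
>     for cpus in sorted(timings):
>         speedup = base_time / timings[cpus]      # relative to the 8-cpu run
>         efficiency = speedup * base_cpus / cpus  # 1.0 = ideal scaling
>         print(f"{cpus:3d} cpu  speedup {speedup:5.2f}  efficiency {efficiency:4.2f}")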
>
> As you can see it is still scaling at 96 cpus (24 nodes at 4 threads per
> node). So I think you must either be running an unreasonably small
> system to expect it to scale in parallel, or there is something very
> wrong with the setup of your computer.
>
> All the best
> Ross
>
>> -----Original Message-----
>> From: amber-bounces.ambermd.org [mailto:amber-bounces.ambermd.org] On
>> Behalf Of Marek Malý
>> Sent: Friday, May 08, 2009 10:58 AM
>> To: AMBER Mailing List
>> Subject: Re: [AMBER] Error in PMEMD run
>>
>> Hi Gustavo,
>>
>> thanks for your suggestion, but we have only 14 nodes in our cluster
>> (each node = 2 x quad-core Xeon 5365 (3.00 GHz) = 8 CPU cores per node,
>> connected with Cisco InfiniBand).
>>
>> If I allocate 8 nodes and use just 2 CPUs per node for one of my jobs,
>> it means that 8 x 6 = 48 CPU cores will be wasted. In this case I am
>> sure that my colleagues will kill me :)) Moreover, I do not assume that
>> an 8-node/2-CPU combination will have significantly better performance
>> than 2 nodes/8 CPUs, at least in the case of PMEMD.
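>>
>> For example, the idle-core count for a fixed number of MPI processes
>> follows directly (a small Python sketch of the arithmetic, nothing
>> cluster-specific):
>>
>>     # idle cores on 8-core nodes for a 16-process job
>>     total_procs, cores_per_node = 16, 8
>>     for ppn in (8, 4, 2, 1):                  # processes per node
>>         nodes = total_procs // ppn
>>         idle = nodes * (cores_per_node - ppn)
>>         print(f"{ppn} ppn -> {nodes} nodes, {idle} idle cores")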
>>
>> But anyway, thank you for your opinion/experience !
>>
>> Best,
>>
>> Marek
>>
>> On Fri, 08 May 2009 19:28:35 +0200, Gustavo Seabra
>> <gustavo.seabra.gmail.com> wrote:
>>
>> >> the best performance I obtained was with a combination of 4 nodes
>> >> and 4 CPUs (out of 8) per node.
>> >
>> > I don't know exactly what you have in your system, but I gather you
>> > are using 8-core nodes, and from that you got the best performance by
>> > leaving 4 cores idle. Is that correct?
>> >
>> > In this case, I would suggest that you go a bit further, and also test
>> > using only 1 or 2 cores per node, i.e., leaving the remaining 6-7
>> > cores idle. So, for 16 MPI processes, try allocating 16 or 8 nodes.
>> > (I didn't see this case in your tests)
>> >
>> > AFAIK, the 8-core nodes are arranged as 2 sockets of 4 cores each, and
>> > the communication between cores, which is already bad among the 4 cores
>> > in the same socket, gets even worse when you need to pass information
>> > between two sockets. Depending on your system, if you send 2 processes
>> > to the same node, it may put both in the same socket or automatically
>> > split them, one for each socket. You may also be able to tell it to
>> > make sure that this gets split into 1 process per socket. (Look into
>> > the mpirun flags.) From the tests we've run on those kinds of
>> > machines, we do get the best performance by leaving ALL BUT ONE core
>> > idle in each socket.
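>> >
>> > A quick way to check where each rank actually lands is a few lines of
>> > Python (a sketch that assumes mpi4py is installed and a Linux node;
>> > neither is part of AMBER itself):
>> >
>> >     import os
>> >     import socket
>> >     from mpi4py import MPI                # assumes mpi4py is available
>> >
>> >     rank = MPI.COMM_WORLD.Get_rank()
>> >     cores = sorted(os.sched_getaffinity(0))  # Linux-only affinity query
>> >     print(f"rank {rank} on {socket.gethostname()} may run on cores {cores}")
>> >
>> > Launch it with your usual mpirun line; if ranks on the same node report
>> > the same full core set, no binding is in effect.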
>> >
>> > Gustavo.
>> >
>>
>> --
>> This message was created by Opera's revolutionary e-mail client:
>> http://www.opera.com/mail/
>>
--
This message was created by Opera's revolutionary e-mail client:
http://www.opera.com/mail/
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber