[AMBER] AMBER performance degradation on CPU compared to GPU

From: Dmitry Suplatov <genesup.gmail.com>
Date: Wed, 13 Jun 2018 12:46:00 +0300

Dear Amber users,

I compared the scalability of classical explicit-solvent MD as implemented
in AMBER on GPUs and CPUs. The resulting plot is available at the Dropbox
link below (different colors correspond to four proteins of different size):

https://www.dropbox.com/s/zmjspl1p5ajso5s/Acceleration_EN.png?dl=0

As the plot shows, the best acceleration of the MD is achieved on 4 nodes.
On GPUs, however, there is still a minor increase in performance even on
6-8 nodes, while on CPUs the performance degrades to zero on the same
number of nodes.
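
For anyone who wants to build a comparable plot, here is a minimal Python
sketch of how acceleration (speedup) and parallel efficiency could be
derived from throughput in ns/day. The function name scaling_table and the
ns/day values are hypothetical placeholders chosen for illustration; they
are not the measurements behind the plot above.

```python
def scaling_table(throughput_by_nodes):
    """Return (nodes, speedup, efficiency) rows relative to the smallest node count.

    throughput_by_nodes maps a node count to simulation throughput (ns/day).
    """
    base_nodes = min(throughput_by_nodes)
    base_rate = throughput_by_nodes[base_nodes]
    rows = []
    for nodes in sorted(throughput_by_nodes):
        # Speedup: throughput relative to the baseline run.
        speedup = throughput_by_nodes[nodes] / base_rate
        # Efficiency: speedup divided by the increase in node count.
        efficiency = speedup / (nodes / base_nodes)
        rows.append((nodes, speedup, efficiency))
    return rows


# Hypothetical ns/day values (NOT measured data): throughput peaks at 4 nodes
# and then drops, which is the qualitative shape described for the CPU runs.
example_ns_per_day = {1: 10.0, 2: 18.0, 4: 30.0, 8: 24.0}

for nodes, speedup, efficiency in scaling_table(example_ns_per_day):
    print(f"{nodes:2d} nodes: speedup {speedup:.2f}, efficiency {efficiency:.2f}")
```

A speedup that falls as nodes are added, as in the last placeholder row,
means the additional nodes cost more than they contribute.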

My question is: can you explain the performance degradation on a large
number of CPU nodes, given the stable performance on the same number of
GPU nodes?

Thank you for your time,
Dmitry

Technical details:
- Amber 14 (pmemd.cuda.MPI, pmemd.MPI)
- intel15.0.3, openmpi2.1.1-icc, mkl11.1.3, cuda6.5
- 1 node = Intel Haswell-EP E5-2697v3, 2.6 GHz (14 cores); NVIDIA Tesla
K40M; 64 GB; InfiniBand FDR; InfiniBand FDR; Gigabit Ethernet
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Wed Jun 13 2018 - 03:00:02 PDT