[AMBER] AMBER performance degradation on CPU compared to GPU

From: Dmitry Suplatov <genesup.gmail.com>
Date: Wed, 13 Jun 2018 12:46:00 +0300

Dear Amber users,

I compared the scalability of classical explicit-solvent MD, as implemented
in AMBER, on GPUs and CPUs. The resulting plot is available at the Dropbox
link below (different colors correspond to four proteins of different sizes):


As the plot shows, the best speedup of the MD runs is achieved on 4 nodes.
On GPUs, however, there is still a minor increase in performance at 6-8
nodes, while on CPUs the performance degrades to zero at the same node
counts.
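
For readers who want to put numbers on the scaling behaviour described above, here is a minimal sketch of how speedup and parallel efficiency could be computed from throughput values. The function name `scaling_table` and the ns/day figures are hypothetical placeholders for illustration; they are not the measurements behind the plot.

```python
def scaling_table(ns_per_day):
    """Return (nodes, speedup, efficiency) rows.

    ns_per_day maps a node count to the measured throughput in ns/day.
    Speedup and efficiency are taken relative to the smallest node count.
    """
    base_nodes = min(ns_per_day)
    base_rate = ns_per_day[base_nodes]
    rows = []
    for nodes in sorted(ns_per_day):
        speedup = ns_per_day[nodes] / base_rate
        # Ideal speedup equals the ratio of node counts; efficiency is the
        # fraction of that ideal which was actually achieved.
        efficiency = speedup / (nodes / base_nodes)
        rows.append((nodes, speedup, efficiency))
    return rows


# Hypothetical placeholder values (ns/day), NOT data from this post.
example = {1: 10.0, 2: 18.0, 4: 30.0, 8: 24.0}
for nodes, speedup, efficiency in scaling_table(example):
    print(f"{nodes:2d} nodes: speedup {speedup:.2f}, efficiency {efficiency:.2f}")
```

An efficiency that keeps falling as nodes are added, with the speedup itself decreasing past some node count, corresponds to the kind of degradation reported here for the CPU runs.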

My question is: can you explain the performance degradation on a large
number of CPU nodes, given the stable performance on the same number of GPU
nodes?

Thank you for your time,

Technical details:
- Amber 14 (pmemd.cuda.MPI, pmemd.MPI)
- intel15.0.3, openmpi2.1.1-icc, mkl11.1.3, cuda6.5
- 1 node = Intel Haswell-EP E5-2697v3, 2.6 GHz (14 cores); NVIDIA Tesla
  K40M; 64 GB; InfiniBand FDR; InfiniBand FDR; Gigabit Ethernet
AMBER mailing list
Received on Wed Jun 13 2018 - 03:00:02 PDT