[AMBER] Amber16 on K80 GPUs -- poor performance on multiple GPUs

From: Susan Chacko <susanc.helix.nih.gov>
Date: Tue, 3 Jan 2017 11:00:09 -0500

Hi all,

I successfully built Amber 16 with the Intel 2015.1.133 compilers, CUDA 7.5,
and OpenMPI 2.0.1. We're running CentOS 6.8 with NVIDIA driver 352.39 on
K80 GPUs.
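
For completeness, the build went through the standard Amber16 configure
route; roughly the steps below (the module names are specific to our
setup, so treat this as an illustration rather than an exact recipe):

  # rough outline of the Amber16 build on our cluster (module names are ours)
  module load intel/2015.1.133 CUDA/7.5 openmpi/2.0.1
  cd $AMBERHOME
  ./configure -cuda intel        # single-GPU binaries (pmemd.cuda_SPFP etc.)
  make install
  ./configure -cuda -mpi intel   # multi-GPU binaries (pmemd.cuda_SPFP.MPI etc.)
  make install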

I ran the benchmark suite and am getting approximately the same results as
those shown on the Amber16 benchmark page for CPUs and a single GPU
(http://ambermd.org/gpus/benchmarks.htm).

e.g. Factor IX NPT:

  Intel E5-2695 v3 @ 2.30GHz, 28 cores: 9.58 ns/day
  1 K80 GPU: 31.2 ns/day

However, when I attempt to run on 2 K80 GPUs, performance drops
dramatically:

  2 K80 GPUs: 1.19 ns/day

I'm running the pmemd.cuda_SPFP.MPI executable like this:

  cd Amber16_Benchmark_Suite/PME/FactorIX_production_NPT
  mpirun -np # /usr/local/apps/amber/amber16/bin/pmemd.cuda_SPFP.MPI \
      -O -i mdin.GPU -o mdout -p prmtop -c inpcrd

where # is 1 or 2.
Each of the individual GPUs ran this benchmark at ~31.2 ns/day (a rough
sketch of how I checked that is below), so I don't think there's an
intrinsic problem with any of the GPU hardware. I see the same drop in
performance with pmemd.cuda_DPFP.MPI and pmemd.cuda_SPXP.MPI.
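
This is roughly how I checked each card on its own (the device IDs are an
assumption about how the K80 halves are numbered on our nodes; I just
cycled through them):

  # pin the run to one device at a time and repeat the benchmark
  export CUDA_VISIBLE_DEVICES=0    # then 1, 2, ... for the other devices
  mpirun -np 1 /usr/local/apps/amber/amber16/bin/pmemd.cuda_SPFP.MPI \
      -O -i mdin.GPU -o mdout -p prmtop -c inpcrd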

Is this expected behaviour? I don't see a benchmark for 2 or more K80s on
the Amber16 GPU benchmark page, so I'm not sure what to expect. I also see
that the benchmarks on that page were run with Amber16 on CentOS 7 + CUDA
8.0 + MPICH 3.1.4, and with newer NVIDIA drivers than we have, but I would
not expect those differences to account for what I'm seeing.

Any ideas? Is it worth rebuilding with CUDA 8.0, or MPICH instead of
OpenMPI?
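
Before rebuilding anything, I was planning to check whether the pair of
GPUs being used can actually do peer-to-peer, since I understand that
matters for multi-GPU pmemd runs. Something along these lines, where the
samples path is just the default CUDA 7.5 install location on our side, so
adjust as needed:

  # show the GPU interconnect topology as the driver sees it
  nvidia-smi topo -m

  # measure peer-to-peer bandwidth/latency between device pairs
  cd ~/NVIDIA_CUDA-7.5_Samples/1_Utilities/p2pBandwidthLatencyTest
  make
  ./p2pBandwidthLatencyTest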

All thoughts and suggestions much appreciated,
Susan.


_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Tue Jan 03 2017 - 08:30:03 PST