Re: [AMBER] Amber16 on K80 GPUs --poor performance on multiple GPUs

From: Daniel Roe <daniel.r.roe.gmail.com>
Date: Tue, 3 Jan 2017 11:25:24 -0500

Hi,

See the 'Multi GPU' section at http://ambermd.org/gpus/#Running for
some tips. In particular, you need to make sure that the GPUs can
communicate directly peer-to-peer to get any speedup from multiple
GPUs (whether peer-to-peer is available is printed near the top of
the mdout output).
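
If it helps, here is a minimal sketch of how one might pull that
status line out of an mdout file. It is not an Amber tool: the
function name is hypothetical, and the "Peer to Peer support: ..."
wording it searches for is an assumption about how the line is
printed, so adjust the pattern to match your own output.

```python
import re
from typing import Optional

# ASSUMPTION: the status line in mdout looks like
# "Peer to Peer support: ENABLED". Change the pattern if your build
# prints it differently.
P2P_PATTERN = re.compile(r"peer to peer support:\s*(\w+)", re.IGNORECASE)


def peer_to_peer_status(mdout_text: str) -> Optional[str]:
    """Return the reported status in upper case, or None if no such line exists."""
    match = P2P_PATTERN.search(mdout_text)
    return match.group(1).upper() if match else None
```

For example, peer_to_peer_status(open("mdout").read()) returns None
when no matching line is found, which would suggest checking the
header of the file by hand.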

-Dan

On Tue, Jan 3, 2017 at 11:00 AM, Susan Chacko <susanc.helix.nih.gov> wrote:
> Hi all,
>
> I successfully built Amber 16 with Intel 2015.1.133, CUDA 7.5, and
> OpenMPI 2.0.1. We're running Centos 6.8 and Nvidia drivers 352.39 on
> K80x GPUs.
>
> I ran the benchmark suite. I'm getting approx the same results as shown
> on the Amber16 benchmark page for CPUs and 1 GPU
> (http://ambermd.org/gpus/benchmarks.htm)
>
> e.g.
>
> Factor IX NPT
>
> Intel E5-2695 v3 @ 2.30GHz, 28 cores: 9.58 ns/day
>
> 1 K80 GPU: 31.2 ns/day
>
> However, when I attempt to run on 2 K80 GPUs, performance drops
> dramatically.
> 2 K80 GPUs: 1.19 ns/day
>
> I'm running the pmemd.cuda_SPFP.MPI executable like this:
> cd Amber16_Benchmark_Suite/PME/FactorIX_production_NPT
> mpirun -np # /usr/local/apps/amber/amber16/bin/pmemd.cuda_SPFP.MPI -O -i
> mdin.GPU -o mdout -p prmtop -c inpcrd
> where # is 1 or 2.
> Each of the individual GPUs ran this benchmark at ~31.2 ns/day, so I
> don't think there is any intrinsic problem with any of the GPU hardware.
> I get the same drop in performance with pmemd.cuda_DPFP.MPI and
> pmemd.cuda_SPXP.MPI.
>
> Is this expected behaviour? I don't see a benchmark for 2 or more K80s
> on the Amber16 GPUs benchmark page, so am not sure what to expect. I
> also see that the benchmarks on that page were run with Amber16 / CentOS
> 7 + CUDA 8.0 + MPICH 3.1.4 and are running on later versions of the
> Nvidia drivers than we have, but I would not expect those differences to
> account for what I'm seeing.
>
> Any ideas? Is it worth rebuilding with CUDA 8.0, or MPICH instead of
> OpenMPI?
>
> All thoughts and suggestions much appreciated,
> Susan.
>
>
> _______________________________________________
> AMBER mailing list
> AMBER.ambermd.org
> http://lists.ambermd.org/mailman/listinfo/amber
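
To put the quoted Factor IX NPT figures in perspective, the short
sketch below computes the ratios between them. The numbers are the
ones reported in the message above; the function and constant names
are only illustrative.

```python
# Throughput figures (ns/day) quoted in the message above for Factor IX NPT.
CPU_28_CORES = 9.58
ONE_K80_GPU = 31.2
TWO_K80_GPUS = 1.19


def speedup(ns_per_day: float, baseline_ns_per_day: float) -> float:
    """Ratio of a run's throughput to a baseline run's throughput."""
    return ns_per_day / baseline_ns_per_day


def parallel_efficiency(ns_per_day: float, single_gpu_ns_per_day: float, n_gpus: int) -> float:
    """Speedup over one GPU divided by the GPU count (1.0 would be ideal scaling)."""
    return speedup(ns_per_day, single_gpu_ns_per_day) / n_gpus


print(f"1 GPU vs 28 cores: {speedup(ONE_K80_GPU, CPU_28_CORES):.2f}x")
print(f"2 GPUs vs 1 GPU:   {speedup(TWO_K80_GPUS, ONE_K80_GPU):.3f}x")
print(f"2-GPU efficiency:  {parallel_efficiency(TWO_K80_GPUS, ONE_K80_GPU, 2):.3f}")
```

In other words, the 2-GPU run is roughly 26 times slower than the
single-GPU run, far below even the CPU-only figure.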



-- 
-------------------------
Daniel R. Roe
Laboratory of Computational Biology
National Institutes of Health, NHLBI
5635 Fishers Ln, Rm T900
Rockville MD, 20852
https://www.lobos.nih.gov/lcb
Received on Tue Jan 03 2017 - 08:30:03 PST