Re: [AMBER] Amber16 on K80 GPUs --poor performance on multiple GPUs

From: Huang Jing <jing.huang8911.gmail.com>
Date: Tue, 3 Jan 2017 18:32:23 +0200

CUDA 8.0 seems to give higher performance than CUDA 7.5; see page 11 of
http://developer.download.nvidia.com/compute/cuda/compute-docs/cuda-performance-report.pdf

jing

On Tue, Jan 3, 2017 at 6:25 PM, Daniel Roe <daniel.r.roe.gmail.com> wrote:

> Hi,
>
> See the 'Multi GPU' section in http://ambermd.org/gpus/#Running for
> some tips. In particular, you need to make sure that the GPUs can
> communicate directly via peer-to-peer to get any kind of speedup on
> multiple GPUs (whether peer-to-peer is in use is reported near the top
> of the mdout file).
>
> -Dan
>
> On Tue, Jan 3, 2017 at 11:00 AM, Susan Chacko <susanc.helix.nih.gov>
> wrote:
> > Hi all,
> >
> > I successfully built Amber 16 with Intel 2015.1.133, CUDA 7.5, and
> > OpenMPI 2.0.1. We're running CentOS 6.8 and Nvidia driver 352.39 on
> > K80x GPUs.
> >
> > I ran the benchmark suite. I'm getting approx the same results as shown
> > on the Amber16 benchmark page for CPUs and 1 GPU
> > (http://ambermd.org/gpus/benchmarks.htm)
> >
> > e.g.
> >
> > Factor IX NPT
> >
> > Intel E5-2695 v3 @ 2.30GHz, 28 cores: 9.58 ns/day
> >
> > 1 K80 GPU: 31.2 ns/day
> >
> > However, when I attempt to run on 2 K80 GPUs, performance drops
> > dramatically.
> > 2 K80 GPUs: 1.19 ns/day
> >
> > I'm running the pmemd.cuda_SPFP.MPI executable like this:
> > cd Amber16_Benchmark_Suite/PME/FactorIX_production_NPT
> > mpirun -np # /usr/local/apps/amber/amber16/bin/pmemd.cuda_SPFP.MPI -O -i
> > mdin.GPU -o mdout -p prmtop -c inpcrd
> > where # is 1 or 2.
> > Each of the individual GPUs ran this benchmark at ~31.2 ns/day, so I
> > don't think there is any intrinsic problem with the GPU hardware
> > itself. I get the same drop in performance with pmemd.cuda_DPFP.MPI
> > and pmemd.cuda_SPXP.MPI.
> >
> > Is this expected behaviour? I don't see a benchmark for 2 or more K80s
> > on the Amber16 GPU benchmarks page, so I'm not sure what to expect. I
> > also see that the benchmarks on that page were run with Amber16 on
> > CentOS 7 + CUDA 8.0 + MPICH 3.1.4, with later Nvidia drivers than we
> > have, but I would not expect those differences to account for what
> > I'm seeing.
> >
> > Any ideas? Is it worth rebuilding with CUDA 8.0, or MPICH instead of
> > OpenMPI?
> >
> > All thoughts and suggestions much appreciated,
> > Susan.
> >
> >
>
> --
> -------------------------
> Daniel R. Roe
> Laboratory of Computational Biology
> National Institutes of Health, NHLBI
> 5635 Fishers Ln, Rm T900
> Rockville MD, 20852
> https://www.lobos.nih.gov/lcb
>
>
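
For reference, the peer-to-peer capability Dan mentions can be checked
independently of Amber with a short CUDA program along the lines below.
This is only a minimal sketch built on the standard CUDA runtime call
cudaDeviceCanAccessPeer (the file name p2p_check.cu is just an example);
it is not the check pmemd.cuda itself performs, nor the helper tool
linked from the Amber GPU page.

// p2p_check.cu -- report which GPU pairs can use direct peer-to-peer access.
// Build with:  nvcc p2p_check.cu -o p2p_check
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int n = 0;
    cudaError_t err = cudaGetDeviceCount(&n);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaGetDeviceCount failed: %s\n",
                cudaGetErrorString(err));
        return 1;
    }
    printf("Found %d CUDA device(s)\n", n);

    // List the devices so the indices can be matched to nvidia-smi output.
    for (int i = 0; i < n; ++i) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, i);
        printf("  Device %d: %s\n", i, prop.name);
    }

    // Query every ordered pair; peer access can in principle differ by
    // direction, so both (i -> j) and (j -> i) are tested.
    for (int i = 0; i < n; ++i) {
        for (int j = 0; j < n; ++j) {
            if (i == j) continue;
            int can = 0;
            cudaDeviceCanAccessPeer(&can, i, j);
            printf("  Device %d -> Device %d : peer access %s\n",
                   i, j, can ? "SUPPORTED" : "NOT supported");
        }
    }
    return 0;
}

If the pair of GPUs actually used for the 2-GPU run reports no peer
access, the lack of multi-GPU speedup described above is what the Amber
GPU page warns about.
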
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Tue Jan 03 2017 - 09:00:03 PST