Re: [AMBER] multi gpu run using MPI

From: Ross Walker <ross.rosswalker.co.uk>
Date: Tue, 11 Jun 2019 09:10:30 -0400

Hi Hossein,

This is normal behavior: when peer-to-peer access is enabled, each MPI task typically opens a context on the peer GPU for direct GPU-to-GPU transfers, which is why nvidia-smi lists both processes on both devices. However, don't expect much, if any, speedup for a PME calculation spanning 2 GPUs. These days the GPUs are too fast for the interconnect to keep up. The exceptions to this are large (> 5,000 atom) GB simulations, Replica Exchange, and TI calculations. For regular explicit solvent MD simulations we recommend running multiple independent simulations from different initial conditions (structure, random seed, etc.), one on each GPU.
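As a rough sketch of that recommended workflow (the input and file names below are placeholders, not part of the benchmark suite), something like this launches one independent single-GPU run per device:

    # Launch four independent pmemd.cuda runs, one pinned to each GPU.
    # md.in, system.prmtop and system.inpcrd are placeholder names;
    # setting ig=-1 in md.in gives each copy a different random seed.
    for gpu in 0 1 2 3; do
        CUDA_VISIBLE_DEVICES=$gpu $AMBERHOME/bin/pmemd.cuda -O \
            -i md.in -p system.prmtop -c system.inpcrd \
            -o md.gpu$gpu.out -r md.gpu$gpu.rst -x md.gpu$gpu.nc &
    done
    wait

Each copy then sees only its own device (reported as device 0 in its own mdout), and the aggregate sampling throughput scales with the number of GPUs.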

All the best
Ross

> On Jun 6, 2019, at 4:35 PM, Hossein Pourreza <hpourreza.uchicago.edu> wrote:
>
> Greetings,
>
> I am trying to run the Amber 16 benchmark on a system with 4 GPUs. I compiled Amber with Intel MPI 2018 and CUDA 9.0. When the benchmark runs mpirun -np 2 $AMBERHOME/bin/pmemd.cuda.MPI … with CUDA_VISIBLE_DEVICES=0,1 (for example), I can see two processes running on each GPU (monitored with nvidia-smi). It looks like each MPI process runs the code on both GPUs. I tried setting OMP_NUM_THREADS=1, but that did not change anything. Looking at mdout.2GPU, things seem to be OK:
> |------------------- GPU DEVICE INFO --------------------
> |
> | Task ID: 0
> | CUDA_VISIBLE_DEVICES: 0,1
> | CUDA Capable Devices Detected: 2
> | CUDA Device ID in use: 0
> | CUDA Device Name: Tesla V100-PCIE-16GB
> | CUDA Device Global Mem Size: 16130 MB
> | CUDA Device Num Multiprocessors: 80
> | CUDA Device Core Freq: 1.38 GHz
> |
> |
> | Task ID: 1
> | CUDA_VISIBLE_DEVICES: 0,1
> | CUDA Capable Devices Detected: 2
> | CUDA Device ID in use: 1
> | CUDA Device Name: Tesla V100-PCIE-16GB
> | CUDA Device Global Mem Size: 16130 MB
> | CUDA Device Num Multiprocessors: 80
> | CUDA Device Core Freq: 1.38 GHz
> |
> |--------------------------------------------------------
>
> |---------------- GPU PEER TO PEER INFO -----------------
> |
> | Peer to Peer support: ENABLED
> |
> |--------------------------------------------------------
>
> I am wondering if this is the normal behavior or if I am missing something here.
>
> Many thanks
> Hossein
>


_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Tue Jun 11 2019 - 06:30:02 PDT