Re: [AMBER] multi gpu run using MPI

From: Hossein Pourreza <hpourreza.uchicago.edu>
Date: Tue, 11 Jun 2019 13:12:33 +0000

Many thanks Ross for your answer and clarification.

Hossein

From: Ross Walker <ross.rosswalker.co.uk>
Reply-To: AMBER Mailing List <amber.ambermd.org>
Date: Tuesday, June 11, 2019 at 8:10 AM
To: AMBER Mailing List <amber.ambermd.org>
Subject: Re: [AMBER] multi gpu run using MPI

Hi Hossein,

This is normal behavior. However, don't expect much, if any, speedup for a PME calculation spanning 2 GPUs: these days the GPUs are too fast for the interconnect to keep up. The exceptions to this are large (> 5000 atom) GB simulations, Replica Exchange, and TI calculations. For regular explicit solvent MD simulations we recommend running multiple independent simulations from different initial conditions (structure, random seed, etc.), one on each GPU.
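As a minimal sketch, a launch script along the following lines starts four independent single-GPU runs (the input/output file names here are just placeholders; each run should use its own starting coordinates and/or random seed):

  # four independent single-GPU runs, one per device (file names are placeholders)
  for i in 0 1 2 3; do
      CUDA_VISIBLE_DEVICES=$i $AMBERHOME/bin/pmemd.cuda -O \
          -i md.in -p prmtop -c inpcrd.$i \
          -o md.$i.out -r md.$i.rst -x md.$i.nc &
  done
  wait   # wait for all four background runs to finish

Because CUDA_VISIBLE_DEVICES is set per process, each run sees only its own device.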

All the best
Ross

On Jun 6, 2019, at 4:35 PM, Hossein Pourreza <hpourreza.uchicago.edu> wrote:
Greetings,
I am trying to run the Amber 16 benchmark on a system with 4 GPUs. I compiled Amber with Intelmpi/2018 and cuda/9.0. When the benchmark runs mpirun -np 2 $AMBERHOME/bin/pmemd.cuda.MPI … with CUDA_VISIBLE_DEVICES=0,1 (for example), I can see two processes running on each GPU (I monitor using nvidia-smi; see the query quoted after the mdout excerpt). It looks like each MPI process runs the code on both GPUs. I tried setting OMP_NUM_THREADS=1, but that did not change anything. Looking at mdout.2GPU, things seem to be OK:
|------------------- GPU DEVICE INFO --------------------
|
| Task ID: 0
| CUDA_VISIBLE_DEVICES: 0,1
| CUDA Capable Devices Detected: 2
| CUDA Device ID in use: 0
| CUDA Device Name: Tesla V100-PCIE-16GB
| CUDA Device Global Mem Size: 16130 MB
| CUDA Device Num Multiprocessors: 80
| CUDA Device Core Freq: 1.38 GHz
|
|
| Task ID: 1
| CUDA_VISIBLE_DEVICES: 0,1
| CUDA Capable Devices Detected: 2
| CUDA Device ID in use: 1
| CUDA Device Name: Tesla V100-PCIE-16GB
| CUDA Device Global Mem Size: 16130 MB
| CUDA Device Num Multiprocessors: 80
| CUDA Device Core Freq: 1.38 GHz
|
|--------------------------------------------------------
|---------------- GPU PEER TO PEER INFO -----------------
|
| Peer to Peer support: ENABLED
|
|--------------------------------------------------------
I am wondering if this is the normal behavior or if I am missing something here.
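In case it is useful, the nvidia-smi query I run while the job is active is roughly the following (the exact fields accepted may vary with the driver/nvidia-smi version):

  # list compute processes per GPU while pmemd.cuda.MPI is running
  nvidia-smi --query-compute-apps=gpu_uuid,pid,process_name,used_memory --format=csv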
Many thanks
Hossein

_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Tue Jun 11 2019 - 06:30:03 PDT