[AMBER] GPU power usage drop for igb = 8 compared to igb = 2

From: Demian Riccardi <demianriccardi.gmail.com>
Date: Wed, 3 Feb 2021 15:48:45 -0700

Hello,

While running GB simulations on a node with four V100 GPUs (one GPU per
simulation), I noticed a large drop in production throughput for my system:
300 ns/day with igb = 2 versus 90 ns/day with igb = 8. The system is the
same size in both cases; both were built from the same PDB and then taken
through independent tleap/minimization/heating/equilibration. Looking into
it, the power draw is much lower for the igb = 8 simulations (about 100 W
versus about 242 W, with GPU utilization near 100% in both cases; see the
nvidia-smi output below). I ran the simulations manually on each GPU in
turn and verified that the difference comes from the igb setting and not
from the GPU itself. Is there something else I can look into?

Thanks!

Demian

Amber20 built with gcc + CUDA (version 10.1 reported by nvidia-smi, and
release 9.1 reported by nvcc --version)

Here is the output from nvidia-smi (GPUs 0 and 1 are running igb = 2; GPUs
2 and 3 are running igb = 8):

Wed Feb  3 17:20:30 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.152.00   Driver Version: 418.152.00   CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla V100-PCIE...  Off  | 00000000:3B:00.0 Off |                    0 |
| N/A   74C    P0   242W / 250W |    394MiB / 16130MiB |     98%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Tesla V100-PCIE...  Off  | 00000000:5E:00.0 Off |                    0 |
| N/A   66C    P0   243W / 250W |    394MiB / 16130MiB |     99%      Default |
+-------------------------------+----------------------+----------------------+
|   2  Tesla V100-PCIE...  Off  | 00000000:86:00.0 Off |                    0 |
| N/A   40C    P0   100W / 250W |    394MiB / 16130MiB |    100%      Default |
+-------------------------------+----------------------+----------------------+
|   3  Tesla V100-PCIE...  Off  | 00000000:AF:00.0 Off |                    0 |
| N/A   39C    P0   100W / 250W |    394MiB / 16130MiB |     99%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0     16977      C   pmemd.cuda                                   383MiB |
|    1     17197      C   pmemd.cuda                                   383MiB |
|    2     17582      C   pmemd.cuda                                   383MiB |
|    3     17581      C   pmemd.cuda                                   383MiB |
+-----------------------------------------------------------------------------+
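
For anyone who wants to repeat this comparison over a longer window rather
than from a single snapshot, the following is a minimal sketch, not part of
the original runs. It assumes nvidia-smi is on the PATH and that the queried
fields come back as plain numbers (some boards report "[N/A]" for power
draw, which this sketch does not handle). The function names parse_gpu_csv
and query_gpus are hypothetical helpers introduced here for illustration.

```python
import csv
import io
import subprocess

# Fields requested from nvidia-smi, in the order they appear in each CSV row.
QUERY_FIELDS = ["index", "power.draw", "utilization.gpu"]


def parse_gpu_csv(text):
    """Parse 'index, power.draw, utilization.gpu' CSV rows into dicts.

    Assumes the output was produced with --format=csv,noheader,nounits,
    so every field is a bare number.
    """
    rows = []
    for rec in csv.reader(io.StringIO(text), skipinitialspace=True):
        if not rec:  # skip blank lines
            continue
        rows.append(
            {
                "index": int(rec[0]),
                "power_w": float(rec[1]),
                "util_pct": float(rec[2]),
            }
        )
    return rows


def query_gpus():
    """Run nvidia-smi once and return one dict per GPU."""
    cmd = [
        "nvidia-smi",
        "--query-gpu=" + ",".join(QUERY_FIELDS),
        "--format=csv,noheader,nounits",
    ]
    out = subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
    return parse_gpu_csv(out)
```

Calling query_gpus() at regular intervals while the igb = 2 and igb = 8 jobs
are running would show whether the power gap is steady or only appears at
particular moments.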
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Wed Feb 03 2021 - 15:00:02 PST