Hello all,
I was running a protein-lipid system of 133,835 atoms on Tesla K40 GPUs using
pmemd.cuda.MPI, and the speed I am getting is considerably lower than expected.
I checked against the benchmarks given on the GPU Support page of the Amber
website and found some performance issues. Here are the speeds I am getting
(in ns/day):
Factor IX (90,906 atoms), NPT
-----------------------------
Benchmark speed on K40 cards: 68.38 (4 x K40), 51.90 (2 x K40)
Speed I am getting          : 33.96 (4 x K40, 4 MPI processes), 47.53 (2 x K40, 2 MPI processes)

Cellulose (408,609 atoms), NPT
------------------------------
Benchmark speed on K40 cards: 17.34 (4 x K40), 12.33 (2 x K40)
Speed I am getting          : 7.86 (4 x K40, 4 MPI processes), 8.66 (2 x K40, 2 MPI processes)
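For reference, the runs were launched in the standard pmemd.cuda.MPI way, roughly
as sketched below; the input/output file names here are placeholders, not the
actual files from these runs:

  # 4 x K40 run: 4 MPI ranks, one per GPU (file names are placeholders)
  export CUDA_VISIBLE_DEVICES=0,1,2,3
  mpirun -np 4 $AMBERHOME/bin/pmemd.cuda.MPI -O -i md.in -p system.prmtop \
      -c system.inpcrd -o md.out -r md.rst -x md.nc

  # 2 x K40 run: same command, restricted to two cards (here GPUs 2 and 3)
  export CUDA_VISIBLE_DEVICES=2,3
  mpirun -np 2 $AMBERHOME/bin/pmemd.cuda.MPI -O -i md.in -p system.prmtop \
      -c system.inpcrd -o md.out -r md.rst -x md.nc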
ECC was turned off on all 4 cards and boost clocks were turned on, as per the
'Considerations for Maximizing GPU Performance' section on the website.
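Concretely, that amounts to nvidia-smi settings along these lines (a sketch of
the usual commands; 3004/875 MHz are the published K40 boost application
clocks, and the ECC change only takes effect after a reboot):

  sudo nvidia-smi -pm 1         # enable persistence mode
  sudo nvidia-smi -e 0          # disable ECC on all cards (reboot required)
  sudo nvidia-smi -ac 3004,875  # set memory,graphics application clocks to the K40 boost values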
The issue is that 4 cards do not give any speedup over 2 cards. While running
my system of 133,835 atoms on 4 cards and then on 2 cards, the nvidia-smi
command reports the following (first for the 4-card run, then for the 2-card
run):
+------------------------------------------------------+
| NVIDIA-SMI 352.39     Driver Version: 352.39         |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla K40c          Off  | 0000:02:00.0     Off |                  Off |
| 25%   52C    P0    94W / 235W |    607MiB / 12287MiB |     56%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Tesla K40c          Off  | 0000:03:00.0     Off |                  Off |
| 26%   54C    P0    91W / 235W |    678MiB / 12287MiB |     35%      Default |
+-------------------------------+----------------------+----------------------+
|   2  Tesla K40c          Off  | 0000:83:00.0     Off |                  Off |
| 24%   49C    P0    88W / 235W |    678MiB / 12287MiB |     31%      Default |
+-------------------------------+----------------------+----------------------+
|   3  Tesla K40c          Off  | 0000:84:00.0     Off |                  Off |
| 25%   50C    P0    90W / 235W |    679MiB / 12287MiB |     35%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0     21674    C   ...midhun/AMBER16/amber16/bin/pmemd.cuda.MPI   507MiB |
|    0     21675    C   ...midhun/AMBER16/amber16/bin/pmemd.cuda.MPI    73MiB |
|    1     21674    C   ...midhun/AMBER16/amber16/bin/pmemd.cuda.MPI    73MiB |
|    1     21675    C   ...midhun/AMBER16/amber16/bin/pmemd.cuda.MPI   578MiB |
|    2     21676    C   ...midhun/AMBER16/amber16/bin/pmemd.cuda.MPI   578MiB |
|    2     21677    C   ...midhun/AMBER16/amber16/bin/pmemd.cuda.MPI    73MiB |
|    3     21676    C   ...midhun/AMBER16/amber16/bin/pmemd.cuda.MPI    73MiB |
|    3     21677    C   ...midhun/AMBER16/amber16/bin/pmemd.cuda.MPI   578MiB |
+-----------------------------------------------------------------------------+
[midhun.localhost Sys3-25]$ nvidia-smi
Fri Jul 13 15:33:25 2018
+------------------------------------------------------+
| NVIDIA-SMI 352.39     Driver Version: 352.39         |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla K40c          Off  | 0000:02:00.0     Off |                  Off |
| 25%   51C    P0    63W / 235W |     23MiB / 12287MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Tesla K40c          Off  | 0000:03:00.0     Off |                  Off |
| 26%   53C    P0    63W / 235W |     23MiB / 12287MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   2  Tesla K40c          Off  | 0000:83:00.0     Off |                  Off |
| 32%   71C    P0   145W / 235W |    677MiB / 12287MiB |     88%      Default |
+-------------------------------+----------------------+----------------------+
|   3  Tesla K40c          Off  | 0000:84:00.0     Off |                  Off |
| 32%   71C    P0   144W / 235W |    763MiB / 12287MiB |     99%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    2     21795    C   ...midhun/AMBER16/amber16/bin/pmemd.cuda.MPI   576MiB |
|    2     21796    C   ...midhun/AMBER16/amber16/bin/pmemd.cuda.MPI    73MiB |
|    3     21795    C   ...midhun/AMBER16/amber16/bin/pmemd.cuda.MPI    73MiB |
|    3     21796    C   ...midhun/AMBER16/amber16/bin/pmemd.cuda.MPI   663MiB |
+-----------------------------------------------------------------------------+
Why is the GPU utilization only 56%, 35%, 31% and 35% when running on 4 GPU
cards, while running on 2 cards gives 88% and 99%? For my system I get
27.11 ns/day on 2 x K40 but only 21.09 ns/day on 4 x K40. Why am I not getting
increased speed with more cards? Please reply.
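(If it helps, the bus IDs above show the four cards split across two PCIe
segments, 0000:02/03 versus 0000:83/84; the interconnect between them could be
checked with something like:

  nvidia-smi topo -m   # prints the GPU-to-GPU connection matrix (PCIe switch vs. CPU/QPI)

but that output is not included here.)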
-- 
*MIDHUN K MADHU*
Ph.D. Student
Dept. of Biological Sciences
IISER Bhopal
--------------------------------