[AMBER] pmemd.cuda.MPI vs openmpi

From: Victor Ma <victordsmagift.gmail.com>
Date: Wed, 3 Jun 2015 10:54:44 -0700

Hello Amber community,

I am testing Amber14 on a GPU cluster with InfiniBand. I noticed that when I
run pmemd.cuda.MPI under OpenMPI, it can actually slow things down.
On a single node I have two GPUs and 16 CPU cores. If I submit a job using
"pmemd.cuda.MPI -O -i .....", one GPU is 99% utilized and P2P support is on.
For my big system, I am getting ~27 ns/day. If I instead launch through
OpenMPI with "export CUDA_VISIBLE_DEVICES=0,1" and then "mpirun -np 2
pmemd.cuda.MPI -O -i ....", the two GPUs are each 77% utilized but P2P is
OFF. In this case, I am getting 33 ns/day. That is faster, but I suspect it
could be faster still if P2P were on. The other thing I tried was "mpirun
-np 16 pmemd.cuda.MPI -O -i ....". Here the run slows down to 14 ns/day:
only one GPU is used, all 16 CPU cores are busy, and again P2P is off.
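For what it's worth, the throughputs quoted above already quantify how far the two-GPU run is from ideal scaling. A quick sanity check in plain Python (the ns/day values are taken directly from the runs described above):

```python
# Throughputs quoted above, in ns/day.
single_gpu = 27.0      # 1 GPU, P2P on
dual_gpu = 33.0        # 2 GPUs via "mpirun -np 2", P2P off
sixteen_ranks = 14.0   # "mpirun -np 16", one GPU actually used

# Speedup and parallel efficiency of the two-GPU run.
speedup = dual_gpu / single_gpu   # ~1.22x
efficiency = speedup / 2          # ~61% of ideal 2-GPU scaling

# The 16-rank run is an outright slowdown versus a single GPU.
slowdown = sixteen_ranks / single_gpu   # ~0.52x

print(f"2-GPU speedup: {speedup:.2f}x ({efficiency:.0%} efficiency)")
print(f"16-rank run: {slowdown:.2f}x of single-GPU throughput")
```

So the two-GPU run recovers only about 61% of ideal scaling, which is consistent with the suspicion that the missing P2P support is costing performance, and the 16-rank launch roughly halves throughput by oversubscribing a single GPU.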

I downloaded the check_p2p scripts, but since I am working on a cluster, I
could not run "make".

I am pretty happy with the speed I am getting, but I am wondering whether the
configuration can be optimized further, e.g., running on 2 GPUs at 100%
utilization with P2P on.

Thank you!

AMBER mailing list
Received on Wed Jun 03 2015 - 11:00:03 PDT