Dear all,
We are currently running Amber14 with Open MPI 2.0.1 on our GPU cluster. The
cluster uses SGE 6.2u5p3 as its queuing system, and each node is equipped
with two GeForce GTX 780s running with CUDA 7.5. In the parallel
environment, we have set allocation_rule to $pe_slots so that Amber jobs
stay on a single node. Single-GPU pmemd.cuda runs without any problem on
this setup, but pmemd.cuda.MPI does not run well.
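For reference, the parallel environment looks roughly like the qconf -sp
listing below. Apart from allocation_rule, the PE name and field values are
illustrative defaults rather than an exact copy of our configuration:

  pe_name            mpi
  slots              999
  user_lists         NONE
  xuser_lists        NONE
  start_proc_args    /bin/true
  stop_proc_args     /bin/true
  allocation_rule    $pe_slots
  control_slaves     TRUE
  job_is_first_task  FALSE
  urgency_slots      min
  accounting_summary FALSE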
We tested a system with 166K atoms: a single-GPU run with “pmemd.cuda”
gives ~13 ns/day.
When we test the same system on 2 GPUs with “mpirun -np 2 pmemd.cuda.MPI -O”,
however, we run into a problem:
(1) If we run the command directly on the compute node, bypassing the SGE
queuing system, we get ~20 ns/day.
(2) If we submit the same command as a 2-GPU job through our SGE queuing
system (see the submission script sketch below), we get only ~5 ns/day.
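For completeness, the submission script is roughly the sketch below. The PE
name, job name, Amber install path, and input/output file names are
illustrative rather than our actual ones; the relevant parts are the -pe
request for 2 slots and the mpirun line:

  #!/bin/bash
  #$ -S /bin/bash
  #$ -N amber_2gpu
  #$ -cwd
  #$ -j y
  # Request 2 slots in the PE whose allocation_rule is $pe_slots,
  # so that both MPI ranks stay on one node (and its two GPUs).
  #$ -pe mpi 2

  # Illustrative Amber environment setup.
  export AMBERHOME=/opt/amber14
  source $AMBERHOME/amber.sh

  mpirun -np 2 $AMBERHOME/bin/pmemd.cuda.MPI -O \
      -i md.in -p prmtop -c inpcrd -o md.out -r md.rst -x md.nc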
In both cases, the output files confirm “Peer to Peer support: ENABLED”.
The difference shows up clearly in the timings section of the two runs:
In the first case,
|  Routine           Sec        %
|  ------------------------------
|  DataDistrib       0.03    0.06
|  Nonbond          36.62   83.68
|  Bond              0.00    0.00
|  Angle             0.00    0.00
|  Dihedral          0.00    0.00
|  Shake             0.08    0.18
|  RunMD             7.02   16.05
|  Other             0.01    0.03
|  ------------------------------
|  Total            43.76
In the second case,
|  Routine           Sec        %
|  ------------------------------
|  DataDistrib      27.04   27.21
|  Nonbond          66.06   66.49
|  Bond              0.00    0.00
|  Angle             0.00    0.00
|  Dihedral          0.00    0.00
|  Shake             0.04    0.04
|  RunMD             6.21    6.24
|  Other             0.01    0.01
|  ------------------------------
|  Total            99.36
Kind Regards,
Yin Wang
Theoretical Chemistry
Leopold-Franzens-Universität Innsbruck
Innrain 82, 6020 Innsbruck, Austria