Hello,
I am submitting GPU jobs on Forge
(https://www.xsede.org/web/guest/ncsa-forge) and have run some tests on a
protein/water system of 50,000 atoms. The commands are:
module load mvapich2-1.8a1p1-open64-4.5.1-cuda-4.1.28
mpirun_rsh -np ${NP} -hostfile ${PBS_NODEFILE} \
    /usr/apps/chemistry/Amber/amber11_1.5/bin/pmemd.cuda.MPI -O \
    -i prod01.in -p bm.prmtop -c eq2.rst -o prod01.out -r prod01.rst \
    -x prod01.mdcrd
Here are some results:
nodes    efficiency (ns/day)
1X8      16.44
2X8      16.47
3X8      16.07
4X8      15.17
1X6      17.98
2X6      19.41
3X6      20.13
4X6      19.70
5X6      19.62
6X6      19.03
10X6     18.33
It seems that the efficiency is not very high: the best result is 3X6, at
around 20.1 ns/day. Since I am going to run hundreds of ns, the simulations
would take a long time to finish.
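To put rough numbers on this, here is a small Python sketch that works only from the figures in the table. It rests on two assumptions of mine: that the first number in the "nodes" column is the quantity being scaled up, and that 100 ns is a representative target length (an illustrative value, since I have only said "hundreds of ns"). The function names are made up for this example.

```python
# Throughput (ns/day) copied from the table in this message.
# Keys are (first, second) numbers of the "nodes" column, e.g. 3X6 -> (3, 6).
results = {
    (1, 8): 16.44, (2, 8): 16.47, (3, 8): 16.07, (4, 8): 15.17,
    (1, 6): 17.98, (2, 6): 19.41, (3, 6): 20.13, (4, 6): 19.70,
    (5, 6): 19.62, (6, 6): 19.03, (10, 6): 18.33,
}


def scaling_efficiency(results, second):
    """Measured throughput divided by ideal linear scaling from the 1X<second> run.

    Assumption: the first number in each key is the count being scaled.
    """
    base = results[(1, second)]
    return {
        n: rate / (n * base)
        for (n, s), rate in sorted(results.items())
        if s == second
    }


def days_needed(target_ns, ns_per_day):
    """Wall-clock days to simulate target_ns at a given throughput."""
    return target_ns / ns_per_day


best = max(results, key=results.get)  # configuration with the highest ns/day
eff6 = scaling_efficiency(results, 6)

if __name__ == "__main__":
    print("best configuration:", best, results[best], "ns/day")
    for n, eff in eff6.items():
        print(f"{n}X6: {eff:.0%} of ideal linear scaling")
    # 100 ns is an illustrative target, not a figure from the tests above.
    print(f"100 ns at the best rate: {days_needed(100, results[best]):.1f} days")
```

Under those assumptions, going from 1X6 to 3X6 gains only about 12% in ns/day, so the extra resources buy very little, and each 100 ns would take roughly five days at the best measured rate.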
Does anybody have any idea how to improve the efficiency of these CUDA
runs?
Thank you very much.
Best,
Albert
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Tue Apr 17 2012 - 23:30:02 PDT