Dear supporters,
I'm testing AMBER14 on the EURORA supercomputer at Cineca, which has two
NVIDIA Tesla K20 GPUs per node. I launched my job with the following PBS script:
#!/bin/bash
#PBS -l walltime=30:00
#PBS -l select=1:ncpus=1:ngpus=1
#PBS -o job.out
#PBS -e job.err
#PBS -q debug
#PBS -A XXXXX
module load profile/advanced
module load autoload amber/14
nohup pmemd.cuda.MPI -O -i produc.in -o produc.out -p *.prmtop -c eq.rst -r prod.nc
This uses only one GPU and one CPU, and gives a performance of 45 ns/day on my
system. I then modified my script as follows:
#!/bin/bash
#PBS -l walltime=30:00
#PBS -l select=1:ncpus=2:ngpus=2
#PBS -o job.out
#PBS -e job.err
#PBS -q debug
#PBS -A LI03p_PADME
module load profile/advanced
module load autoload amber/14
time nohup mpirun -np 2 pmemd.cuda.MPI -O -i produc.in -o produc.out -p *.prmtop -c eq.rst -r prod.nc
This uses 2 GPUs in parallel on the same node. In this way I obtained a
performance of 52 ns/day, an improvement of only about 16% over the single-GPU
run (52/45 ≈ 1.16). Am I making a mistake? Is there a way to improve performance?
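To make the scaling arithmetic explicit, here is a minimal shell sketch that computes the speedup, relative gain, and parallel efficiency from the two throughput figures above. The function name scaling is hypothetical and not part of AMBER; it only assumes bash and awk are available. Measured against the single-GPU run, 52 vs 45 ns/day is a gain of about 15.6% (dividing the 7 ns/day difference by 52 instead gives about 13%).

```shell
#!/bin/bash
# Hypothetical helper (not an AMBER tool): scaling figures from two ns/day values.
#   $1 = ns/day on one GPU
#   $2 = ns/day on N GPUs
#   $3 = N (number of GPUs in the parallel run)
scaling() {
    awk -v one="$1" -v many="$2" -v n="$3" 'BEGIN {
        s = many / one                      # speedup relative to one GPU
        printf "speedup=%.2f gain=%.1f%% efficiency=%.1f%%\n",
               s, (s - 1) * 100, s / n * 100
    }'
}

# Figures reported above: 45 ns/day on 1 GPU, 52 ns/day on 2 GPUs.
scaling 45 52 2
```

With these numbers the two-GPU run reaches a parallel efficiency of roughly 58%, i.e. each GPU delivers a bit more than half of what the single GPU delivers on its own.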
Thanks in advance.
Sincerely,
Stefano Motta
Received on Fri Feb 27 2015 - 04:00:02 PST