[AMBER] JAC benchmark tests on K20X

From: Shan-ho Tsai <tsai.hal.physast.uga.edu>
Date: Mon, 20 May 2013 15:00:16 -0400 (EDT)

Dear All,

We have Amber12 with bugfixes 1 to 15 installed
with GPU support (gcc 4.4.7 and CUDA toolkit 4.2.9)
on our Linux cluster.

We ran the GPU benchmarks available at
http://ambermd.org/gpus/benchmarks.htm
on our K20X GPU cards and made the following
observations (tests run on one K20X card):

1. The 2 Cellulose tests and the 2 Factor_IX tests
performed comparably to the values reported
at the URL above. However, for a few days, the JAC
tests performed very poorly (one such run
is called run1 below). E.g. (ns/day values):

                      run1     run2     value_from_URLabove
JAC_PRODUCTION_NVE    12.64    81.19    89.13
JAC_PRODUCTION_NPT    60.35    67.93    71.80

These tests were run from a mounted file system and
our GPU cards have ECC turned on. That might
account for the slower timings in run2, but
run1 performed far worse than that.

2. Then we repeated the benchmark tests from a local
file system (hard disk on the host). The results of
all tests were consistent with the results reported
on the URL above.
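
To put the JAC numbers from observation 1 on a common scale, here is a
small sketch that expresses each run as a percentage of the value
published at the URL above. The ns/day figures are copied from the table;
the names (results, percent_of_reference, relative) are only illustrative.

```python
# ns/day values copied from the table in observation 1.
# "reference" is the value published on the benchmark page.
results = {
    "JAC_PRODUCTION_NVE": {"run1": 12.64, "run2": 81.19, "reference": 89.13},
    "JAC_PRODUCTION_NPT": {"run1": 60.35, "run2": 67.93, "reference": 71.80},
}


def percent_of_reference(value, reference):
    """Return a measured ns/day value as a percentage of the reference."""
    return 100.0 * value / reference


# Percentage of the published value reached by each run of each test.
relative = {
    test: {
        run: percent_of_reference(values[run], values["reference"])
        for run in ("run1", "run2")
    }
    for test, values in results.items()
}

for test, runs in relative.items():
    for run, pct in runs.items():
        print(f"{test:20s} {run}: {pct:5.1f}% of the published value")
```

On this scale run2 is within roughly 10% of the published values for both
tests, while run1 of the NVE test is far below them.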

Questions:
=================

1. Can a slow file system affect the JAC tests so much
more than the Cellulose and the Factor_IX tests?

2. Why are the timings reported in mdinfo and mdout
different?

For example, for run1 of the JAC_PRODUCTION_NVE test
mdinfo shows:

| Average timings for last 1000 steps:
| Elapsed(s) = 13.67 Per Step(ms) = 13.67
| ns/day = 12.64 seconds/ns = 6833.13
|
| Average timings for all steps:
| Elapsed(s) = 13.67 Per Step(ms) = 13.67
| ns/day = 12.64 seconds/ns = 6833.13



And mdout shows:

| Final Performance Info:
| -----------------------------------------------------
| Average timings for last 9000 steps:
| Elapsed(s) = 18.13 Per Step(ms) = 2.01
| ns/day = 85.77 seconds/ns = 1007.29
|
| Average timings for all steps:
| Elapsed(s) = 31.80 Per Step(ms) = 3.18
| ns/day = 54.34 seconds/ns = 1589.87
| -----------------------------------------------------

| Setup CPU time: 3.53 seconds
| NonSetup CPU time: 19.93 seconds
| Total CPU time: 23.46 seconds 0.01 hours

| Setup wall time: 18 seconds
| NonSetup wall time: 32 seconds
| Total wall time: 50 seconds 0.01 hours

Why are these two sets of timings so different for the same
run?
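
A quick arithmetic check on the figures above, offered only as a sketch:
the numbers are consistent with mdinfo covering just the first 1000 steps
while mdout covers the whole run, since 13.67 s + 18.13 s = 31.80 s. This
is one reading of the output, not a confirmed explanation. The total of
10000 steps is inferred from 31.80 s at 3.18 ms per step, and the time
step is inferred from the mdinfo figures rather than read from the input
file. The function name ns_per_day is illustrative.

```python
SECONDS_PER_DAY = 86400.0


def ns_per_day(elapsed_s, steps, dt_ps):
    """Simulated nanoseconds per wall-clock day for a timing window."""
    simulated_ns = steps * dt_ps / 1000.0  # ps -> ns
    return simulated_ns * SECONDS_PER_DAY / elapsed_s


# Infer the time step from mdinfo: (seconds/ns) / (seconds/step) = steps/ns.
steps_per_ns = 6833.13 / (13.67 / 1000.0)
dt_ps = round(1000.0 / steps_per_ns, 4)  # comes out at about 0.002 ps

# Timing windows taken from the two files (elapsed seconds, step count).
first_window = ns_per_day(13.67, 1000, dt_ps)   # mdinfo, "last 1000 steps"
later_window = ns_per_day(18.13, 9000, dt_ps)   # mdout, "last 9000 steps"
whole_run = ns_per_day(31.80, 10000, dt_ps)     # mdout, "all steps" (inferred count)

# If mdinfo saw only the first window, the two windows should add up.
windows_sum_s = 13.67 + 18.13

print(f"dt = {dt_ps} ps")
print(f"first 1000 steps: {first_window:.2f} ns/day")
print(f"last 9000 steps:  {later_window:.2f} ns/day")
print(f"all 10000 steps:  {whole_run:.2f} ns/day")
print(f"13.67 s + 18.13 s = {windows_sum_s:.2f} s")
```

The recomputed values agree with the reported 12.64, 85.77 and 54.34 ns/day
to within rounding, which would mean the slow part of run1 was confined to
the first 1000 steps.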

Thank you very much for any suggestions.

Regards,
Shan-Ho

-----------------------------
Shan-Ho Tsai
GACRC/EITS, University of Georgia, Athens GA


_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Mon May 20 2013 - 12:30:02 PDT