Re: [AMBER] RTX2080TI Performance

From: Ross Walker <ross.rosswalker.co.uk>
Date: Fri, 12 Oct 2018 10:27:59 -0400

I like it. :-)

On another note - the 4xRTX2080TIs all passed 50 repeats of the validation suite I have, as long as I only run two at a time in a box, so looks like they will be good to use for AMBER. :-) Now to figure out how best to cool them so we can have 4 or 8 in a single machine.

All the best
Ross

> On Oct 11, 2018, at 10:46, David Cerutti <dscerutti.gmail.com> wrote:
>
> "How's this for maxed out?"
>
> I think you just finally came up with our replacement for our website's
> "Insert clever motto here."
>
> Dave
>
>
> On Thu, Oct 11, 2018 at 7:20 AM Ross Walker <ross.rosswalker.co.uk> wrote:
>
>> Hi All,
>>
>> I finally have provisional numbers for RTX2080TI GPUs. I tested 4 GPUs
>> with the stock reference design coolers. The two central GPUs hit 93C
>> before dropping off the bus. Clearly 2080TI cooling is going to be a
>> challenge. I pulled the two middle GPUs and ran with 2 x 2080TI well spaced
>> out.
>>
>> The following are the AMBER 18 numbers with the initial Turing tweaks. Note
>> that these are still provisional. I still have to run the validation
>> tests to make sure the GPUs get the right answers. One thing is for sure,
>> we really are pushing the power envelope of these GPUs. How's this for
>> maxed out:
>>
>> Thu Oct 11 00:31:51 2018
>>
>> +-----------------------------------------------------------------------------+
>> | NVIDIA-SMI 410.57                 Driver Version: 410.57                    |
>> |-------------------------------+----------------------+----------------------+
>> | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
>> | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
>> |===============================+======================+======================|
>> |   0  GeForce RTX 208...  Off  | 00000000:1A:00.0 Off |                  N/A |
>> | 60%   81C    P2   259W / 260W |   1495MiB / 10989MiB |    100%      Default |
>> +-------------------------------+----------------------+----------------------+
>> |   1  GeForce RTX 208...  Off  | 00000000:68:00.0 Off |                  N/A |
>> | 31%   60C    P0     1W / 260W |      0MiB / 10989MiB |      0%      Default |
>> +-------------------------------+----------------------+----------------------+
>>
>> +-----------------------------------------------------------------------------+
>> | Processes:                                                       GPU Memory |
>> |  GPU       PID   Type   Process name                             Usage      |
>> |=============================================================================|
>> |    0     53398      C   /usr/local/amber18/bin/pmemd.cuda           1485MiB |
>> +-----------------------------------------------------------------------------+
>>
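
For anyone wanting to log the same temperature and power readings while a benchmark runs, here is a minimal Python sketch. It assumes nvidia-smi is on the PATH and uses its CSV query mode. The helper names, the 90C temperature threshold and the 95% power fraction are illustrative choices only; none of them come from AMBER or from the run above. The sample string reuses the readings from the nvidia-smi table above so the parser can be tried without a GPU.

```python
import csv
import io
import subprocess
from typing import Dict, List

# Fields requested from nvidia-smi's --query-gpu CSV mode.
QUERY_FIELDS = ["index", "temperature.gpu", "power.draw",
                "power.limit", "utilization.gpu"]


def parse_gpu_csv(text: str) -> List[Dict[str, float]]:
    """Parse 'csv,noheader,nounits' output into one dict per GPU.

    Assumes every field is numeric; a card that reports a field as
    unsupported would need extra handling.
    """
    rows = []
    for record in csv.reader(io.StringIO(text)):
        if not record:
            continue  # skip blank lines
        values = [float(v.strip()) for v in record]
        rows.append(dict(zip(QUERY_FIELDS, values)))
    return rows


def near_limits(rows, temp_c: float = 90.0, power_fraction: float = 0.95):
    """Return indices of GPUs that are hot or close to their power cap.

    The default thresholds are hypothetical, chosen for illustration.
    """
    return [int(r["index"]) for r in rows
            if r["temperature.gpu"] >= temp_c
            or r["power.draw"] >= power_fraction * r["power.limit"]]


def query_gpus() -> List[Dict[str, float]]:
    """Run nvidia-smi once and parse the result (needs an NVIDIA driver)."""
    cmd = ["nvidia-smi",
           "--query-gpu=" + ",".join(QUERY_FIELDS),
           "--format=csv,noheader,nounits"]
    return parse_gpu_csv(subprocess.check_output(cmd, text=True))


# Readings copied from the table above: index, temp (C), draw (W), cap (W), util (%).
SAMPLE = "0, 81, 259.00, 260.00, 100\n1, 60, 1.00, 260.00, 0\n"

if __name__ == "__main__":
    print(near_limits(parse_gpu_csv(SAMPLE)))
```

With the sample readings only GPU 0 is flagged, because 259W is above 95% of its 260W cap. On a machine with the cards installed, query_gpus() could be called in a loop instead of parsing SAMPLE.
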
>> And yes the benchmarks below are correct! The RTX2080TI is indeed
>> equivalent to the 8x more expensive V100 and we do indeed get over 1
>> microsecond per day for the DHFR 4fs benchmark. :-)
>>
>> New Benchmark Suite with thread count tweak for Turing
>>
>> System                          Class            CPU (1)  CPU (4)  GPU 0    GPU 1
>> ------------------------------- ---------------- -------- -------- -------- --------
>> JAC_production_NVE_4fs          PME (Standard)                       915.37   900.75
>> JAC_production_NPT_4fs          PME (Standard)                       846.79   836.68
>> Cellulose_production_NVE_4fs    PME (Standard)                        68.00    67.52
>> Cellulose_production_NPT_4fs    PME (Standard)                        64.75    64.58
>> FactorIX_production_NVE_4fs     PME (Standard)                       320.49   319.07
>> FactorIX_production_NPT_4fs     PME (Standard)                       299.55   295.82
>> STMV_production_NVE_4fs         PME (Standard)                        23.44    23.36
>> STMV_production_NPT_4fs         PME (Standard)                        21.96    21.87
>>
>> JAC_production_NVE_4fs          PME (Optimized)                     1033.45  1012.65
>> JAC_production_NPT_4fs          PME (Optimized)                      948.41   940.27
>> Cellulose_production_NVE_4fs    PME (Optimized)                       72.61    72.24
>> Cellulose_production_NPT_4fs    PME (Optimized)                       68.61    68.58
>> FactorIX_production_NVE_4fs     PME (Optimized)                      354.46   353.35
>> FactorIX_production_NPT_4fs     PME (Optimized)                      329.38   327.69
>> STMV_production_NVE_4fs         PME (Optimized)                       25.45    25.39
>> STMV_production_NPT_4fs         PME (Optimized)                       23.67    23.69
>>
>> TRPCage                         GB                                  2514.65  2521.09
>> myoglobin                       GB                                  1250.55  1251.71
>> nucleosome                      GB                                    30.15    29.93
>>
>>
>> Original Benchmark Suite with thread count tweak for Turing
>> JAC_PRODUCTION_NVE - 23,558 atoms PME 4fs
>> -----------------------------------------
>>
>> [0] 1 x GPU: | ns/day = 1017.27 seconds/ns = 84.93
>> [1] 1 x GPU: | ns/day = 1017.20 seconds/ns = 84.94
>> Multiple Single GPU Run Performance
>> [0] 1 x GPU: | ns/day = 1011.77 seconds/ns = 85.39
>> [1] 1 x GPU: | ns/day = 1005.58 seconds/ns = 85.92
>>
>> JAC_PRODUCTION_NPT - 23,558 atoms PME 4fs
>> -----------------------------------------
>>
>> [0] 1 x GPU: | ns/day = 940.66 seconds/ns = 91.85
>> [1] 1 x GPU: | ns/day = 947.24 seconds/ns = 91.21
>> Multiple Single GPU Run Performance
>> [0] 1 x GPU: | ns/day = 943.47 seconds/ns = 91.58
>> [1] 1 x GPU: | ns/day = 939.12 seconds/ns = 92.00
>>
>> JAC_PRODUCTION_NVE - 23,558 atoms PME 2fs
>> -----------------------------------------
>>
>> [0] 1 x GPU: | ns/day = 535.28 seconds/ns = 161.41
>> [1] 1 x GPU: | ns/day = 533.23 seconds/ns = 162.03
>> Multiple Single GPU Run Performance
>> [0] 1 x GPU: | ns/day = 536.59 seconds/ns = 161.02
>> [1] 1 x GPU: | ns/day = 527.42 seconds/ns = 163.82
>>
>> JAC_PRODUCTION_NPT - 23,558 atoms PME 2fs
>> -----------------------------------------
>>
>> [0] 1 x GPU: | ns/day = 485.28 seconds/ns = 178.04
>> [1] 1 x GPU: | ns/day = 486.15 seconds/ns = 177.72
>> Multiple Single GPU Run Performance
>> [0] 1 x GPU: | ns/day = 483.66 seconds/ns = 178.64
>> [1] 1 x GPU: | ns/day = 477.95 seconds/ns = 180.77
>>
>> FACTOR_IX_PRODUCTION_NVE - 90,906 atoms PME
>> -------------------------------------------
>>
>> [0] 1 x GPU: | ns/day = 195.32 seconds/ns = 442.36
>> [1] 1 x GPU: | ns/day = 199.12 seconds/ns = 433.91
>> Multiple Single GPU Run Performance
>> [0] 1 x GPU: | ns/day = 200.88 seconds/ns = 430.11
>> [1] 1 x GPU: | ns/day = 197.50 seconds/ns = 437.47
>>
>> FACTOR_IX_PRODUCTION_NPT - 90,906 atoms PME
>> -------------------------------------------
>>
>> [0] 1 x GPU: | ns/day = 182.97 seconds/ns = 472.21
>> [1] 1 x GPU: | ns/day = 184.14 seconds/ns = 469.21
>> Multiple Single GPU Run Performance
>> [0] 1 x GPU: | ns/day = 182.33 seconds/ns = 473.88
>> [1] 1 x GPU: | ns/day = 183.67 seconds/ns = 470.41
>>
>> CELLULOSE_PRODUCTION_NVE - 408,609 atoms PME
>> --------------------------------------------
>>
>> [0] 1 x GPU: | ns/day = 41.97 seconds/ns = 2058.82
>> [1] 1 x GPU: | ns/day = 41.79 seconds/ns = 2067.66
>> Multiple Single GPU Run Performance
>> [0] 1 x GPU: | ns/day = 41.77 seconds/ns = 2068.26
>> [1] 1 x GPU: | ns/day = 41.55 seconds/ns = 2079.29
>>
>> CELLULOSE_PRODUCTION_NPT - 408,609 atoms PME
>> --------------------------------------------
>>
>> [0] 1 x GPU: | ns/day = 39.28 seconds/ns = 2199.86
>> [1] 1 x GPU: | ns/day = 39.34 seconds/ns = 2196.28
>> Multiple Single GPU Run Performance
>> [0] 1 x GPU: | ns/day = 39.23 seconds/ns = 2202.29
>> [1] 1 x GPU: | ns/day = 39.03 seconds/ns = 2213.82
>>
>> STMV_PRODUCTION_NPT - 1,067,095 atoms PME
>> -----------------------------------------
>>
>> [0] 1 x GPU: | ns/day = 24.39 seconds/ns = 3542.69
>> [1] 1 x GPU: | ns/day = 24.38 seconds/ns = 3543.43
>> Multiple Single GPU Run Performance
>> [0] 1 x GPU: | ns/day = 24.32 seconds/ns = 3552.87
>> [1] 1 x GPU: | ns/day = 24.17 seconds/ns = 3574.82
>>
>> TRPCAGE_PRODUCTION - 304 atoms GB
>> ---------------------------------
>>
>> [0] 1 x GPU: | ns/day = 1151.69 seconds/ns = 75.02
>> [1] 1 x GPU: | ns/day = 1133.17 seconds/ns = 76.25
>> Multiple Single GPU Run Performance
>> [0] 1 x GPU: | ns/day = 1124.11 seconds/ns = 76.86
>> [1] 1 x GPU: | ns/day = 1184.88 seconds/ns = 72.92
>>
>> MYOGLOBIN_PRODUCTION - 2,492 atoms GB
>> -------------------------------------
>>
>> [0] 1 x GPU: | ns/day = 563.33 seconds/ns = 153.37
>> [1] 1 x GPU: | ns/day = 501.07 seconds/ns = 172.43
>> Multiple Single GPU Run Performance
>> [0] 1 x GPU: | ns/day = 545.35 seconds/ns = 158.43
>> [1] 1 x GPU: | ns/day = 551.93 seconds/ns = 156.54
>>
>> NUCLEOSOME_PRODUCTION - 25,095 atoms GB
>> ---------------------------------------
>>
>> [0] 1 x GPU: | ns/day = 15.52 seconds/ns = 5568.28
>> [1] 1 x GPU: | ns/day = 15.35 seconds/ns = 5628.69
>> Multiple Single GPU Run Performance
>> [0] 1 x GPU: | ns/day = 15.40 seconds/ns = 5609.17
>> [1] 1 x GPU: | ns/day = 15.24 seconds/ns = 5669.86
>>
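
As a quick sanity check on the units in the listings above: seconds/ns is just the number of seconds in a day divided by ns/day, and 1000 ns/day is 1 microsecond per day. A minimal Python sketch, with function names made up for illustration:

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86400


def seconds_per_ns(ns_per_day: float) -> float:
    """Wall-clock seconds needed to simulate one nanosecond."""
    return SECONDS_PER_DAY / ns_per_day


def microseconds_per_day(ns_per_day: float) -> float:
    """Convert throughput from ns/day to microseconds/day."""
    return ns_per_day / 1000.0


if __name__ == "__main__":
    # JAC NVE 4fs, GPU 0, original suite: reported as 84.93 seconds/ns.
    print(round(seconds_per_ns(1017.27), 2))
    # JAC NVE 4fs, GPU 0, optimized PME in the new suite.
    print(microseconds_per_day(1033.45))
```

The reported ns/day values are already rounded to two decimals, so recomputed seconds/ns figures can differ from the listed ones in the last digits, most visibly for the large systems.
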
>>
>> All the best
>> Ross
>>
>>
>>
>> _______________________________________________
>> AMBER mailing list
>> AMBER.ambermd.org
>> http://lists.ambermd.org/mailman/listinfo/amber
>>


Received on Fri Oct 12 2018 - 07:30:03 PDT