Re: [AMBER] Provisional RTX2080 AMBER 18 Performance Numbers

From: Ross Walker <ross.rosswalker.co.uk>
Date: Wed, 20 Nov 2019 09:53:46 -0700

Hi Charles,

Yes, that's to be expected. Most calculations will no longer scale to multiple GPUs. The GPUs are simply too fast for the interconnect between them to keep up. The exception is large-scale implicit solvent (GB) calculations. Note that NVLINK still won't help, since the FFTs still have to be done on a single GPU.
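
As a quick sanity check of what your hardware actually supports, you can ask the driver for the interconnect topology (this is standard nvidia-smi; the exact matrix depends on your system):

    nvidia-smi topo -m    # GPU-to-GPU link matrix; NVLINK-bridged pairs show up as NV1/NV2, everything else goes over PCIe/host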

The design of pmemd on GPUs is such that the CPU is not required (except for I/O), so you can run multiple calculations on different GPUs in the same server without slowdown. Thus you can run two calculations of the same system from different starting structures or random seeds, or you can run completely independent simulations, REMD runs (which have very little communication between the individual simulations), TI windows, etc. See the following page for information on how to control which GPU is used.

http://ambermd.org/GPUHowTo.php
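
In practice this comes down to pinning each job to a single device with CUDA_VISIBLE_DEVICES. A minimal sketch (the directory and file names here are hypothetical):

    # Two independent pmemd.cuda runs, one per GPU, on the same server
    cd run1
    CUDA_VISIBLE_DEVICES=0 $AMBERHOME/bin/pmemd.cuda -O -i md.in -p prmtop \
        -c seed1.rst -o run1.out -r run1.rst -x run1.nc &
    cd ../run2
    CUDA_VISIBLE_DEVICES=1 $AMBERHOME/bin/pmemd.cuda -O -i md.in -p prmtop \
        -c seed2.rst -o run2.out -r run2.rst -x run2.nc &
    wait    # both runs proceed at full single-GPU speed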

All the best
Ross

> On Nov 19, 2019, at 3:49 PM, Charles-Alexandre Mattelaer <camattelaer01.gmail.com> wrote:
>
> Hi all
>
> I noticed that the multi-GPU (P2P) runs had their performance reduced by
> 1/3-2/3, e.g. (JAC_PRODUCTION_NVE):
> "[7] 1 x GPU: | ns/day = 731.18 seconds/ns = 118.17
> P2P 2 x GPU: | ns/day = 293.81 seconds/ns = 294.07"
> (This is also consistent across the other simulations.)
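> (For reference, the two columns are reciprocals: seconds/ns = 86400 / (ns/day),
> e.g. 86400 / 731.18 = 118.17.)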
>
> If I interpret this correctly, the run went from 731 ns/day on a single GPU
> to 293 ns/day in multi-GPU mode. I searched online, and it appears RTX cards
> only support P2P through NVLINK bridges. This additionally implies that in
> a PC with 8 GPUs you could only create 4 pools of 2 cards each that can
> communicate with each other.
>
> Can this problem be overcome for simulations like REMD calculations, which
> depend on that P2P access and where more than 2 cards are necessary?
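>
> (For reference, I mean REMD runs launched through MPI with a groupfile, one
> replica per GPU, along the lines of this hypothetical 8-replica sketch:
>
>     mpirun -np 8 $AMBERHOME/bin/pmemd.cuda.MPI -ng 8 -groupfile remd.groupfile
>
> with per-rank GPU selection, if needed, via CUDA_VISIBLE_DEVICES.)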
>
> Kind regards
>
> Charles-Alexandre Mattelaer
>
> On Sat, 6 Oct 2018 at 23:03, Ross Walker <ross.rosswalker.co.uk> wrote:
>
>> Hi All,
>>
>> So the bad news is that even the blower-style cards, at least the ones
>> I've tried from Zotac and PNY, suffer from the same overheating issues.
>>
>>
>> http://www.pny.com/GeForce-RTX-2080-8GB-Blower?sku=VCG20808BLMPB&CURRENT_NAV_ID=0ced1441-b66e-42b4-b29e-58c173c8b6f2
>>
>> Some of it comes from the fact that the back of the card itself is actually
>> blocked, and the rest from the fact that the Turing architecture just looks
>> to be very inefficient from a power perspective, so these cards run really
>> hot. I think that's partly why NVIDIA upped the temperature limit for these
>> cards from 83C to 87C.
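>>
>> A simple way to watch this while a benchmark runs (standard nvidia-smi
>> query fields, logged every 10 seconds):
>>
>>     nvidia-smi --query-gpu=index,fan.speed,temperature.gpu,power.draw --format=csv -l 10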
>>
>> The good news is that it looks like one can get the blower-style cards to
>> behave in 4- and 8-way configs by engineering some modifications to the
>> cooling system. The following are the benchmarks for a 2U 4 x RTX2080 box
>> from Exxact with some custom modifications.
>>
>> Multiple Single GPU Run Performance (4 independent runs at the same time)
>> [0] 1 x GPU: | ns/day = 751.76 seconds/ns = 114.93
>> [1] 1 x GPU: | ns/day = 757.96 seconds/ns = 113.99
>> [2] 1 x GPU: | ns/day = 756.00 seconds/ns = 114.29
>> [3] 1 x GPU: | ns/day = 746.54 seconds/ns = 115.73
>>
>> Compare that with what one sees with the stock cooling:
>> Multiple Single GPU Run Performance
>> [0] 1 x GPU: | ns/day = 761.27 seconds/ns = 113.49
>> [1] 1 x GPU: | ns/day = 676.54 seconds/ns = 127.71
>> [2] 1 x GPU: | ns/day = 649.64 seconds/ns = 133.00
>> [3] 1 x GPU: | ns/day = 441.83 seconds/ns = 195.55
>>
>> The temperature profiles look much better as well, with the card fans
>> staying under 60%.
>>
>> Sat Oct 6 20:59:28 2018
>> +-----------------------------------------------------------------------------+
>> | NVIDIA-SMI 410.57                 Driver Version: 410.57                     |
>> |-------------------------------+----------------------+----------------------+
>> | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
>> | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
>> |===============================+======================+======================|
>> |   0  GeForce RTX 2080    Off  | 00000000:3B:00.0 Off |                  N/A |
>> | 41%   75C    P2   194W / 215W |    281MiB /  7952MiB |     93%      Default |
>> +-------------------------------+----------------------+----------------------+
>> |   1  GeForce RTX 2080    Off  | 00000000:5E:00.0 Off |                  N/A |
>> | 51%   83C    P2   187W / 215W |    281MiB /  7952MiB |     93%      Default |
>> +-------------------------------+----------------------+----------------------+
>> |   2  GeForce RTX 2080    Off  | 00000000:AF:00.0 Off |                  N/A |
>> | 47%   81C    P2   191W / 215W |    281MiB /  7952MiB |     94%      Default |
>> +-------------------------------+----------------------+----------------------+
>> |   3  GeForce RTX 2080    Off  | 00000000:D8:00.0 Off |                  N/A |
>> | 43%   76C    P2   190W / 215W |    281MiB /  7952MiB |     93%      Default |
>> +-------------------------------+----------------------+----------------------+
>>
>> Similar modifications are working for 8-way GPU boxes with ducted
>> cooling as well.
>>
>> Multiple Single GPU Run Performance
>> [0] 1 x GPU: | ns/day = 740.50 seconds/ns = 116.68
>> [1] 1 x GPU: | ns/day = 735.07 seconds/ns = 117.54
>> [2] 1 x GPU: | ns/day = 732.33 seconds/ns = 117.98
>> [3] 1 x GPU: | ns/day = 740.79 seconds/ns = 116.63
>> [4] 1 x GPU: | ns/day = 735.91 seconds/ns = 117.41
>> [5] 1 x GPU: | ns/day = 727.62 seconds/ns = 118.74
>> [6] 1 x GPU: | ns/day = 740.51 seconds/ns = 116.68
>> [7] 1 x GPU: | ns/day = 731.18 seconds/ns = 118.17
>>
>> I've included the full benchmark suite for an 8 way box with custom
>> cooling below.
>>
>> So the good news is that there is at least a solution for 2U and 4U
>> rack-mount systems. We are going to work on developing a solution for
>> 4-GPU workstations next, so it looks like it will be possible to have 4-
>> and 8-way systems with 2080s. The same solution should work for 2080TIs as
>> well. Exxact are now in the process of validating their cooling solution -
>> running it through hundreds of cycles to make sure that it is reliable and
>> that they are comfortable warrantying 4- and 8-way 2080 and 2080TI systems
>> with it.
>>
>> More info to follow on desktop solutions shortly.
>>
>> All the best
>> Ross
>>
>> Full 8 x RTX2080, custom cooling, AMBER 18 benchmarks (provisional, based
>> on unoptimized code).
>>
>> JAC_PRODUCTION_NVE - 23,558 atoms PME 4fs
>> -----------------------------------------
>>
>> [0] 1 x GPU: | ns/day = 746.03 seconds/ns = 115.81
>> [1] 1 x GPU: | ns/day = 738.50 seconds/ns = 116.99
>> [2] 1 x GPU: | ns/day = 736.95 seconds/ns = 117.24
>> [3] 1 x GPU: | ns/day = 746.34 seconds/ns = 115.76
>> [4] 1 x GPU: | ns/day = 742.80 seconds/ns = 116.32
>> [5] 1 x GPU: | ns/day = 741.27 seconds/ns = 116.56
>> [6] 1 x GPU: | ns/day = 752.41 seconds/ns = 114.83
>> [7] 1 x GPU: | ns/day = 747.56 seconds/ns = 115.58
>> Multiple Single GPU Run Performance
>> [0] 1 x GPU: | ns/day = 740.50 seconds/ns = 116.68
>> [1] 1 x GPU: | ns/day = 735.07 seconds/ns = 117.54
>> [2] 1 x GPU: | ns/day = 732.33 seconds/ns = 117.98
>> [3] 1 x GPU: | ns/day = 740.79 seconds/ns = 116.63
>> [4] 1 x GPU: | ns/day = 735.91 seconds/ns = 117.41
>> [5] 1 x GPU: | ns/day = 727.62 seconds/ns = 118.74
>> [6] 1 x GPU: | ns/day = 740.51 seconds/ns = 116.68
>> [7] 1 x GPU: | ns/day = 731.18 seconds/ns = 118.17
>> P2P 2 x GPU: | ns/day = 293.81 seconds/ns = 294.07
>>
>> JAC_PRODUCTION_NPT - 23,558 atoms PME 4fs
>> -----------------------------------------
>>
>> [0] 1 x GPU: | ns/day = 685.88 seconds/ns = 125.97
>> [1] 1 x GPU: | ns/day = 674.67 seconds/ns = 128.06
>> [2] 1 x GPU: | ns/day = 684.28 seconds/ns = 126.26
>> [3] 1 x GPU: | ns/day = 681.65 seconds/ns = 126.75
>> [4] 1 x GPU: | ns/day = 690.80 seconds/ns = 125.07
>> [5] 1 x GPU: | ns/day = 673.36 seconds/ns = 128.31
>> [6] 1 x GPU: | ns/day = 704.70 seconds/ns = 122.61
>> [7] 1 x GPU: | ns/day = 695.67 seconds/ns = 124.20
>> Multiple Single GPU Run Performance
>> [0] 1 x GPU: | ns/day = 678.35 seconds/ns = 127.37
>> [1] 1 x GPU: | ns/day = 672.89 seconds/ns = 128.40
>> [2] 1 x GPU: | ns/day = 676.27 seconds/ns = 127.76
>> [3] 1 x GPU: | ns/day = 672.19 seconds/ns = 128.54
>> [4] 1 x GPU: | ns/day = 679.72 seconds/ns = 127.11
>> [5] 1 x GPU: | ns/day = 671.59 seconds/ns = 128.65
>> [6] 1 x GPU: | ns/day = 693.96 seconds/ns = 124.50
>> [7] 1 x GPU: | ns/day = 677.78 seconds/ns = 127.48
>> P2P 2 x GPU: | ns/day = 279.56 seconds/ns = 309.05
>>
>> JAC_PRODUCTION_NVE - 23,558 atoms PME 2fs
>> -----------------------------------------
>>
>> [0] 1 x GPU: | ns/day = 389.66 seconds/ns = 221.73
>> [1] 1 x GPU: | ns/day = 384.71 seconds/ns = 224.59
>> [2] 1 x GPU: | ns/day = 387.89 seconds/ns = 222.74
>> [3] 1 x GPU: | ns/day = 389.16 seconds/ns = 222.02
>> [4] 1 x GPU: | ns/day = 390.47 seconds/ns = 221.27
>> [5] 1 x GPU: | ns/day = 387.86 seconds/ns = 222.76
>> [6] 1 x GPU: | ns/day = 397.82 seconds/ns = 217.18
>> [7] 1 x GPU: | ns/day = 393.03 seconds/ns = 219.83
>> Multiple Single GPU Run Performance
>> [0] 1 x GPU: | ns/day = 391.46 seconds/ns = 220.71
>> [1] 1 x GPU: | ns/day = 386.02 seconds/ns = 223.82
>> [2] 1 x GPU: | ns/day = 385.94 seconds/ns = 223.87
>> [3] 1 x GPU: | ns/day = 390.32 seconds/ns = 221.36
>> [4] 1 x GPU: | ns/day = 388.60 seconds/ns = 222.34
>> [5] 1 x GPU: | ns/day = 385.23 seconds/ns = 224.28
>> [6] 1 x GPU: | ns/day = 393.18 seconds/ns = 219.75
>> [7] 1 x GPU: | ns/day = 386.90 seconds/ns = 223.31
>> P2P 2 x GPU: | ns/day = 134.53 seconds/ns = 642.25
>>
>> JAC_PRODUCTION_NPT - 23,558 atoms PME 2fs
>> -----------------------------------------
>>
>> [0] 1 x GPU: | ns/day = 350.16 seconds/ns = 246.75
>> [1] 1 x GPU: | ns/day = 346.32 seconds/ns = 249.48
>> [2] 1 x GPU: | ns/day = 351.08 seconds/ns = 246.10
>> [3] 1 x GPU: | ns/day = 350.59 seconds/ns = 246.44
>> [4] 1 x GPU: | ns/day = 355.18 seconds/ns = 243.26
>> [5] 1 x GPU: | ns/day = 347.68 seconds/ns = 248.51
>> [6] 1 x GPU: | ns/day = 363.04 seconds/ns = 237.99
>> [7] 1 x GPU: | ns/day = 358.99 seconds/ns = 240.67
>> Multiple Single GPU Run Performance
>> [0] 1 x GPU: | ns/day = 350.74 seconds/ns = 246.34
>> [1] 1 x GPU: | ns/day = 346.90 seconds/ns = 249.06
>> [2] 1 x GPU: | ns/day = 347.75 seconds/ns = 248.45
>> [3] 1 x GPU: | ns/day = 346.19 seconds/ns = 249.57
>> [4] 1 x GPU: | ns/day = 349.31 seconds/ns = 247.34
>> [5] 1 x GPU: | ns/day = 344.34 seconds/ns = 250.91
>> [6] 1 x GPU: | ns/day = 355.19 seconds/ns = 243.25
>> [7] 1 x GPU: | ns/day = 347.95 seconds/ns = 248.31
>> P2P 2 x GPU: | ns/day = 127.09 seconds/ns = 679.81
>>
>> FACTOR_IX_PRODUCTION_NVE - 90,906 atoms PME
>> -------------------------------------------
>>
>> [0] 1 x GPU: | ns/day = 130.56 seconds/ns = 661.75
>> [1] 1 x GPU: | ns/day = 129.90 seconds/ns = 665.15
>> [2] 1 x GPU: | ns/day = 129.49 seconds/ns = 667.21
>> [3] 1 x GPU: | ns/day = 130.94 seconds/ns = 659.84
>> [4] 1 x GPU: | ns/day = 127.92 seconds/ns = 675.41
>> [5] 1 x GPU: | ns/day = 130.37 seconds/ns = 662.73
>> [6] 1 x GPU: | ns/day = 130.11 seconds/ns = 664.05
>> [7] 1 x GPU: | ns/day = 131.02 seconds/ns = 659.42
>> Multiple Single GPU Run Performance
>> [0] 1 x GPU: | ns/day = 128.70 seconds/ns = 671.31
>> [1] 1 x GPU: | ns/day = 129.89 seconds/ns = 665.17
>> [2] 1 x GPU: | ns/day = 128.84 seconds/ns = 670.59
>> [3] 1 x GPU: | ns/day = 128.47 seconds/ns = 672.55
>> [4] 1 x GPU: | ns/day = 129.99 seconds/ns = 664.68
>> [5] 1 x GPU: | ns/day = 129.69 seconds/ns = 666.21
>> [6] 1 x GPU: | ns/day = 131.34 seconds/ns = 657.84
>> [7] 1 x GPU: | ns/day = 128.03 seconds/ns = 674.83
>> P2P 2 x GPU: | ns/day = 40.02 seconds/ns = 2159.07
>>
>> FACTOR_IX_PRODUCTION_NPT - 90,906 atoms PME
>> -------------------------------------------
>>
>> [0] 1 x GPU: | ns/day = 121.37 seconds/ns = 711.88
>> [1] 1 x GPU: | ns/day = 120.97 seconds/ns = 714.23
>> [2] 1 x GPU: | ns/day = 120.10 seconds/ns = 719.41
>> [3] 1 x GPU: | ns/day = 121.36 seconds/ns = 711.92
>> [4] 1 x GPU: | ns/day = 121.04 seconds/ns = 713.82
>> [5] 1 x GPU: | ns/day = 121.58 seconds/ns = 710.64
>> [6] 1 x GPU: | ns/day = 119.98 seconds/ns = 720.10
>> [7] 1 x GPU: | ns/day = 121.99 seconds/ns = 708.25
>> Multiple Single GPU Run Performance
>> [0] 1 x GPU: | ns/day = 120.80 seconds/ns = 715.24
>> [1] 1 x GPU: | ns/day = 120.36 seconds/ns = 717.83
>> [2] 1 x GPU: | ns/day = 119.14 seconds/ns = 725.18
>> [3] 1 x GPU: | ns/day = 120.51 seconds/ns = 716.96
>> [4] 1 x GPU: | ns/day = 120.38 seconds/ns = 717.71
>> [5] 1 x GPU: | ns/day = 118.56 seconds/ns = 728.76
>> [6] 1 x GPU: | ns/day = 121.09 seconds/ns = 713.54
>> [7] 1 x GPU: | ns/day = 118.88 seconds/ns = 726.80
>> P2P 2 x GPU: | ns/day = 37.35 seconds/ns = 2313.31
>>
>> CELLULOSE_PRODUCTION_NVE - 408,609 atoms PME
>> --------------------------------------------
>>
>> [0] 1 x GPU: | ns/day = 26.50 seconds/ns = 3260.14
>> [1] 1 x GPU: | ns/day = 26.35 seconds/ns = 3278.32
>> [2] 1 x GPU: | ns/day = 26.34 seconds/ns = 3279.82
>> [3] 1 x GPU: | ns/day = 26.50 seconds/ns = 3260.27
>> [4] 1 x GPU: | ns/day = 26.48 seconds/ns = 3263.39
>> [5] 1 x GPU: | ns/day = 26.47 seconds/ns = 3264.55
>> [6] 1 x GPU: | ns/day = 26.71 seconds/ns = 3234.66
>> [7] 1 x GPU: | ns/day = 26.61 seconds/ns = 3247.45
>> Multiple Single GPU Run Performance
>> [0] 1 x GPU: | ns/day = 26.45 seconds/ns = 3266.38
>> [1] 1 x GPU: | ns/day = 26.27 seconds/ns = 3289.26
>> [2] 1 x GPU: | ns/day = 26.33 seconds/ns = 3281.14
>> [3] 1 x GPU: | ns/day = 26.50 seconds/ns = 3260.78
>> [4] 1 x GPU: | ns/day = 26.37 seconds/ns = 3276.69
>> [5] 1 x GPU: | ns/day = 26.35 seconds/ns = 3278.93
>> [6] 1 x GPU: | ns/day = 26.54 seconds/ns = 3254.91
>> [7] 1 x GPU: | ns/day = 26.31 seconds/ns = 3284.12
>> P2P 2 x GPU: | ns/day = 9.87 seconds/ns = 8752.51
>>
>> CELLULOSE_PRODUCTION_NPT - 408,609 atoms PME
>> --------------------------------------------
>>
>> [0] 1 x GPU: | ns/day = 24.91 seconds/ns = 3467.82
>> [1] 1 x GPU: | ns/day = 24.83 seconds/ns = 3479.78
>> [2] 1 x GPU: | ns/day = 24.76 seconds/ns = 3489.80
>> [3] 1 x GPU: | ns/day = 24.90 seconds/ns = 3470.09
>> [4] 1 x GPU: | ns/day = 24.87 seconds/ns = 3474.54
>> [5] 1 x GPU: | ns/day = 24.86 seconds/ns = 3475.68
>> [6] 1 x GPU: | ns/day = 25.12 seconds/ns = 3440.00
>> [7] 1 x GPU: | ns/day = 24.90 seconds/ns = 3470.55
>> Multiple Single GPU Run Performance
>> [0] 1 x GPU: | ns/day = 24.72 seconds/ns = 3495.02
>> [1] 1 x GPU: | ns/day = 24.64 seconds/ns = 3506.48
>> [2] 1 x GPU: | ns/day = 24.74 seconds/ns = 3492.48
>> [3] 1 x GPU: | ns/day = 24.78 seconds/ns = 3486.64
>> [4] 1 x GPU: | ns/day = 24.84 seconds/ns = 3478.01
>> [5] 1 x GPU: | ns/day = 24.71 seconds/ns = 3497.10
>> [6] 1 x GPU: | ns/day = 24.87 seconds/ns = 3474.52
>> [7] 1 x GPU: | ns/day = 24.68 seconds/ns = 3500.98
>> P2P 2 x GPU: | ns/day = 9.39 seconds/ns = 9197.06
>>
>> STMV_PRODUCTION_NPT - 1,067,095 atoms PME
>> -----------------------------------------
>>
>> [0] 1 x GPU: | ns/day = 15.81 seconds/ns = 5465.66
>> [1] 1 x GPU: | ns/day = 15.70 seconds/ns = 5502.43
>> [2] 1 x GPU: | ns/day = 15.81 seconds/ns = 5466.32
>> [3] 1 x GPU: | ns/day = 15.80 seconds/ns = 5468.46
>> [4] 1 x GPU: | ns/day = 15.89 seconds/ns = 5437.10
>> [5] 1 x GPU: | ns/day = 15.73 seconds/ns = 5491.43
>> [6] 1 x GPU: | ns/day = 16.09 seconds/ns = 5370.90
>> [7] 1 x GPU: | ns/day = 15.95 seconds/ns = 5415.91
>> Multiple Single GPU Run Performance
>> [0] 1 x GPU: | ns/day = 15.79 seconds/ns = 5470.36
>> [1] 1 x GPU: | ns/day = 15.74 seconds/ns = 5488.62
>> [2] 1 x GPU: | ns/day = 15.78 seconds/ns = 5473.78
>> [3] 1 x GPU: | ns/day = 15.74 seconds/ns = 5490.69
>> [4] 1 x GPU: | ns/day = 15.84 seconds/ns = 5454.22
>> [5] 1 x GPU: | ns/day = 15.67 seconds/ns = 5513.70
>> [6] 1 x GPU: | ns/day = 16.01 seconds/ns = 5395.54
>> [7] 1 x GPU: | ns/day = 15.80 seconds/ns = 5467.60
>> P2P 2 x GPU: | ns/day = 5.54 seconds/ns = 15605.90
>>
>> TRPCAGE_PRODUCTION - 304 atoms GB
>> ---------------------------------
>>
>> [0] 1 x GPU: | ns/day = 1132.06 seconds/ns = 76.32
>> [1] 1 x GPU: | ns/day = 1118.26 seconds/ns = 77.26
>> [2] 1 x GPU: | ns/day = 1095.00 seconds/ns = 78.90
>> [3] 1 x GPU: | ns/day = 1151.93 seconds/ns = 75.00
>> [4] 1 x GPU: | ns/day = 1110.95 seconds/ns = 77.77
>> [5] 1 x GPU: | ns/day = 1132.09 seconds/ns = 76.32
>> [6] 1 x GPU: | ns/day = 1115.33 seconds/ns = 77.47
>> [7] 1 x GPU: | ns/day = 1163.87 seconds/ns = 74.24
>> Multiple Single GPU Run Performance
>> [0] 1 x GPU: | ns/day = 1059.02 seconds/ns = 81.59
>> [1] 1 x GPU: | ns/day = 1068.63 seconds/ns = 80.85
>> [2] 1 x GPU: | ns/day = 1068.58 seconds/ns = 80.86
>> [3] 1 x GPU: | ns/day = 1112.72 seconds/ns = 77.65
>> [4] 1 x GPU: | ns/day = 1103.23 seconds/ns = 78.32
>> [5] 1 x GPU: | ns/day = 1127.52 seconds/ns = 76.63
>> [6] 1 x GPU: | ns/day = 1105.55 seconds/ns = 78.15
>> [7] 1 x GPU: | ns/day = 1127.24 seconds/ns = 76.65
>>
>> MYOGLOBIN_PRODUCTION - 2,492 atoms GB
>> -------------------------------------
>>
>> [0] 1 x GPU: | ns/day = 475.00 seconds/ns = 181.90
>> [1] 1 x GPU: | ns/day = 463.83 seconds/ns = 186.27
>> [2] 1 x GPU: | ns/day = 449.39 seconds/ns = 192.26
>> [3] 1 x GPU: | ns/day = 480.88 seconds/ns = 179.67
>> [4] 1 x GPU: | ns/day = 455.23 seconds/ns = 189.80
>> [5] 1 x GPU: | ns/day = 452.17 seconds/ns = 191.08
>> [6] 1 x GPU: | ns/day = 465.24 seconds/ns = 185.71
>> [7] 1 x GPU: | ns/day = 492.00 seconds/ns = 175.61
>> Multiple Single GPU Run Performance
>> [0] 1 x GPU: | ns/day = 463.47 seconds/ns = 186.42
>> [1] 1 x GPU: | ns/day = 459.21 seconds/ns = 188.15
>> [2] 1 x GPU: | ns/day = 478.18 seconds/ns = 180.68
>> [3] 1 x GPU: | ns/day = 454.72 seconds/ns = 190.01
>> [4] 1 x GPU: | ns/day = 451.66 seconds/ns = 191.29
>> [5] 1 x GPU: | ns/day = 470.24 seconds/ns = 183.74
>> [6] 1 x GPU: | ns/day = 491.28 seconds/ns = 175.87
>> [7] 1 x GPU: | ns/day = 465.40 seconds/ns = 185.65
>> P2P 2 x GPU: | ns/day = 322.11 seconds/ns = 268.23
>>
>> NUCLEOSOME_PRODUCTION - 25,095 atoms GB
>> ---------------------------------------
>>
>> [0] 1 x GPU: | ns/day = 10.79 seconds/ns = 8005.01
>> [1] 1 x GPU: | ns/day = 10.79 seconds/ns = 8009.92
>> [2] 1 x GPU: | ns/day = 10.83 seconds/ns = 7979.11
>> [3] 1 x GPU: | ns/day = 10.83 seconds/ns = 7974.37
>> [4] 1 x GPU: | ns/day = 10.90 seconds/ns = 7927.45
>> [5] 1 x GPU: | ns/day = 10.81 seconds/ns = 7992.27
>> [6] 1 x GPU: | ns/day = 11.08 seconds/ns = 7800.67
>> [7] 1 x GPU: | ns/day = 10.90 seconds/ns = 7923.23
>> Multiple Single GPU Run Performance
>> [0] 1 x GPU: | ns/day = 10.81 seconds/ns = 7994.29
>> [1] 1 x GPU: | ns/day = 10.83 seconds/ns = 7979.59
>> [2] 1 x GPU: | ns/day = 10.81 seconds/ns = 7995.05
>> [3] 1 x GPU: | ns/day = 10.86 seconds/ns = 7956.97
>> [4] 1 x GPU: | ns/day = 10.83 seconds/ns = 7974.77
>> [5] 1 x GPU: | ns/day = 10.79 seconds/ns = 8004.61
>> [6] 1 x GPU: | ns/day = 11.07 seconds/ns = 7807.26
>> [7] 1 x GPU: | ns/day = 10.83 seconds/ns = 7975.91
>> P2P 2 x GPU: | ns/day = 17.69 seconds/ns = 4883.43
>> P2P 4 x GPU: | ns/day = 26.90 seconds/ns = 3211.48
>> P2P 8 x GPU: | ns/day = 34.05 seconds/ns = 2537.16
>> Multiple 2xGPU Run Performance
>> [0,1] 2 x GPU: | ns/day =      17.75  seconds/ns =   4867.76
>> [2,3] 2 x GPU: | ns/day =      18.27  seconds/ns =   4728.22
>> [4,5] 2 x GPU: | ns/day =      17.18  seconds/ns =   5030.17
>> [6,7] 2 x GPU: | ns/day =      18.26  seconds/ns =   4730.82
>>
>>
>>> On Oct 5, 2018, at 09:39, Ross Walker <ross.rosswalker.co.uk> wrote:
>>>
>>> Hi Marek,
>>>
>>> My expectation for the 2080TI is ~25 to 30% over the 1080TI. I should have
>>> some firm numbers in about 2 to 3 weeks. Note the cooling design is the
>>> same, so the founders/reference design cards will have the same issues with
>>> multiple cards in a box as the RTX2080 and will need custom cooling
>>> solutions.
>>>
>>> In terms of performance per dollar, things do not look good due to
>>> NVIDIA's price inflation. You are looking at a 70+% increase in price for a
>>> ~30% increase in performance, an increase one historically always got for
>>> free with each new generation of hardware, so I would not exactly call the
>>> RTX2080TI a technological improvement over the 1080TI. In real terms it is
>>> a step backwards by around 4 years.
>>>
>>> All the best
>>> Ross
>>>
>>>> On Oct 5, 2018, at 09:17, Marek Maly <marek.maly.ujep.cz> wrote:
>>>>
>>>> Hi Ross,
>>>>
>>>> thanks a lot for these first RTX2080/Amber18 tests and the important
>>>> technical comments.
>>>>
>>>> I think your "1080Ti vs 2080/Amber18" results are in rather good agreement
>>>> with the comparison of these two GPUs at
>>>>
>>>> https://www.videocardbenchmark.net/high_end_gpus.html
>>>>
>>>> But what I really look forward to (also because we are planning some HW
>>>> updates ...) are Amber18 benchmarks with the "RTX 2080 Ti", which seems to
>>>> show a bigger difference compared to the 1080 Ti; let's also hope that the
>>>> cooling of this current "GTX top model" will be a bit more satisfying...
>>>>
>>>> When do you think provisional "2080 Ti/Amber18" benchmarks could be
>>>> available?
>>>>
>>>> Best wishes,
>>>>
>>>> Marek
>>>>
>>>>
>>>> On Fri, 05 Oct 2018 at 14:32:07 +0200, Ross Walker <ross.rosswalker.co.uk>
>>>> wrote:
>>>>
>>>>> TL;DR: NVIDIA RTX2080 works with AMBER 18, gives the correct answers in
>>>>> provisional tests, and gets performance equivalent to a 1080TI as long
>>>>> as you don't put more than 2 in a box without some kind of custom
>>>>> cooling solution. NVIDIA price inflation is alive and kicking, so perf
>>>>> per dollar is down ~15%.
>>>>>
>>>>> Dear Amberites
>>>>>
>>>>> I have finally managed to get my hands on some reference design RTX2080
>>>>> GPUs (another month or so for the 2080TI) and had a chance to test them
>>>>> with AMBER 18. First impressions are that the reference design cooler
>>>>> for the RTX series is crap.
>>>>>
>>>>> Unless there is a big space (at least 1 PCI-E slot, but ideally 2)
>>>>> between cards, the cards massively overheat, even when running their
>>>>> fans at 100%, which causes them to significantly downclock. The
>>>>> following is an example with 4 cards running at once:
>>>>>
>>>>> Thu Oct 4 21:08:08 2018
>>>>> +-----------------------------------------------------------------------------+
>>>>> | NVIDIA-SMI 410.57                 Driver Version: 410.57                     |
>>>>> |-------------------------------+----------------------+----------------------+
>>>>> | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
>>>>> | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
>>>>> |===============================+======================+======================|
>>>>> |   0  GeForce RTX 2080    On   | 00000000:19:00.0 Off |                  N/A |
>>>>> | 42%   76C    P2   178W / 215W |    281MiB /  7952MiB |     95%      Default |
>>>>> +-------------------------------+----------------------+----------------------+
>>>>> |   1  GeForce RTX 2080    On   | 00000000:1A:00.0 Off |                  N/A |
>>>>> | 90%   87C    P2   112W / 215W |    281MiB /  7952MiB |     96%      Default |
>>>>> +-------------------------------+----------------------+----------------------+
>>>>> |   2  GeForce RTX 2080    On   | 00000000:67:00.0 Off |                  N/A |
>>>>> | 93%   87C    P2   100W / 215W |    281MiB /  7952MiB |     96%      Default |
>>>>> +-------------------------------+----------------------+----------------------+
>>>>> |   3  GeForce RTX 2080    On   | 00000000:68:00.0 Off |                  N/A |
>>>>> |100%   87C    P2    74W / 215W |    281MiB /  7951MiB |     97%      Default |
>>>>> +-------------------------------+----------------------+----------------------+
>>>>> Note the top card runs well, maintaining 76C at 178W with the fan at
>>>>> just 42%. The remaining cards are close to 100% fan speed, thermally
>>>>> limited at 87C and only drawing ~100W. That means they have clocked down
>>>>> significantly and are still overheating. I am working with Exxact to
>>>>> engineer a solution and I am confident we can get these working in 4 and
>>>>> 8 x GPU configs, but for the time being, if you are building your own
>>>>> machines, do not put more than 2 of these in a box and make sure you
>>>>> space them out. PNY are going to make more traditional blower-design
>>>>> versions of the RTX GPUs, which hopefully will not have this cooling
>>>>> issue. I should have a chance to test some of those in a few weeks.
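>>>>>
>>>>> A quick way to confirm the downclocking is to compare the current SM
>>>>> clock against its maximum (standard nvidia-smi query fields):
>>>>>
>>>>>     nvidia-smi --query-gpu=index,clocks.sm,clocks.max.sm,temperature.gpu --format=csv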
>>>>>
>>>>> The good news is that the AMBER 18 test cases, and the validation suites
>>>>> I have, all pass. Performance of the RTX2080 is on par with the 1080TI
>>>>> (as long as you have the cards spaced out or have an auxiliary cooling
>>>>> solution), which is what history, and ratioing the flop counts, would
>>>>> have us expect.
>>>>>
>>>>> Note these are provisional numbers for the 2080. Performance may improve
>>>>> some once optimizations for the SM7.5 hardware have been made, although
>>>>> I wouldn't expect any miracles.
>>>>>
>>>>> JAC_PRODUCTION_NVE - 23,558 atoms PME 4fs
>>>>> -----------------------------------------
>>>>>
>>>>> 2080    1 x GPU: | ns/day =     761.01  seconds/ns =    113.53
>>>>> 1080TI  1 x GPU: | ns/day =     776.83  seconds/ns =    111.22
>>>>>
>>>>> JAC_PRODUCTION_NPT - 23,558 atoms PME 4fs
>>>>> -----------------------------------------
>>>>>
>>>>> 2080    1 x GPU: | ns/day =     713.93  seconds/ns =    121.02
>>>>> 1080TI  1 x GPU: | ns/day =     733.55  seconds/ns =    117.78
>>>>>
>>>>> JAC_PRODUCTION_NVE - 23,558 atoms PME 2fs
>>>>> -----------------------------------------
>>>>>
>>>>> 2080    1 x GPU: | ns/day =     399.97  seconds/ns =    216.02
>>>>> 1080TI  1 x GPU: | ns/day =     409.26  seconds/ns =    211.11
>>>>>
>>>>> JAC_PRODUCTION_NPT - 23,558 atoms PME 2fs
>>>>> -----------------------------------------
>>>>>
>>>>> 2080    1 x GPU: | ns/day =     367.69  seconds/ns =    234.98
>>>>> 1080TI  1 x GPU: | ns/day =     377.10  seconds/ns =    229.12
>>>>>
>>>>> FACTOR_IX_PRODUCTION_NVE - 90,906 atoms PME
>>>>> -------------------------------------------
>>>>>
>>>>> 2080    1 x GPU: | ns/day =     130.48  seconds/ns =    662.19
>>>>> 1080TI  1 x GPU: | ns/day =     121.95  seconds/ns =    708.48
>>>>>
>>>>> FACTOR_IX_PRODUCTION_NPT - 90,906 atoms PME
>>>>> -------------------------------------------
>>>>>
>>>>> 2080    1 x GPU: | ns/day =     123.86  seconds/ns =    697.55
>>>>> 1080TI  1 x GPU: | ns/day =     113.46  seconds/ns =    761.48
>>>>>
>>>>> CELLULOSE_PRODUCTION_NVE - 408,609 atoms PME
>>>>> --------------------------------------------
>>>>>
>>>>> 2080    1 x GPU: | ns/day =      26.73  seconds/ns =   3232.72
>>>>> 1080TI  1 x GPU: | ns/day =      26.14  seconds/ns =   3305.84
>>>>>
>>>>> CELLULOSE_PRODUCTION_NPT - 408,609 atoms PME
>>>>> --------------------------------------------
>>>>>
>>>>> 2080    1 x GPU: | ns/day =      25.30  seconds/ns =   3414.99
>>>>> 1080TI  1 x GPU: | ns/day =      24.72  seconds/ns =   3495.08
>>>>>
>>>>> STMV_PRODUCTION_NPT - 1,067,095 atoms PME
>>>>> -----------------------------------------
>>>>>
>>>>> 2080    1 x GPU: | ns/day =      16.17  seconds/ns =   5344.36
>>>>> 1080TI  1 x GPU: | ns/day =      15.09  seconds/ns =   5727.20
>>>>>
>>>>> TRPCAGE_PRODUCTION - 304 atoms GB
>>>>> ---------------------------------
>>>>>
>>>>> 2080    1 x GPU: | ns/day =    1191.46  seconds/ns =     72.52
>>>>> 1080TI  1 x GPU: | ns/day =    1301.05  seconds/ns =     66.41
>>>>>
>>>>> MYOGLOBIN_PRODUCTION - 2,492 atoms GB
>>>>> -------------------------------------
>>>>>
>>>>> 2080    1 x GPU: | ns/day =     490.54  seconds/ns =    176.13
>>>>> 1080TI  1 x GPU: | ns/day =     448.95  seconds/ns =    192.45
>>>>>
>>>>> NUCLEOSOME_PRODUCTION - 25,095 atoms GB
>>>>> ---------------------------------------
>>>>>
>>>>> 2080    1 x GPU: | ns/day =      10.89  seconds/ns =   7935.83
>>>>> 1080TI  1 x GPU: | ns/day =      10.14  seconds/ns =   8520.32
>>>>>
>>>>> Note that if you compare against 4 GPUs with no additional cooling, the
>>>>> clocking down of the 2080s is obvious.
>>>>>
>>>>> JAC_PRODUCTION_NVE - 23,558 atoms PME 4fs - Running 4 independent
>>>>> calculations at once.
>>>>>
>>>>> 2080
>>>>> [0]  1 x GPU: | ns/day =     761.27  seconds/ns =    113.49
>>>>> [1]  1 x GPU: | ns/day =     676.54  seconds/ns =    127.71
>>>>> [2]  1 x GPU: | ns/day =     649.64  seconds/ns =    133.00
>>>>> [3]  1 x GPU: | ns/day =     441.83  seconds/ns =    195.55
>>>>>
>>>>> 1080TI
>>>>> [0]  1 x GPU: | ns/day =     776.57  seconds/ns =    111.26
>>>>> [1]  1 x GPU: | ns/day =     779.75  seconds/ns =    110.81
>>>>> [2]  1 x GPU: | ns/day =     773.42  seconds/ns =    111.71
>>>>> [3]  1 x GPU: | ns/day =     747.29  seconds/ns =    115.62
>>>>>
>>>>> So in summary:
>>>>>
>>>>> The 2080 works with AMBER 18, gives the correct answers in provisional
>>>>> tests and gets performance equivalent to a 1080TI, as long as you don't
>>>>> put more than 2 in a box without some kind of custom cooling solution.
>>>>>
>>>>> Of course this is 'modern' NVIDIA, so price inflation is the name of the
>>>>> game: while the performance matches, the performance per dollar is
>>>>> significantly worse. The 1080TI Founders MSRP was $699, the 2080 Founders
>>>>> MSRP is $799, so performance per $ has decreased by approximately 15%.
>>>>>
>>>>> All the best
>>>>> Ross
>>>>>
>>>>
>>>>
>>>> --
>>>> Created with Opera's mail application: http://www.opera.com/mail/
>>>
>>

_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber