Hi Amberites.
I finally got my hands on some RTX4090 GPUs. They are serious power hogs, pulling 450W each, so two of them coupled with a 64-core Threadripper CPU gets you close to what a typical 120V, 15A US circuit can supply. Note that they are also 4 slots wide and 14 inches long, so they only fit in full tower cases.
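To put rough numbers on the power concern, here is a minimal sketch of the arithmetic. The 450W per GPU and the 120V, 15A circuit come from the figures above; the 280W CPU figure (the published TDP of the Threadripper PRO 5995WX) and the 80% continuous-load margin are assumptions added for illustration, and the estimate ignores PSU losses, memory, drives and fans.

```python
# Back-of-the-envelope power budget for a dual RTX4090 workstation.
# From the post: 450 W per GPU, two GPUs, a 120 V / 15 A circuit.
# Assumptions (not from the post): 280 W CPU TDP, 80% continuous-load margin.

GPU_WATTS = 450
NUM_GPUS = 2
CPU_TDP_WATTS = 280               # assumption: published TDP of the 5995WX
CIRCUIT_VOLTS = 120
CIRCUIT_AMPS = 15
CONTINUOUS_LOAD_FRACTION = 0.8    # assumption: common rule of thumb


def circuit_capacity_watts(volts, amps):
    """Nominal power a circuit can deliver: P = V * I."""
    return volts * amps


def estimated_draw_watts(gpu_watts, num_gpus, cpu_watts):
    """GPU plus CPU draw only; ignores PSU losses, RAM, drives and fans."""
    return gpu_watts * num_gpus + cpu_watts


capacity = circuit_capacity_watts(CIRCUIT_VOLTS, CIRCUIT_AMPS)
continuous_limit = capacity * CONTINUOUS_LOAD_FRACTION
draw = estimated_draw_watts(GPU_WATTS, NUM_GPUS, CPU_TDP_WATTS)

print(f"Nominal circuit capacity: {capacity} W")
print(f"Continuous-load limit:    {continuous_limit:.0f} W")
print(f"GPUs + CPU estimate:      {draw} W")
print(f"Headroom under the limit: {continuous_limit - draw:.0f} W")
```

Under these assumptions the GPUs and CPU alone account for 1180W of a nominal 1800W, before counting the rest of the system or anything else plugged into the same circuit.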
Here are the initial benchmarks I got for Amber 22 out of the box (with minor modifications to cmake for CUDA 11.8). I've included 3090 and A100 numbers for comparison. It should be possible, especially for the smaller test cases, to get some improvement from Lovelace-specific optimizations, but this is what you will get out of the box for now.
A more comprehensive list of benchmarks by GPU model is available here: https://www.exxactcorp.com/blog/Molecular-Dynamics/RTX3090-Benchmarks-for-HPC-AMBER22-A100-vs-RTX3080-vs-RTX3070-vs-RTX6000
System spec.
1 x AMD Ryzen Threadripper PRO 5995WX 64-Cores
2 x Zotac RTX4090 GPUs
CUDA 11.8.r11.8/compiler.31833905_0
Driver 520.61.05
Amber 22 + AmberTools 22 with updates as of Oct 14th 2022
JAC_PRODUCTION_NVE - 23,558 atoms PME 4fs
-----------------------------------------
CPU code 64 cores: | ns/day =  154.28
1 x 4090 GPU:      | ns/day = 1638.75
1 x 3090 GPU:      | ns/day = 1196.50
1 x A100 GPU:      | ns/day = 1199.22

JAC_PRODUCTION_NPT - 23,558 atoms PME 4fs
-----------------------------------------
CPU code 64 cores: | ns/day =  154.19
1 x 4090 GPU:      | ns/day = 1618.45
1 x 3090 GPU:      | ns/day = 1157.76
1 x A100 GPU:      | ns/day = 1194.50

JAC_PRODUCTION_NVE - 23,558 atoms PME 2fs
-----------------------------------------
CPU code 64 cores: | ns/day =   84.71
1 x 4090 GPU:      | ns/day =  883.23
1 x 3090 GPU:      | ns/day =  632.19
1 x A100 GPU:      | ns/day =  611.08

JAC_PRODUCTION_NPT - 23,558 atoms PME 2fs
-----------------------------------------
CPU code 64 cores: | ns/day =   80.79
1 x 4090 GPU:      | ns/day =  842.69
1 x 3090 GPU:      | ns/day =  595.28
1 x A100 GPU:      | ns/day =  610.09
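As a side note on reading the JAC numbers, ns/day mixes two things: how many MD steps the card completes per second and how long each step is. The sketch below converts the 4090 NVE figures above into steps per second so the 4fs and 2fs runs can be compared directly. The helper name `steps_per_second` is purely illustrative and not part of Amber.

```python
# Convert ns/day into MD steps per second for a given timestep.
# The ns/day values are the 1 x 4090 JAC_PRODUCTION_NVE rows from the post;
# the helper name is hypothetical.

SECONDS_PER_DAY = 86_400
FS_PER_NS = 1_000_000


def steps_per_second(ns_per_day, timestep_fs):
    """Number of integration steps completed per wall-clock second."""
    fs_per_day = ns_per_day * FS_PER_NS
    return fs_per_day / timestep_fs / SECONDS_PER_DAY


# timestep in fs -> reported ns/day on one RTX4090 (JAC NVE)
jac_nve_4090 = {4: 1638.75, 2: 883.23}

rates = {dt: steps_per_second(ns, dt) for dt, ns in jac_nve_4090.items()}
throughput_ratio = jac_nve_4090[4] / jac_nve_4090[2]

for dt, rate in rates.items():
    print(f"{dt} fs timestep: {rate:.0f} steps/s")
print(f"4 fs vs 2 fs ns/day ratio: {throughput_ratio:.2f}")
```

The step rates come out fairly close to each other, which is why doubling the timestep gives a little under twice the ns/day for this system.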
FACTOR_IX_PRODUCTION_NVE - 90,906 atoms PME
-------------------------------------------
CPU code 64 cores: | ns/day =   19.98
1 x 4090 GPU:      | ns/day =  466.44
1 x 3090 GPU:      | ns/day =  264.78
1 x A100 GPU:      | ns/day =  271.36

FACTOR_IX_PRODUCTION_NPT - 90,906 atoms PME
-------------------------------------------
CPU code 64 cores: | ns/day =   20.80
1 x 4090 GPU:      | ns/day =  433.24
1 x 3090 GPU:      | ns/day =  248.65
1 x A100 GPU:      | ns/day =  252.87

CELLULOSE_PRODUCTION_NVE - 408,609 atoms PME
--------------------------------------------
CPU code 64 cores: | ns/day =    4.03
1 x 4090 GPU:      | ns/day =  129.63
1 x 3090 GPU:      | ns/day =   63.23
1 x A100 GPU:      | ns/day =   85.23

CELLULOSE_PRODUCTION_NPT - 408,609 atoms PME
--------------------------------------------
CPU code 64 cores: | ns/day =    3.83
1 x 4090 GPU:      | ns/day =  119.04
1 x 3090 GPU:      | ns/day =   58.30
1 x A100 GPU:      | ns/day =   77.98

STMV_PRODUCTION_NPT - 1,067,095 atoms PME
-----------------------------------------
CPU code 64 cores: | ns/day =    2.07
1 x 4090 GPU:      | ns/day =   78.90
1 x 3090 GPU:      | ns/day =   38.65
1 x A100 GPU:      | ns/day =   52.02

TRPCAGE_PRODUCTION - 304 atoms GB
---------------------------------
CPU code 64 cores: | ns/day = too many cores
1 x 4090 GPU:      | ns/day = 1482.22
1 x 3090 GPU:      | ns/day = 1225.53
1 x A100 GPU:      | ns/day = 1040.61

MYOGLOBIN_PRODUCTION - 2,492 atoms GB
-------------------------------------
CPU code 64 cores: | ns/day =   46.18
1 x 4090 GPU:      | ns/day =  929.62
1 x 3090 GPU:      | ns/day =  621.73
1 x A100 GPU:      | ns/day =  661.22

NUCLEOSOME_PRODUCTION - 25,095 atoms GB
---------------------------------------
CPU code 64 cores: | ns/day =    0.92
1 x 4090 GPU:      | ns/day =   36.90
1 x 3090 GPU:      | ns/day =   21.08
1 x A100 GPU:      | ns/day =   29.66
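For anyone who wants the relative speedups instead of raw ns/day, the following sketch computes the 4090 to 3090 and 4090 to A100 ratios from the GPU rows above. The numbers are copied from the tables in this post; the dictionary layout and function name are just one illustrative way to organize them.

```python
# Relative speedup of the RTX4090 over the 3090 and A100, per benchmark.
# Values are ns/day copied from the tables in this post, in the order
# (4090, 3090, A100). The layout and names here are illustrative only.

NS_PER_DAY = {
    "JAC_PRODUCTION_NVE_4fs":   (1638.75, 1196.50, 1199.22),
    "JAC_PRODUCTION_NPT_4fs":   (1618.45, 1157.76, 1194.50),
    "JAC_PRODUCTION_NVE_2fs":   (883.23, 632.19, 611.08),
    "JAC_PRODUCTION_NPT_2fs":   (842.69, 595.28, 610.09),
    "FACTOR_IX_PRODUCTION_NVE": (466.44, 264.78, 271.36),
    "FACTOR_IX_PRODUCTION_NPT": (433.24, 248.65, 252.87),
    "CELLULOSE_PRODUCTION_NVE": (129.63, 63.23, 85.23),
    "CELLULOSE_PRODUCTION_NPT": (119.04, 58.30, 77.98),
    "STMV_PRODUCTION_NPT":      (78.90, 38.65, 52.02),
    "TRPCAGE_PRODUCTION":       (1482.22, 1225.53, 1040.61),
    "MYOGLOBIN_PRODUCTION":     (929.62, 621.73, 661.22),
    "NUCLEOSOME_PRODUCTION":    (36.90, 21.08, 29.66),
}


def speedups(table):
    """Map each benchmark to (4090 / 3090, 4090 / A100) throughput ratios."""
    return {
        name: (rtx4090 / rtx3090, rtx4090 / a100)
        for name, (rtx4090, rtx3090, a100) in table.items()
    }


ratios = speedups(NS_PER_DAY)

print(f"{'Benchmark':<26} {'vs 3090':>8} {'vs A100':>8}")
for name, (vs_3090, vs_a100) in ratios.items():
    print(f"{name:<26} {vs_3090:>7.2f}x {vs_a100:>7.2f}x")
```

The gap over the 3090 grows with system size, from roughly 1.4x on JAC to about 2x on Cellulose and STMV.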
All the best
Ross
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Fri Oct 14 2022 - 09:00:02 PDT