Hi Vincenzo,
Following up on this: I managed to benchmark an RTX 6000 ADA, which is roughly equivalent to the L40. The numbers are below and are pretty close to what I predicted. It more or less all comes down to memory bandwidth and the achievable power limit (TDP).
JAC_PRODUCTION_NVE - 23,558 atoms PME 4fs
-----------------------------------------
1 x H100 GPU: | ns/day = 1479.32
1 x 4090 GPU: | ns/day = 1638.75
1 x 6000ADA GPU: | ns/day = 1585.07
JAC_PRODUCTION_NPT - 23,558 atoms PME 4fs
-----------------------------------------
1 x H100 GPU: | ns/day = 1424.90
1 x 4090 GPU: | ns/day = 1618.45
1 x 6000ADA GPU: | ns/day = 1537.63
JAC_PRODUCTION_NVE - 23,558 atoms PME 2fs
-----------------------------------------
1 x H100 GPU: | ns/day = 779.95
1 x 4090 GPU: | ns/day = 883.23
1 x 6000ADA GPU: | ns/day = 845.71
JAC_PRODUCTION_NPT - 23,558 atoms PME 2fs
-----------------------------------------
1 x H100 GPU: | ns/day = 741.10
1 x 4090 GPU: | ns/day = 842.69
1 x 6000ADA GPU: | ns/day = 806.03
FACTOR_IX_PRODUCTION_NVE - 90,906 atoms PME
-------------------------------------------
1 x H100 GPU: | ns/day = 389.18
1 x 4090 GPU: | ns/day = 466.44
1 x 6000ADA GPU: | ns/day = 447.05
FACTOR_IX_PRODUCTION_NPT - 90,906 atoms PME
-------------------------------------------
1 x H100 GPU: | ns/day = 357.88
1 x 4090 GPU: | ns/day = 433.24
1 x 6000ADA GPU: | ns/day = 420.95
CELLULOSE_PRODUCTION_NVE - 408,609 atoms PME
--------------------------------------------
1 x H100 GPU: | ns/day = 119.27
1 x 4090 GPU: | ns/day = 129.63
1 x 6000ADA GPU: | ns/day = 119.10
CELLULOSE_PRODUCTION_NPT - 408,609 atoms PME
--------------------------------------------
1 x H100 GPU: | ns/day = 108.91
1 x 4090 GPU: | ns/day = 119.04
1 x 6000ADA GPU: | ns/day = 110.92
STMV_PRODUCTION_NPT - 1,067,095 atoms PME
-----------------------------------------
1 x H100 GPU: | ns/day = 70.15
1 x 4090 GPU: | ns/day = 78.90
1 x 6000ADA GPU: | ns/day = 71.68
TRPCAGE_PRODUCTION - 304 atoms GB
---------------------------------
1 x H100 GPU: | ns/day = 1413.28
1 x 4090 GPU: | ns/day = 1482.22
1 x 6000ADA GPU: | ns/day = 1393.89
MYOGLOBIN_PRODUCTION - 2,492 atoms GB
-------------------------------------
1 x H100 GPU: | ns/day = 1094.48
1 x 4090 GPU: | ns/day = 929.62
1 x 6000ADA GPU: | ns/day = 951.43
NUCLEOSOME_PRODUCTION - 25,095 atoms GB
---------------------------------------
1 x H100 GPU: | ns/day = 37.68
1 x 4090 GPU: | ns/day = 36.90
1 x 6000ADA GPU: | ns/day = 33.85
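
To make the gaps easier to read than raw ns/day, here is a minimal Python sketch that expresses each card as a percentage difference from the 4090. The ns/day values are copied from the listing above; the short benchmark labels and the function name relative_to are made up for illustration and are not Amber identifiers.

```python
# ns/day values copied from the benchmark listing above.
# The short benchmark labels are illustrative, not Amber identifiers.
RESULTS = {
    "JAC_NVE_4fs":   {"H100": 1479.32, "4090": 1638.75, "6000ADA": 1585.07},
    "JAC_NPT_4fs":   {"H100": 1424.90, "4090": 1618.45, "6000ADA": 1537.63},
    "JAC_NVE_2fs":   {"H100": 779.95,  "4090": 883.23,  "6000ADA": 845.71},
    "JAC_NPT_2fs":   {"H100": 741.10,  "4090": 842.69,  "6000ADA": 806.03},
    "FACTOR_IX_NVE": {"H100": 389.18,  "4090": 466.44,  "6000ADA": 447.05},
    "FACTOR_IX_NPT": {"H100": 357.88,  "4090": 433.24,  "6000ADA": 420.95},
    "CELLULOSE_NVE": {"H100": 119.27,  "4090": 129.63,  "6000ADA": 119.10},
    "CELLULOSE_NPT": {"H100": 108.91,  "4090": 119.04,  "6000ADA": 110.92},
    "STMV_NPT":      {"H100": 70.15,   "4090": 78.90,   "6000ADA": 71.68},
    "TRPCAGE_GB":    {"H100": 1413.28, "4090": 1482.22, "6000ADA": 1393.89},
    "MYOGLOBIN_GB":  {"H100": 1094.48, "4090": 929.62,  "6000ADA": 951.43},
    "NUCLEOSOME_GB": {"H100": 37.68,   "4090": 36.90,   "6000ADA": 33.85},
}


def relative_to(results, baseline="4090"):
    """Return {benchmark: {gpu: percent difference in ns/day vs. baseline}}."""
    out = {}
    for name, row in results.items():
        base = row[baseline]
        out[name] = {
            gpu: 100.0 * (value / base - 1.0)
            for gpu, value in row.items()
            if gpu != baseline
        }
    return out


if __name__ == "__main__":
    # Negative numbers mean slower than the 4090, positive mean faster.
    for name, row in relative_to(RESULTS).items():
        cells = ", ".join(f"{gpu}: {pct:+.1f}%" for gpu, pct in row.items())
        print(f"{name:14s} {cells}")
```

On the JAC NVE 4fs run, for example, this puts the 6000 ADA roughly 3% behind the 4090. Passing baseline="H100" gives the same comparison against the H100 instead.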
All the best
Ross
> On Mar 17, 2023, at 11:03, Ross Walker <ross.rosswalker.co.uk> wrote:
>
> Hi Vincenzo,
>
> The L40 GPU is effectively a passively cooled version of the RTX 6000 ADA. Here are the specs of each:
>
> L40: https://images.nvidia.com/content/Solutions/data-center/vgpu-L40-datasheet.pdf
> RTX6000 ADA: https://www.nvidia.com/content/dam/en-zz/Solutions/design-visualization/rtx-6000/proviz-print-rtx6000-datasheet-web-2504660.pdf
>
> From this you can see that the L40 is essentially a slightly underclocked (noticeably on memory), passively cooled, but higher priced, version of RTX6000 ADA.
>
> The key numbers are:
>
>             Mem bw      FP32 Tflops   CUDA cores   TDP    MSRP
> L40         864 GB/s    90.5          18,176       300W   ~$7600
> 6000 ADA    960 GB/s    91.1          18,176       300W   ~$6800
> RTX4090     1008 GB/s   82.6          16,384       450W   ~$1599
>
> I haven't had a chance to test the RTX6000 ADA or L40 yet, but based on the numbers above I expect the performance of the L40 to be roughly equivalent to the 6000 ADA, and for both to be a few % slower than the 4090. The main reason is that while the L40/6000 ADA have higher FP32 flops on paper, they have lower memory bandwidth and a lower max power limit. The latter tends to be a big influence on performance these days: as soon as the GPU hits the power limit (or temperature limit), which it easily can, the boost clocks get scaled back to stay within the power cap. Hence you don't get the full performance that the specs suggest.
>
> My 0.02btc would be to wait until the 4000, 4500, 5000 and 5500 versions of the RTX ADA come out. While not quite as fast, they will have significantly better price/performance than the L40 or 6000 ADA which are, in my opinion, priced ridiculously. If you need to upgrade today, and money is an object, your best bet is probably to go with the A5500, which is available at a fairly big discount due to being end of line. Or, if you like living on the edge, go with 2-slot-wide 4090s, although these are grey market and you'll have to promise not to tell NVIDIA who you got them from. ;-)
>
> All the best
> Ross
>
>> On Mar 17, 2023, at 06:45, Vincenzo D'Amore via AMBER <amber.ambermd.org> wrote:
>>
>>
>> To whom it may concern:
>>
>> Hello, my name is Vincenzo D'Amore and I am a researcher at the Department of Pharmacy of the University of Naples Federico II.
>>
>> We are planning an upgrade to our local HPC system and are thinking of buying the new NVIDIA L40 GPUs (released in November 2022). Our choice is driven by the information provided by NVIDIA, which describes the L40 as the most powerful graphics card for FP32 calculations (outperforming even the RTX4090).
>>
>> So, I was wondering if you have already had the opportunity to test this GPU, or have any preliminary data about its performance in Amber calculations.
>>
>> Looking forward to hearing your reply,
>>
>> Kindest regards,
>>
>> Vincenzo D'Amore
>>
>>
>> _______________________________________________
>> AMBER mailing list
>> AMBER.ambermd.org
>> https://urldefense.com/v3/__http://lists.ambermd.org/mailman/listinfo/amber__;!!Mih3wA!Ca2E27mwI5jR1YHErR2Rjck25qvNvDxMJD4Kedpgy1CztwRyx19lYNj0NWqOlZinS2tnf9qPAKo$
>
Received on Tue Mar 21 2023 - 13:30:03 PDT