Re: [AMBER] Multi-GPU Bug in Amber20 from James Kress on 2022-01-11 (Amber Archive Jan 2022)

From: James Kress <jimkress_58.kressworks.org>
Date: Tue, 11 Jan 2022 22:34:47 -0500

OK. Thanks.

Jim

-----Original Message-----
From: David A Case <david.case.rutgers.edu>
Sent: Tuesday, January 11, 2022 9:16 PM
To: jimkress_58.kressworks.org; AMBER Mailing List <amber.ambermd.org>
Subject: Re: [AMBER] Multi-GPU Bug in Amber20

On Tue, Jan 11, 2022, James Kress wrote:
>
>I noticed another oddity in the GPU output section.
>
>------------------- GPU DEVICE INFO --------------------
>|
>| CUDA_VISIBLE_DEVICES: 3
>| CUDA Capable Devices Detected: 1
>| CUDA Device ID in use: 0
>| CUDA Device Name: NVIDIA GeForce RTX 3090
>| CUDA Device Global Mem Size: 24268 MB
>| CUDA Device Num Multiprocessors: 82
>| CUDA Device Core Freq: 1.70 GHz
>|
>|--------------------------------------------------------
>
>While trying to benchmark each GPU individually I noticed this apparent
>anomaly in the output.
>
>I monitored the Amber pmemd.cuda process using nvidia-smi. The GPU 3
>was the only active GPU. I had set CUDA_VISIBLE_DEVICES=3 and Amber
>picks that up OK. However, the device ID in use by Amber is specified
>as 0. Shouldn't that be 3?

No. The code only sees "visible" GPUs, i.e. those in the
CUDA_VISIBLE_DEVICES list. So a CUDA Device ID of 0 means it is using the
first GPU that is visible to it. This happens to correspond to device 3 in
the list that nvidia-smi will provide.
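
As a rough illustration of that renumbering, here is a minimal Python sketch.
The function name physical_gpu_index is hypothetical and is not part of Amber
or CUDA. It assumes CUDA_VISIBLE_DEVICES holds a comma-separated list of
integer indices (not GPU UUIDs), and that the CUDA enumeration order matches
what nvidia-smi shows, which may require CUDA_DEVICE_ORDER=PCI_BUS_ID on
machines with mixed GPU models.

```python
def physical_gpu_index(cuda_device_id, visible_devices):
    """Translate a device ID reported by a CUDA program into the index
    of the same GPU in the unrestricted (nvidia-smi style) list.

    cuda_device_id  -- ID as seen by the program, e.g. 0
    visible_devices -- value of CUDA_VISIBLE_DEVICES, or None if unset

    Assumption: the variable contains only integer indices.
    """
    if visible_devices is None:
        # Variable unset: every GPU is visible and nothing is renumbered.
        return cuda_device_id

    # The program numbers the visible GPUs 0, 1, 2, ... in the order
    # in which they are listed in CUDA_VISIBLE_DEVICES.
    listed = [tok.strip() for tok in visible_devices.split(",") if tok.strip()]
    if not 0 <= cuda_device_id < len(listed):
        raise IndexError("device ID is not among the visible GPUs")
    return int(listed[cuda_device_id])
```

With the settings from the output above, physical_gpu_index(0, "3") gives 3,
which is the GPU that nvidia-smi showed as active. The live value can be
passed in with os.environ.get("CUDA_VISIBLE_DEVICES").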

....dac

Received on Tue Jan 11 2022 - 20:00:02 PST