Re: [AMBER] pmemd.cuda error: invalid argument launching kernel kgBuildSpecial2RestNBPreList

From: Patricio Barletta via AMBER <amber.ambermd.org>
Date: Fri, 21 Mar 2025 22:01:43 +0000

I was able to run your example on a GH200. Attaching the output:

https://www.dropbox.com/scl/fi/7ztge4orkmsnc992gyjhf/output.out?rlkey=0fn4gfbkw8i9alxsy35cnj1u4&st=ooqgezvf&dl=0


This is the exact line I used to compile amber on that platform:
```
cmake ../amber -DCMAKE_INSTALL_PREFIX=../install_lbsr_dev -DCOMPILER=GNU -DMPI=TRUE -DCUDA=TRUE -DINSTALL_TESTS=TRUE -DDOWNLOAD_MINICONDA=FALSE -DBUILD_PYTHON=ON -DNVIDIA_MATH_LIBS=${TACC_NVIDIA_MATH_LIB} -DBUILD_QUICK=OFF
```

Some additional ideas:

Next to the `pmemd.cuda` link, you'll find a `pmemd.cuda_DPFP` binary. Try running your example with that binary.

Another thing you could try, if you're willing to, is to edit the file `amber/src/pmemd/src/cuda/gti_cuda.cu`.

At line 1309 you'll find:

```
  unsigned threadsPerBlock = 768;
  unsigned factor = (PASCAL) ? 1 : 1;
  unsigned blocksToUse = (isDPFP) ? gpu->blocks : min((nterms / threadsPerBlock) + 1,
                                                      gpu->blocks*factor);
```

Replace that with:

```
  unsigned threadsPerBlock = 128;
  unsigned blocksToUse = 16;
  printf("gpu->blocks: %d\n", gpu->blocks); // Just out of curiosity. This should be equal to the number of SMs of the GH200
```

And see if that fixes it.
If any of these changes fixes it, it may mean that the CUDA runtime is launching additional threads along with each kernel, or occupying extra (shared?) memory. Make sure Amber wasn't compiled with debug symbols, and that no other process is running on the same GPU while you're testing.
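
To get a feel for what launch sizes the original expression at line 1309 asks for, here is a small host-side sketch of the same arithmetic. It is only an illustration, not Amber code: the function names are made up, and `nterms` and `gpuBlocks` are plain numbers you pass in yourself (in Amber they come from the simulation and from `gpu->blocks`).

```cpp
#include <algorithm>

// Hypothetical helper mirroring the quoted expression from gti_cuda.cu.
// nterms and gpuBlocks are stand-ins for the values Amber computes at run time.
unsigned blocksToUseOriginal(unsigned nterms, unsigned gpuBlocks, bool isDPFP) {
  const unsigned threadsPerBlock = 768;
  const unsigned factor = 1;  // (PASCAL) ? 1 : 1 evaluates to 1 in both branches
  return isDPFP ? gpuBlocks
                : std::min(nterms / threadsPerBlock + 1, gpuBlocks * factor);
}

// Total number of threads requested by a launch of `blocks` x `threadsPerBlock`.
unsigned totalThreads(unsigned blocks, unsigned threadsPerBlock) {
  return blocks * threadsPerBlock;
}
```

For example, with a made-up `gpuBlocks` of 132, `blocksToUseOriginal(1000, 132, false)` gives 2 blocks, while a large `nterms` such as 1000000 is capped at 132 blocks of 768 threads. The replacement above always launches 16 blocks of 128 threads, i.e. `totalThreads(16, 128)` = 2048 threads, which is why it makes a useful comparison point.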

That's all I can think of.
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Fri Mar 21 2025 - 15:30:02 PDT