Hi Zhen,
I tried the first suggestion and unfortunately still have the same error. I
increased the --mem value even higher than 32G, but still no luck.
So I just sent an email to the HPC support about the gti_lj1264_nb_setup()
function. I hope this will be the fix I need :)
Best regards,
Karolina
pt., 8 sie 2025 o 17:17 Li, Zhen <lizhen6.chemistry.msu.edu> napisał(a):
> Hi Karolina,
>
> Thank you for providing the detailed error report. I see the error comes
> from src/pmemd/src/cuda/gti_f95.cpp, where one of the atom lists is
> accessing unallocated memory.
>
> Here are two more suggestions to try.
>
> (1) A simple attempt is to request more memory for your slurm job. For
> example, if you were requesting #SBATCH --mem=16G, try #SBATCH --mem=32G
> instead.
>
> (2) If (1) fails, contact your HPC's IT support and ask whether they can
> check the source code of src/pmemd/src/cuda/gti_f95.cpp at about line 560,
> namely the whole gti_lj1264_nb_setup() function (please see the correct
> function below as a reference). I have seen cases where a school's HPC
> installed AMBER22 but the source code was somehow still from AMBER20.
> Most importantly, this line should read
>
>   int index = ico[(*ntypes) * (atm_iac[i] - 1) + atm_iac[i] - 1] - 1;
>
> and not
>
>   int index = ico[(*ntypes) * (atm_iac[i] - 1) + atm_iac[i]] - 1;
>
> Hope it makes sense to the HPC staff.
>
> Best regards,
> Zhen.
>
>
>
> //---------------------------------------------------------------------------------------------
> // gti_lj1264_nb_setup_:
> //
> // Arguments:
> //   atm_isymbl:
> //   ntypes:
> //   atm_iac:
> //   ico:
> //   gbl_cn6:
> //---------------------------------------------------------------------------------------------
> extern "C" void gti_lj1264_nb_setup_(char atm_isymbl[][4], int* ntypes, int atm_iac[],
>                                      int ico[], double gbl_cn6[])
> {
>   PRINTMETHOD(__func__);
>
>   gpuContext gpu = theGPUContext::GetPointer();
>   unsigned natoms = gpu->sim.atoms;
>   unsigned nlistatoms = 0;
>   unsigned* list = new unsigned[natoms];
>
>   for (unsigned i = 0; i < natoms; i++) {
>     if (atm_iac[i] > 0) {
>       for (unsigned j = 1; j < 4; j++) {
>         char tt = atm_isymbl[i][j];
>         if (tt == '+' || tt == '-') {
>           int index = ico[(*ntypes) * (atm_iac[i] - 1) + atm_iac[i] - 1] - 1;
>           if (index >= 0) {
>             if (abs(gbl_cn6[index]) > 1e-10) {
>               list[nlistatoms] = i;
>               nlistatoms++;
>               break;
>             }
>           }
>         }
>       }
>     }
>   }
>   gpu->sim.numberLJ1264Atoms = nlistatoms;
>   if (nlistatoms > 0) {
>     gpu->pbLJ1264AtomList =
>       std::unique_ptr< GpuBuffer<unsigned> >(new GpuBuffer<unsigned>(nlistatoms));
>     for (unsigned i = 0; i < nlistatoms; i++) {
>       gpu->pbLJ1264AtomList->_pSysData[i] = list[i];
>     }
>     gpu->pbLJ1264AtomList->Upload();
>     gpu->sim.pLJ1264AtomList = gpu->pbLJ1264AtomList->_pDevData;
>     unsigned maxNumberLJ1264NBEntries =
>       gpu->sim.numberLJ1264Atoms * gti_simulationConst::MaxNumberNBPerAtom;
>     gpu->pbLJ1264NBList =
>       std::unique_ptr< GpuBuffer<int4> >(new GpuBuffer<int4>(maxNumberLJ1264NBEntries));
>     for (unsigned i = 0; i < maxNumberLJ1264NBEntries; i++) {
>       gpu->pbLJ1264NBList->_pSysData[i] = { -1, -1, -1, -1 };
>     }
>     gpu->pbLJ1264NBList->Upload();
>     gpu->sim.pLJ1264NBList = gpu->pbLJ1264NBList->_pDevData;
>     gpu->pbNumberLJ1264NBEntries =
>       std::unique_ptr< GpuBuffer<unsigned long long int> >(new GpuBuffer<unsigned long long int>(1));
>     gpu->pbNumberLJ1264NBEntries->_pSysData[0] = maxNumberLJ1264NBEntries;
>     gpu->pbNumberLJ1264NBEntries->Upload();
>     gpu->sim.pNumberLJ1264NBEntries = gpu->pbNumberLJ1264NBEntries->_pDevData;
>     //printf("end of gti_lj1264_nb_setup_%d\n", nlistatoms); // debug2026
>   }
>
>   delete[] list;
>   gpuCopyConstants();
> }
>
>
>
> _____________________
>
> Zhen Li <http://lizhen62017.wixsite.com/home>, Ph.D.,
>
> The Merz Research Group <http://merzgroup.org>,
>
> Michigan State University,
>
> Cleveland Clinic.
> ------------------------------
> *From:* Karolina Mitusińska (Markowska) <markowska.kar.gmail.com>
> *Sent:* Friday, August 8, 2025 5:51 AM
> *To:* Li, Zhen <lizhen6.chemistry.msu.edu>
> *Cc:* AMBER Mailing List <amber.ambermd.org>
> *Subject:* Re: [AMBER] Segmentation fault when running on GPU, working
> fine on CPU
>
> Hi Zhen,
>
> Thank you so much for the investigation.
> As for your first option - I'm unable to upgrade to Amber24 myself. I can
> suggest this to the IT support of the HPC that I'm using for the
> simulations, but I'm not sure whether it's possible.
> For the second option - I checked the output file and also the cluster
> itself. I'm using Amber22 on an A100 GPU (NVIDIA A100-SXM-64GB), but with a
> slightly older CUDA version, 12.2. And it fails again.
>
> Also, the error message I'm getting is different from yours; it's the
> following:
> Backtrace for this error:
> #0  0x1517ea44ab4f in ???
> #1  0x60e86d in gti_lj1264_nb_setup_
>       at /dev/shm/propro01/spack-stage-amber-22-cz7v3y4nrcoxnjsgdwukvsexakhx2k5k/spack-src/src/pmemd/src/cuda/gti_f95.cpp:555
> #2  0x49b92b in __extra_pnts_nb14_mod_MOD_nb14_setup
>       at /dev/shm/propro01/spack-stage-amber-22-cz7v3y4nrcoxnjsgdwukvsexakhx2k5k/spack-src/src/pmemd/src/extra_pnts_nb14.F90:540
> #3  0x505e30 in __pme_alltasks_setup_mod_MOD_pme_alltasks_setup
>       at /dev/shm/propro01/spack-stage-amber-22-cz7v3y4nrcoxnjsgdwukvsexakhx2k5k/spack-src/src/pmemd/src/pme_alltasks_setup.F90:251
> #4  0x4e23f9 in pmemd
>       at /dev/shm/propro01/spack-stage-amber-22-cz7v3y4nrcoxnjsgdwukvsexakhx2k5k/spack-src/src/pmemd/src/pmemd.F90:518
> #5  0x411fbc in main
>       at /dev/shm/propro01/spack-stage-amber-22-cz7v3y4nrcoxnjsgdwukvsexakhx2k5k/spack-src/src/pmemd/src/pmemd.F90:77
> /var/spool/slurmd/job18638520/slurm_script: line 30: 2685272 Segmentation fault
> $AMBERHOME/bin/pmemd.cuda -O -i relax03.in -p $TOP -c $CRD -ref $CRD
> -o ${NAME}_relax03.out -r ${NAME}_relax03.rst7 -x ${NAME}_relax03.nc
>
> Best regards,
> Karolina
>
> czw., 7 sie 2025 o 18:43 Li, Zhen <lizhen6.chemistry.msu.edu> napisał(a):
>
> Hi Karolina,
>
> Thank you for waiting. After thoroughly reviewing the code, I have
> identified several ways to avoid the segfault from my end. Hopefully, it
> will work for you.
>
>
> 1. Upgrade to AMBER24 if possible. There was an update to optimize the
> GPU memory allocation for the 1264 code. The old code defines the
> allocation factor by only checking whether the architecture is Pascal,
> Volta or Ampere, but now there are new GPU architectures like Hopper, etc.
> (Please see the source code below)
> 2. Try to stick to K80, V100, and A100 GPUs. I have tested the 1264
> code on those devices with CUDA 12.3, and it is working fine on my end.
>
>
> If it is still not working, could you please pass the error message along?
> Does it look like "of length = 42Failed an illegal memory access was
> encountered", or like a segfault with several rows of addresses printed?
>
> Thank you again.
> Zhen
>
>
> //---------------------------------------------------------------------------------------------
> // ik_Build1264NBList:
> //
> // Arguments:
> //   gpu: overarching data structure containing simulation information, here used
> //        for stream directions and kernel launch parameters
> //---------------------------------------------------------------------------------------------
> void ik_Build1264NBList(gpuContext gpu)
> {
>   if (gpu->sim.numberLJ1264Atoms == 0) {
>     return;
>   }
>   int nterms = gpu->sim.atoms / GRID;
>
>   unsigned threadsPerBlock = (isDPFP) ? 128 : ((gpu->major < 6) ? 768 : 256);
>   unsigned factor = 1;
>   unsigned blocksToUse = (isDPFP) ? gpu->blocks : min((nterms / threadsPerBlock) + 1,
>                                                       gpu->blocks * factor);
>
>   kgBuildSpecial2RestNBPreList_kernel<<<blocksToUse, threadsPerBlock, 0,
>                                         gpu->mainStream>>>(GTI_NB::LJ1264);
>   LAUNCHERROR("kgBuildSpecial2RestNBPreList");
>
>   nterms = gpu->sim.atoms;
>
>   threadsPerBlock = (isDPFP) ? 128 : 512; // Tuned w/ M2000M
>   threadsPerBlock = min(threadsPerBlock, MAX_THREADS_PER_BLOCK);
>   factor = (PASCAL || VOLTA || AMPERE) ? 1 : 2;
>   blocksToUse = (isDPFP) ? gpu->blocks : min((nterms / threadsPerBlock) + 1,
>                                              gpu->blocks * factor);
>
>   kgBuildSpecial2RestNBList_kernel<<<blocksToUse, threadsPerBlock, 0,
>                                      gpu->mainStream>>>(GTI_NB::LJ1264);
>   LAUNCHERROR("kgBuildSpecial2RestNBList");
>
>   nterms = gpu->sim.numberLJ1264Atoms * 400;
>   threadsPerBlock = (isDPFP) ? 128 : ((PASCAL || VOLTA || AMPERE) ? 64 : 1024);
>   factor = (PASCAL || VOLTA || AMPERE) ? 4 : 1;
>   blocksToUse = (isDPFP) ? gpu->blocks : min((nterms / threadsPerBlock) + 1,
>                                              gpu->blocks * factor);
>
>   kg1264NBListFillAttribute_kernel<<<blocksToUse, threadsPerBlock, 0,
>                                      gpu->mainStream>>>();
>   LAUNCHERROR("kgBuild1264NBListFillAttribute");
> }
>
>
>
> _____________________
>
> Zhen Li <http://lizhen62017.wixsite.com/home>, Ph.D.,
>
> The Merz Research Group <http://merzgroup.org>,
>
> Michigan State University,
>
> Cleveland Clinic.
> ------------------------------
> *From:* Karolina Mitusińska (Markowska) <markowska.kar.gmail.com>
> *Sent:* Sunday, August 3, 2025 5:58 AM
> *To:* Li, Zhen <lizhen6.chemistry.msu.edu>; David A Case <
> dacase1.gmail.com>
> *Cc:* AMBER Mailing List <amber.ambermd.org>
> *Subject:* Re: [AMBER] Segmentation fault when running on GPU, working
> fine on CPU
>
> Dear Prof. Case and Zhen,
>
> Thank you for your hints.
> I tried running minimization with sander.MPI, but with the same result
> - it goes well at first (I ran 2500 steps of sander.MPI minimization), but
> when I switch to pmemd.cuda, it crashes again with a segmentation fault.
> And of course everything goes fine with the non-1264 prmtop.
> So now I believe it must be related to the 12-6-4 params. But why? I'm
> definitely using Amber22 - I checked that again.
>
> I'm using a modified set of LJ parameters that I got for tests, and
> therefore I don't want to paste them here on the list, but will share them
> with the developers if needed. I managed to run them on CPU starting from
> minimization up to heating to 300 K, but now I would like to switch to GPU
> and it's impossible because of the seg fault. Are there any more general
> hints that I could use to try and run the simulations on GPU?
>
> Best,
> Karolina
>
> niedz., 3 sie 2025 o 00:51 Li, Zhen <lizhen6.chemistry.msu.edu>
> napisał(a):
>
> Hi Karolina,
>
> Dr. Case pointed out a very helpful way of debugging it. Could you
> double-check whether your AMBER version is 22 or 20? There is a known bug
> in the AMBER20 GPU 1264 code (see the red paragraph at
> https://ambermd.org/tutorials/advanced/tutorial20/m1264.php),
> where applying C4 to the last atom type results in a segfault because the
> code fails to update the atom type indexing from [1, 2, 3, ...] to [0, 1,
> 2, ...]. It was later patched in AMBER22.
>
> Hope the debugging goes well. Another helpful thing for us developers would
> be the printljmatrix output, as both the 1264 and m1264 tutorials show.
> Thank you very much!
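
For reference, the printljmatrix output mentioned above can be produced with ParmEd roughly like this (a sketch only: the topology file name and the ion mask are placeholders, and the exact mask depends on which 12-6-4 ion your system uses):

```
# Hypothetical ParmEd session; substitute your own prmtop and ion mask.
parmed system.prmtop
printLJMatrix :ZN
quit
```

The 1264 and m1264 tutorials referenced in the thread show the expected form of this output for their example systems.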
>
> Best regards,
> Zhen.
>
> _____________________
>
> Zhen Li <http://lizhen62017.wixsite.com/home>, Ph.D.,
>
> The Merz Research Group <http://merzgroup.org>,
>
> Michigan State University,
>
> Cleveland Clinic.
> ------------------------------
> *From:* David A Case via AMBER <amber.ambermd.org>
> *Sent:* Saturday, August 2, 2025 5:26 PM
> *To:* Karolina Mitusińska (Markowska) <markowska.kar.gmail.com>; AMBER
> Mailing List <amber.ambermd.org>
> *Subject:* Re: [AMBER] Segmentation fault when running on GPU, working
> fine on CPU
>
> On Sat, Aug 02, 2025, Karolina Mitusińska (Markowska) via AMBER wrote:
> >
> >I'm facing an interesting issue with Amber22.
> >I want to use the 12-6-4 LJ parameters for my system, using the following
> >tutorial:
> https://ambermd.org/tutorials/advanced/tutorial20/12_6_4.php
> >I prepared the system using the frcmod.ions234lm_1264_tip3p for my system
> >solvated in TIP3P water model. I generated the .inpcrd and .prmtop files
> >without any errors using tLeaP.
> >Then I used parmed to add the C coefficient parameters to the system.
> >Parmed did not report any issues with the files.
> >
> >But when I tried to run minimization on the system, I'm seeing a
> >segmentation fault error and the output of the minimization ends at the
> >following line:
>
> We need to first figure out if the seg fault has anything to do with
> 12-6-4.
> Run a few steps of minimization with sander.MPI, say 25 steps with ntpr=1
> and ntmin=3.
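
A debugging input along those lines might look like the following (a hedged sketch: imin, maxcyc, ntmin, and ntpr are standard sander &cntrl flags, but any cutoff or restraint settings from the original relax03.in would still need to be carried over):

```
Short sander.MPI minimization for debugging (sketch, adapt to your system)
 &cntrl
  imin = 1,     ! run minimization
  maxcyc = 25,  ! a few steps are enough to see whether it crashes
  ntmin = 3,    ! minimization method suggested above
  ntpr = 1,     ! print energies every step
 /
```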
>
> Don't worry that it is slow: you generally only need to do a few hundred
> steps of minimization with sander. If you are lucky, you can then go back
> to pmemd.cuda and continue with more minimization or with MD.
>
> There are many strange failures that can happen with minimization with
> pmemd.cuda, which is why I am suggesting this. Of course, if the sander
> run also fails, there may be a 12-6-4-specific problem. But it might give
> you better error messages.
>
> ...good luck...dac
>
>
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Mon Aug 11 2025 - 03:00:03 PDT