Dear AMBER users,
I am seeking advice on troubleshooting a recurring GPU crash during the equilibration of a peptide-MHC/β2M complex (~80,018 atoms) using AMBER22 (pmemd.cuda) on NVIDIA A100 nodes (see 04_equil_k5.out).
The Problem: My simulation crashes immediately at the start of the first NPT equilibration stage (equil_npt_k5_300ps.in) after successfully completing a 300 ps NVT heating ramp (heat_nvt_300ps.in). The error reported in the SLURM output is:
Error: an illegal memory access was encountered launching kernel kNLSkinTest
System Details:
* Protein Complex: peptide:MHC:β2M
* Force Field: ff19SB.
* Water Model: OPC (4-point).
* Target Sampling: 3 x 1000 ns (3 µs total).
Context & Comparison: Interestingly, a previous, shorter protocol (100 ps NVT heating (heat.in) followed by 100 ps NPT equilibration (equil.in)) worked perfectly for a 10 ns pilot production run. However, when I extended the stages to 300 ps each for better thermalization in preparation for my study, this "illegal memory access" error began occurring.
Observations:
1. Heating Stage: Completed successfully. Final temperature was stable at ~300 K (03_heat.out).
2. Equilibration Stage: I implemented a "Safe Start" with dt=0.001 (1.0 fs) and skinnb=5.0 to help the Monte Carlo barostat (barostat=2) handle the initial pressure shock, but it still crashes before the first 1,000 steps.
3. The Error: kNLSkinTest suggests an issue with the neighbor list. I have tried increasing the skinnb to 5.0, but the crash persists.
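
For readers without the attachments, the "Safe Start" settings described above can be sketched as a small Python helper that renders a minimal mdin fragment. This is only an illustration, not the attached input file: the helper name render_npt_mdin is hypothetical, ntb=2 and ntp=1 are assumed here as the usual flags for isotropic constant-pressure runs, and nstlim is derived on the assumption that the whole 300 ps stage runs at dt=0.001. Thermostat, restraint, restart, cutoff, and output settings are deliberately left out, so the fragment is not a complete input.

```python
def render_npt_mdin(stage_ps=300.0, dt_ps=0.001, skinnb=5.0, barostat=2):
    """Return a minimal NPT mdin sketch as text (hypothetical helper).

    Only dt, skinnb and barostat come from the post; ntb/ntp and the
    derived nstlim are assumptions made for illustration.
    """
    # Number of MD steps needed to cover the stage at the given time step.
    nstlim = round(stage_ps / dt_ps)
    lines = [
        "NPT equilibration sketch (not the attached file)",
        " &cntrl",
        "  imin = 0,",
        f"  nstlim = {nstlim}, dt = {dt_ps},",
        # ntb = 2 / ntp = 1: assumed constant-pressure, isotropic scaling.
        f"  ntb = 2, ntp = 1, barostat = {barostat},",
        " /",
        " &ewald",
        # skinnb is read from the &ewald namelist, not &cntrl.
        f"  skinnb = {skinnb},",
        " /",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_npt_mdin())
```

With the defaults, the stage corresponds to 300,000 steps, so a crash within the first 1,000 steps falls inside the first 1 ps of NPT.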
I have attached the relevant .in and .out files for comparison. Any insights on how to stabilize this ~80k-atom system for microsecond-scale production would be greatly appreciated.
Best regards,
Arun
Arun Gupta PhD, MRSC
Sr. Post Doctoral Research Assistant
Gillespie/McMichael Group
Nuffield Dept. of Clinical Medicine,
University of Oxford,
Centre for Immuno-Oncology
Old Road Campus Research Building,
Roosevelt Drive, Oxford
OX3 7DQ
Tel: + 44 (0)1865 2612913
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
- application/octet-stream attachment: equil.in
- application/octet-stream attachment: heat.in
Received on Thu Feb 26 2026 - 05:00:03 PST