[AMBER] crash of pmemd.cuda

From: Enrico Martinez via AMBER <amber.ambermd.org>
Date: Fri, 2 Sep 2022 10:02:10 +0200

Dear Amber users!
For the second time, pmemd.cuda has crashed after about 370 ns of a
production run, with no apparent reason. I checked the system and
did not observe any instabilities (such as jumps in RMSD or other
parameters).

In the nohup log I could find only this:

/home/enrico/Desktop/test_md/NMR/output_testDIM77/MDtest/dim_310K/runjob_gpu.sh:
line 24: 1625945 Killed pmemd.cuda -O -i
./in/md_production_310K.in -o prod_.out -p protein.prmtop -c
equil1p.rst -ref equil1p.rst -r prod_310K.rst -x prod_310K.netcdf

Note that two different simulations were running at that moment on
two separate GPUs, and both of them were killed.
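
A "Killed" message from the shell generally means the process received
SIGKILL, and one common source of SIGKILL is the Linux kernel's
out-of-memory (OOM) killer. Since both jobs were killed together, it may be
worth scanning the kernel log for such events. The following is only a
minimal sketch: the helper name find_oom_lines and the loose pattern are
assumptions made for illustration, and the exact kernel wording varies
between kernel versions.

```python
import re

# Loose pattern for messages the Linux OOM killer commonly prints.
# The wording differs between kernel versions, so this is an assumption,
# not an exhaustive match.
OOM_PATTERN = re.compile(r"out of memory|oom-kill|killed process",
                         re.IGNORECASE)


def find_oom_lines(kernel_log_text):
    """Return the kernel-log lines that look like OOM-killer activity."""
    return [line for line in kernel_log_text.splitlines()
            if OOM_PATTERN.search(line)]
```

The input text could come from `dmesg -T` or `journalctl -k` (either may
require elevated privileges). An empty result does not rule out other
sources of SIGKILL, such as a scheduler or another user.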

... and in md.log the last entry was:


 NSTEP =185695000 TIME(PS) = 372439.999 TEMP(K) = 309.94 PRESS = -47.5
 Etot = -146618.0456 EKtot = 66279.2422 EPtot = -212897.2878
 BOND = 1847.8633 ANGLE = 4826.5693 DIHED = 3445.0949
 UB = 0.0000 IMP = 0.0000 CMAP = 417.9705
 1-4 NB = 2148.6538 1-4 EEL = 23809.7680 VDWAALS = -12041.6860
 EELEC = -237351.5217 EHBOND = 0.0000 RESTRAINT = 0.0000
 EKCMT = 29711.4731 VIRIAL = 30994.1671 VOLUME = 1250021.3388
                                                    Density = 0.8550
 ------------------------------------------------------------------------------

wrapping first mol.: 78.33046 -55.38799 95.93482
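
To confirm how far a run got before it was killed, the last energy record
can be pulled out of the log automatically. This is a hedged sketch that
assumes the record header looks like the one above (NSTEP followed by
TIME(PS)); the function name last_step is hypothetical.

```python
import re

# Matches the start of an energy record such as:
#   NSTEP =185695000 TIME(PS) = 372439.999 ...
STEP_RE = re.compile(
    r"NSTEP\s*=\s*(\d+)\s+TIME\(PS\)\s*=\s*(\d+(?:\.\d+)?)")


def last_step(log_text):
    """Return (nstep, time_ps) of the last energy record, or None."""
    matches = STEP_RE.findall(log_text)
    if not matches:
        return None
    nstep, time_ps = matches[-1]
    return int(nstep), float(time_ps)
```

For the record shown above this gives step 185695000 at 372439.999 ps,
i.e. roughly 372.4 ns.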

How can I determine the reason for the crash? Was it due to
overheating of the GPU or a problem with my simulation?
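
One way to test the overheating hypothesis would be to record GPU
temperatures while the jobs run. The sketch below assumes the text comes
from `nvidia-smi --query-gpu=index,temperature.gpu --format=csv,noheader`
and that every reading is numeric; the helper name parse_gpu_temps is
hypothetical.

```python
def parse_gpu_temps(csv_text):
    """Map GPU index -> temperature (degrees C) from nvidia-smi CSV text.

    Assumes one "index, temperature" pair per line with numeric values.
    """
    temps = {}
    for line in csv_text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        index, temp = (field.strip() for field in line.split(","))
        temps[int(index)] = int(temp)
    return temps
```

A single query only shows the current reading, so the values would need to
be logged periodically during the run to be useful after a crash.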
Many thanks in advance!
Enrico

_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Fri Sep 02 2022 - 01:30:02 PDT