Dear AMBER community,
I ran two identical calculations on (presumably different) GPUs and got completely different results. In the first run, the system blew up; in the second, everything looked fine. I did not change the input at all. How could this happen and are any of my results trustworthy? The output files are attached.
Background: I’m running TI simulations at different lambda values in a sequential fashion (e.g., 0.00, then 0.01, etc.). For each window, I’m running a minimization and an equilibration. Because I’m using the smooth softcore potential, which is only implemented in pmemd.cuda, I’m running both the minimization and the equilibration on GPUs. I know there are numerical issues with running minimizations on GPUs. Should I skip the minimizations, or run them on CPUs without the smooth softcore potential?
Sometimes the minimization on GPUs goes entirely awry:
NSTEP ENERGY RMS GMAX NAME NUMBER
1 6.1827E+08 4.4600E+03 4.3883E+05 H1 70188
BOND = ************* ANGLE = 4854.3827 DIHED = 7029.8038
VDWAALS = 93066.3839 EEL = -559385.5182 HBOND = 0.0000
1-4 VDW = 2030.7940 1-4 EEL = 30573.6921 RESTRAINT = 0.0700
EAMBER = *************
DV/DL = 81.9596
NMR restraints: Bond = 0.000 Angle = 0.000 Torsion = 0.070
===============================================================================
Softcore part of the system: 25 atoms, TEMP(K) = 0.00
SC_Etot= 0.0000 SC_EKtot= 0.0000 SC_EPtot = 79.4772
SC_BOND= 1.7211 SC_ANGLE= 13.7642 SC_DIHED = 18.4959
SC_14NB= 2.4757 SC_14EEL= 20.1833 SC_VDW = -0.0318
SC_EEL = 22.8688
SC_RES_DIST= 0.0000 SC_RES_ANG= 0.0000 SC_RES_TORS= 0.0000
SC_EEL_DER= 56.6513 SC_VDW_DER= -1.0518 SC_DERIV = 55.5995
------------------------------------------------------------------------------
But other times, using exactly the same input, it works just fine:
NSTEP ENERGY RMS GMAX NAME NUMBER
1 -4.4889E+05 1.7372E+01 1.1381E+02 C 218
BOND = 1666.4578 ANGLE = 4854.3626 DIHED = 7029.8072
VDWAALS = 59875.1546 EEL = -554800.0321 HBOND = 0.0000
1-4 VDW = 1906.7559 1-4 EEL = 30573.8373 RESTRAINT = 0.0700
EAMBER = -448893.6567
DV/DL = 79.2240
NMR restraints: Bond = 0.000 Angle = 0.000 Torsion = 0.070
===============================================================================
Softcore part of the system: 25 atoms, TEMP(K) = 0.00
SC_Etot= 0.0000 SC_EKtot= 0.0000 SC_EPtot = 79.4772
SC_BOND= 1.7211 SC_ANGLE= 13.7642 SC_DIHED = 18.4959
SC_14NB= 2.4757 SC_14EEL= 20.1833 SC_VDW = -0.0318
SC_EEL = 22.8688
SC_RES_DIST= 0.0000 SC_RES_ANG= 0.0000 SC_RES_TORS= 0.0000
SC_EEL_DER= 56.6513 SC_VDW_DER= -1.0518 SC_DERIV = 55.5995
------------------------------------------------------------------------------
Note that in both cases the minimization seemingly finishes, printing the usual timing information at the end of the output file.
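Since a clean finish does not distinguish the two cases, one way to catch the bad runs automatically is to scan each output file for energy terms printed as asterisks, as BOND and EAMBER are in the first excerpt above. The following is only a minimal sketch: the function name overflowed_terms and the regular expression are my own illustrative choices, not part of AMBER, and it assumes the "LABEL = value" layout seen in the excerpts.

```python
import re

# Label/value pairs such as "BOND = 1666.4578" or "1-4 VDW = 1906.7559".
# Assumption: labels look like those in the excerpts above.
PAIR = re.compile(r"(1-4 [A-Z]+|[A-Za-z][\w/]*)\s*=\s*(\S+)")


def overflowed_terms(mdout_text):
    """Return labels of energy terms whose value was printed as asterisks."""
    bad = []
    for label, value in PAIR.findall(mdout_text):
        # A value made only of '*' means the number did not fit its field.
        if set(value) == {"*"} and label not in bad:
            bad.append(label)
    return bad


# Lines copied from the two excerpts in this message.
blown_up = """\
BOND = ************* ANGLE = 4854.3827 DIHED = 7029.8038
1-4 VDW = 2030.7940 1-4 EEL = 30573.6921 RESTRAINT = 0.0700
EAMBER = *************
"""
healthy = """\
BOND = 1666.4578 ANGLE = 4854.3626 DIHED = 7029.8072
EAMBER = -448893.6567
"""

print(overflowed_terms(blown_up))
print(overflowed_terms(healthy))
```

A non-empty list for a window would be a reason to discard and rerun it before starting the next lambda value. This only detects overflowed fields; it says nothing about whether a run with reasonable-looking energies is correct.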
How can two identical inputs give these very different outputs? Can I trust the results where the energies look reasonable?
Best,
Matthew
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Sun Nov 27 2022 - 07:00:02 PST