Dear users:
Data corruption seems final, so I am backing up to a previous simulation
segment, but I thought I'd report this in case it is useful to anybody.
I am running Amber 16 pmemd on 2 GPUs, using a charmm force field with a
topology built in gromacs and then ported to amber with parmed. I've run
hundreds of microseconds without seeing this type of issue, so my guess is
that it is not specific to the system or the force field. At the moment, I
suspect it is one of those rare memory errors that ECC would have caught
if I were using a GPU that supported it (which I am not).
During an attempt to restart my simulation, pmemd gives the error:
| ERROR: NaN(s) found in input coordinates.
This likely means that something went wrong in the previous
simulation.
And the command:
ambpdb -p this.prmtop -c v0.5_5.rst
produces output that contains this obvious problem:
ATOM 54207 HW1 SOL 3056 5.239 35.425 112.718 1.00 0.00 H
ATOM 54208 HW2 SOL 3056 5.923 36.752 112.464 1.00 0.00 H
ATOM 54209 OW SOL 3057 78.934 30.200 23.018 1.00 0.00 O
ATOM 54210 HW1 SOL 3057 78.261 30.829 23.279 1.00 0.00 H
ATOM 54211 HW2 SOL 3057 79.017 30.317 22.071 1.00 0.00 H
ATOM 54212 OW SOL 3058 -nan -nan -nan 1.00 0.00 O
ATOM 54213 HW1 SOL 3058 -nan -nan -nan 1.00 0.00 H
ATOM 54214 HW2 SOL 3058 -nan -nan -nan 1.00 0.00 H
ATOM 54215 OW SOL 3059 49.273 109.879 40.039 1.00 0.00 O
ATOM 54216 HW1 SOL 3059 48.566 109.877 39.394 1.00 0.00 H
ATOM 54217 HW2 SOL 3059 50.056 109.653 39.537 1.00 0.00 H
ATOM 54218 OW SOL 3060 52.061 48.796 41.712 1.00 0.00 O
ATOM 54219 HW1 SOL 3060 51.608 49.476 41.214 1.00 0.00 H
ATOM 54220 HW2 SOL 3060 52.978 49.072 41.715 1.00 0.00 H
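To list every affected atom rather than spotting them by eye, the ambpdb output can be scanned for non-finite coordinates. The following is a minimal Python sketch, not part of the original workflow. The function name find_nan_atoms is hypothetical, and it assumes the columns are whitespace-separated as in the excerpt above; a strict fixed-column PDB file, where fields can run together, would need column slicing instead.

```python
import math


def find_nan_atoms(pdb_text):
    """Return (serial, atom, residue, resnum) for ATOM/HETATM records
    whose x, y or z coordinate is NaN or infinite."""
    bad = []
    for line in pdb_text.splitlines():
        fields = line.split()
        if len(fields) < 8 or fields[0] not in ("ATOM", "HETATM"):
            continue
        # Assumed layout: record, serial, name, resname, resnum, x, y, z, ...
        try:
            xyz = [float(v) for v in fields[5:8]]  # float() parses "-nan" as NaN
        except ValueError:
            continue  # fields ran together; this simple parser skips the line
        if not all(math.isfinite(v) for v in xyz):
            bad.append((int(fields[1]), fields[2], fields[3], int(fields[4])))
    return bad


# A few records copied from the excerpt above.
sample = """\
ATOM 54211 HW2 SOL 3057 79.017 30.317 22.071 1.00 0.00 H
ATOM 54212 OW SOL 3058 -nan -nan -nan 1.00 0.00 O
ATOM 54213 HW1 SOL 3058 -nan -nan -nan 1.00 0.00 H
ATOM 54214 HW2 SOL 3058 -nan -nan -nan 1.00 0.00 H
ATOM 54215 OW SOL 3059 49.273 109.879 40.039 1.00 0.00 O
"""

for atom in find_nan_atoms(sample):
    print(atom)
```

On the sample this reports the three atoms of water 3058 and nothing else, which matches what is visible in the listing. For a full file, the text would come from the ambpdb output redirected to a file and read back in.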
Looking back at the previous segment of simulation, I can see where Etot
(along with TEMP and EKtot) went from a finite number to NaN, while EPtot
stayed finite:
NSTEP = 30750000 TIME(PS) = 2204999.989 TEMP(K) = 309.78 PRESS = 0.0
Etot = -239957.1300 EKtot = 104187.3359 EPtot = -344144.4659
BOND = 7635.7078 ANGLE = 26341.4448 DIHED = 23765.9225
UB = 10045.0177 IMP = 325.3101 CMAP = -176.9250
1-4 NB = 3598.1690 1-4 EEL = -35454.2960 VDWAALS = 13061.4299
EELEC = -393286.2467 EHBOND = 0.0000 RESTRAINT = 0.0000
EKCMT = 0.0000 VIRIAL = 0.0000 VOLUME = 1544909.6127
SURFTEN = 0.0000
Density = 1.0107
------------------------------------------------------------------------------
NSTEP = 31000000 TIME(PS) = 2205999.989 TEMP(K) = 311.05 PRESS = 0.0
Etot = -240282.7050 EKtot = 104614.0859 EPtot = -344896.7909
BOND = 7570.3922 ANGLE = 26289.0463 DIHED = 23693.8946
UB = 9872.5247 IMP = 331.2119 CMAP = -183.0812
1-4 NB = 3607.1803 1-4 EEL = -35542.7377 VDWAALS = 13268.2434
EELEC = -393803.4653 EHBOND = 0.0000 RESTRAINT = 0.0000
EKCMT = 0.0000 VIRIAL = 0.0000 VOLUME = 1546525.2111
SURFTEN = 0.0000
Density = 1.0097
------------------------------------------------------------------------------
NSTEP = 31250000 TIME(PS) = 2206999.989 TEMP(K) = NaN PRESS = 0.0
Etot = NaN EKtot = NaN EPtot = -344391.5063
BOND = 7571.4110 ANGLE = 26315.8830 DIHED = 23781.7767
UB = 9897.5359 IMP = 329.2852 CMAP = -170.7033
1-4 NB = 3577.8462 1-4 EEL = -35569.5495 VDWAALS = 12988.7025
EELEC = -393113.6939 EHBOND = 0.0000 RESTRAINT = 0.0000
EKCMT = 0.0000 VIRIAL = 0.0000 VOLUME = 1545119.9333
SURFTEN = 0.0000
Density = 1.0106
------------------------------------------------------------------------------
NSTEP = 31500000 TIME(PS) = 2207999.989 TEMP(K) = NaN PRESS = 0.0
Etot = NaN EKtot = NaN EPtot = -344914.4568
BOND = 7521.9425 ANGLE = 26355.6523 DIHED = 23813.2940
UB = 10004.2951 IMP = 324.9560 CMAP = -194.2010
1-4 NB = 3606.3528 1-4 EEL = -35201.7173 VDWAALS = 12831.8700
EELEC = -393976.9012 EHBOND = 0.0000 RESTRAINT = 0.0000
EKCMT = 0.0000 VIRIAL = 0.0000 VOLUME = 1541838.9378
SURFTEN = 0.0000
Density = 1.0127
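For a long mdout file, the first record containing a NaN can be located automatically instead of by scrolling. The following is a minimal Python sketch under stated assumptions: the helper names scan_mdout and first_nan_record are hypothetical, each energy record is assumed to start at an "NSTEP =" entry, and labels are taken as the last token before each "=", so "1-4 NB" is stored as "NB". A real mdout also has averages sections that begin with NSTEP, which this simple parser would treat as ordinary records.

```python
import re

# "label = value" pairs; \s* also spans a line break, so wrapped values
# such as "PRESS =" followed by "0.0" on the next line are still paired.
PAIR_RE = re.compile(r"([^\s=]+)\s*=\s*([^\s=]+)")


def scan_mdout(text):
    """Group label/value strings into one dict per NSTEP record."""
    records = []
    current = None
    for label, value in PAIR_RE.findall(text):
        if label == "NSTEP":
            current = {}
            records.append(current)
        if current is not None:  # ignore anything before the first record
            current[label] = value
    return records


def first_nan_record(records):
    """Return (last clean record, first record with NaN, its NaN labels)."""
    last_clean = None
    for rec in records:
        bad = [k for k, v in rec.items() if v.lower().lstrip("+-") == "nan"]
        if bad:
            return last_clean, rec, bad
        last_clean = rec
    return last_clean, None, []


# Two abbreviated records copied from the output above.
sample_mdout = """
NSTEP = 31000000 TIME(PS) = 2205999.989 TEMP(K) = 311.05 PRESS =
0.0
Etot = -240282.7050 EKtot = 104614.0859 EPtot =
-344896.7909
NSTEP = 31250000 TIME(PS) = 2206999.989 TEMP(K) = NaN PRESS =
0.0
Etot = NaN EKtot = NaN EPtot =
-344391.5063
"""

last_clean, first_bad, bad_labels = first_nan_record(scan_mdout(sample_mdout))
print("last clean step:", last_clean["NSTEP"])
print("first NaN step: ", first_bad["NSTEP"], bad_labels)
```

On these two records the sketch flags step 31250000, with TEMP(K), Etot and EKtot as the NaN fields and EPtot still finite, consistent with the listing above.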
#########################
My run parameters are:
An NPT simulation for common production-level simulations -- params
generally from Charmm-gui + some modifications by CN
&cntrl
imin=0, ! No minimization
irest=1, ! irest=1 for restart and irest=0 for new start
ntx=5, ! ntx=5 to use velocities from inpcrd and ntx=1 to not use them
ntb=2, ! constant pressure simulation
! Temperature control
ntt=3, ! Langevin dynamics
gamma_ln=1.0, ! Friction coefficient (ps^-1)
temp0=310.0, ! Target temperature
tempi=310.0, ! Initial temperature -- has no effect if ntx>3
! Potential energy control
cut=12.0, ! nonbonded cutoff, in Angstroms
fswitch=10.0, ! for CHARMM; note charmm-gui suggested cut=0.8 and no use of fswitch
! MD settings
nstlim=250000000, ! 0.25B steps, 1 us total
dt=0.004, ! time step (ps)
! SHAKE
ntc=2, ! Constrain bonds containing hydrogen
ntf=2, ! Do not calculate forces of bonds containing hydrogen
! Control how often information is printed
ntpr=250000, ! Print energy frequency
ntwx=250000, ! Print coordinate frequency
ntwr=500000, ! Print restart file frequency
! ntwv=-1, ! Uncomment to also print velocities to trajectory
! ntwf=-1, ! Uncomment to also print forces to trajectory
ntxo=2, ! Write restart file in NetCDF format
ioutfm=1, ! Write trajectory in NetCDF format (always do this!)
! Wrap coordinates when printing them to the same unit cell
iwrap=1,
! Constant pressure control. Note that ntp=3 requires barostat=1
barostat=2, ! Monte Carlo barostat (barostat=1 is Berendsen)
ntp=3, ! 1=isotropic, 2=anisotropic, 3=semi-isotropic w/ surften
pres0=1.01325, ! Target external pressure, in bar
taup=4, ! Berendsen coupling constant (ps)
comp=45, ! compressibility
! Constant surface tension (needed for semi-isotropic scaling). Uncomment
! for this feature. csurften must be nonzero if ntp=3 above
csurften=3, ! Interfaces in 1=yz plane, 2=xz plane, 3=xy plane
gamma_ten=0.0, ! Surface tension (dyne/cm). 0 gives pure semi-iso scaling
ninterface=2, ! Number of interfaces (2 for bilayer)
! Set water atom/residue names for SETTLE recognition
watnam='SOL', ! Water residues are named SOL
owtnm='OW', ! Water oxygens are named OW
hwtnm1='HW1',
hwtnm2='HW2',
&end
&ewald
vdwmeth = 0,
&end
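The output intervals implied by these settings determine how precisely the failure can be located. The following is a small Python sketch of that arithmetic, using the dt, nstlim, ntpr, ntwx and ntwr values from the input above and the two NSTEP values on either side of the first NaN in the energy output. It assumes that outputs are written at exact multiples of the step counts, counted from the start of the segment; the variable names are illustrative only.

```python
DT_FS = 4               # dt=0.004 ps, held in femtoseconds so the arithmetic is exact
NSTLIM = 250_000_000    # steps per segment
NTPR = 250_000          # energy print interval (steps)
NTWX = 250_000          # trajectory frame interval (steps)
NTWR = 500_000          # restart write interval (steps)

segment_us = NSTLIM * DT_FS / 1e9           # fs -> microseconds
energy_interval_ps = NTPR * DT_FS / 1000    # fs -> ps
frame_interval_ps = NTWX * DT_FS / 1000
restart_interval_ps = NTWR * DT_FS / 1000

# From the energy output: last finite record and first NaN record.
last_finite_step = 31_000_000
first_nan_step = 31_250_000
window_ps = (first_nan_step - last_finite_step) * DT_FS / 1000

# Most recent restart write and trajectory frame at or before the last finite record.
last_restart_step = (last_finite_step // NTWR) * NTWR
last_good_frame = last_finite_step // NTWX   # 1-based frame number in this segment

print(f"segment length:        {segment_us} us")
print(f"energy/frame interval: {energy_interval_ps} ps")
print(f"restart interval:      {restart_interval_ps} ps")
print(f"NaN appeared within a {window_ps} ps window")
print(f"last restart write at or before the last finite record: step {last_restart_step}")
print(f"last frame at or before the last finite record: frame {last_good_frame}")
```

With these settings the energy output only narrows the failure down to a 1000 ps window. Whether a restart written at step 31000000 is still available depends on how restart files are kept, since it may have been overwritten by later writes.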
##################
and I run like this:
export CUDA_VISIBLE_DEVICES=0,1
{
echo "rank 0=localhost slot=0:0"
echo "rank 1=localhost slot=0:1"
} > my.rankfile.A
mpirun --report-bindings --rankfile my.rankfile.A -np 2 \
  ${AMBERHOME}/bin/pmemd.cuda.MPI -i $amdp -o ${athis}.out -p this.prmtop \
  -c ${aprev}.rst -r ${athis}.rst -x ${athis}.mdcrd -inf ${athis}.info \
  -l ${athis}.log
##################
Thank you,
Chris.
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Tue Sep 12 2017 - 09:30:03 PDT