Re: [AMBER] Different results for different computers

From: Robert Duke <rduke.email.unc.edu>
Date: Sat, 31 Oct 2009 18:56:28 -0400

Identical runs on a uniprocessor should give identical results, provided you
are not using fftw (which has initialization code which may select different
run algorithms based on very slight differences in timing caused by operating
system indeterminacy). If you use different machines or different compilers,
results will be different for a host of reasons, though at least the first
100 steps should be pretty much the same.

If you use mpi (parallel runs) and the same h/w and s/w, then runs will
diverge (start showing differences in the last decimal place of mdout
printout, with the differences gradually increasing) due to "network
indeterminacy" caused by different completion order of things like
distributed force summations, which causes differences in rounding error
between different runs (there are also issues caused by loadbalancing that
introduce additional differences in rounding error).

I have discussed all this stuff exhaustively on the list, so looking at the
mail reflector will turn up longer discussions on this topic. The typical
explanatory "all is okay" phrase is to say that you are just exploring
different parts of phase space for the system, and one really should be
looking at large averages for results. And as Jason says, md trajectories
are basically chaotic, so slight differences in results rather quickly get
amplified.

A further reality is that while the precision of the computations done is
high, this is after all the result of a parameterized forcefield with some
rather high errors in the parameterization of bonds, angles, dihedrals, and
a couple of nonbonded interaction approximations (vdw, and point charge
electrostatics), so the calculations are not exactly accurate to half a
dozen decimal places...
Regards - Bob Duke
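
The summation-order effect described above can be sketched in a few lines of
Python. This is only an illustration of floating-point rounding, not anything
taken from the AMBER source: the function name sum_in_order and the three
"contributions" are made up, standing in for partial force sums that arrive
from different MPI ranks in a different order on each run.

```python
def sum_in_order(contributions, order):
    """Accumulate contributions left to right in the given arrival order."""
    total = 0.0
    for index in order:
        total += contributions[index]
    return total


# Hypothetical partial sums from three ranks. The values are chosen to make
# the rounding visible; they are not taken from any real run.
contributions = [1.0e16, 1.0, -1.0e16]

run_a = sum_in_order(contributions, [0, 1, 2])  # 1.0 is absorbed by 1.0e16
run_b = sum_in_order(contributions, [0, 2, 1])  # large terms cancel first

print(run_a, run_b)  # 0.0 1.0

# The same effect at an everyday scale: addition is not associative.
print((0.1 + 0.2) + 0.3 == 0.1 + (0.2 + 0.3))  # False
```

Both "runs" add exactly the same numbers; only the order differs. In a real
simulation the discrepancy starts in the last bits rather than the first
digit, which matches the "last decimal place of mdout" description above.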

----- Original Message -----
From: "Jason Swails" <jason.swails.gmail.com>
To: "AMBER Mailing List" <amber.ambermd.org>
Sent: Saturday, October 31, 2009 3:57 PM
Subject: Re: [AMBER] Different results for different computers


> Proteins are a chaotic system, so once a tiny difference exists, it will
> compound very very quickly. Therefore, once the simulations diverge
> slightly (at step 400) the simulations will become drastically different
> after a small amount of time. Both simulations are valid, you don't have
> to discard either. You can see similar differences on the same machine
> using different compilers as well (sometimes).
>
> In fact, if you start from exactly the same starting point with exactly
> the same inputs on the same machine with the same installation, and change
> the random seed slightly, the resulting simulations will diverge on the
> picosecond timescale. This behavior is to be expected.
>
> All the best,
> Jason
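
How quickly a tiny difference compounds in a chaotic system can be sketched
with a toy model. The Python below uses the logistic map x -> 4x(1-x), a
standard textbook chaotic system, as a stand-in; it is not a protein and not
an MD integrator, and the function names, starting value and perturbation are
all chosen here purely for illustration.

```python
def logistic_step(x):
    """One step of the logistic map x -> 4x(1-x), a chaotic toy system."""
    return 4.0 * x * (1.0 - x)


def first_divergence(x0, perturbation, threshold, max_steps):
    """Return the first step at which two nearby trajectories differ by more
    than threshold, or None if that does not happen within max_steps."""
    a, b = x0, x0 + perturbation
    for step in range(1, max_steps + 1):
        a, b = logistic_step(a), logistic_step(b)
        if abs(a - b) > threshold:
            return step
    return None


# Two starting points that agree to about 12 decimal places.
step = first_divergence(x0=0.3, perturbation=1.0e-12, threshold=0.1,
                        max_steps=200)
print("trajectories differ by more than 0.1 after step", step)
```

The difference can grow by at most a factor of 4 per step in this map, so the
two trajectories stay close for the first dozen or more steps and then become
unrelated well within the 200-step limit.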
>
> On Sat, Oct 31, 2009 at 3:47 PM, manoj singh <mks.amber.gmail.com> wrote:
>
>> The time-averaged properties (energies etc.) of the equilibrated (relaxed)
>> trajectories should be almost the same, not the energies of some
>> particular snapshot.
>>
>> On Sat, Oct 31, 2009 at 3:39 PM, cyk5056 <cyk5056.163.com> wrote:
>>
>> > Hi Amberusers,
>> >
>> > I am running MD with Amber using the following .in file:
>> >
>> > &cntrl
>> > imin=1, maxcyc=1000, ncyc = 500,
>> > cut=10.0,
>> > ntpr=100, ntx=1, ntb=1, ntr = 1,
>> > restraint_wt=5.0,
>> > restraintmask=':1-108'
>> > &end
>> > The output rst files from my own computer and from the university's
>> > computer cluster do not match. Both are running Amber 10 with exactly
>> > the same input files. I checked the out files; the differences start
>> > at NSTEP=400.
>> > Output file of my own computer:
>> > ...
>> >    NSTEP       ENERGY          RMS            GMAX         NAME    NUMBER
>> >      300      -3.9166E+04     1.4845E+00     1.6191E+02     CG        386
>> >  BOND    =     2445.9827  ANGLE   =      162.1399  DIHED      =      994.8353
>> >  VDWAALS =     4227.8494  EEL     =   -53419.7262  HBOND      =        0.0000
>> >  1-4 VDW =      364.8890  1-4 EEL =     5964.5484  RESTRAINT  =       93.4892
>> >  EAMBER  =   -39259.4815
>> >    NSTEP       ENERGY          RMS            GMAX         NAME    NUMBER
>> >      400      -3.9783E+04     1.2726E+00     1.2005E+02     CG        386
>> >  BOND    =     2546.5837  ANGLE   =      162.1376  DIHED      =      994.4480
>> >  VDWAALS =     4663.5487  EEL     =   -54574.8114  HBOND      =        0.0000
>> >  1-4 VDW =      364.2400  1-4 EEL =     5962.6647  RESTRAINT  =       98.2829
>> >  EAMBER  =   -39881.1885
>> > ...
>> > computer cluster:
>> > ...
>> >    NSTEP       ENERGY          RMS            GMAX         NAME    NUMBER
>> >      300      -3.9166E+04     1.4845E+00     1.6191E+02     CG        386
>> >  BOND    =     2445.9827  ANGLE   =      162.1399  DIHED      =      994.8353
>> >  VDWAALS =     4227.8494  EEL     =   -53419.7262  HBOND      =        0.0000
>> >  1-4 VDW =      364.8890  1-4 EEL =     5964.5484  RESTRAINT  =       93.4892
>> >  EAMBER  =   -39259.4815
>> >    NSTEP       ENERGY          RMS            GMAX         NAME    NUMBER
>> >      400      -3.9783E+04     1.2727E+00     1.2004E+02     CG        386
>> >  BOND    =     2546.5825  ANGLE   =      162.1358  DIHED      =      994.4481
>> >  VDWAALS =     4663.5314  EEL     =   -54574.7675  HBOND      =        0.0000
>> >  1-4 VDW =      364.2398  1-4 EEL =     5962.6647  RESTRAINT  =       98.2828
>> >  EAMBER  =   -39881.1652
>> > ...
>> >
>> > After NSTEP=400, everything is different between the two.
>> > I searched the Amber archive; someone says that 32-bit and 64-bit
>> > computers will differ. That is partially true for me, because the
>> > cluster is 64-bit and I am using 32-bit. I am trying to get the cluster
>> > to run in 32-bit mode to see if this is the reason. However, they said
>> > that it is a minor error. That is not true for me. This step is just a
>> > starting point. After several simulations based on the results here,
>> > the final results are quite different. Which one should I trust? And
>> > how can I eliminate this problem?
>> >
>> > Thank you very much!!
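
One way to put a number on the step-400 discrepancy is to compute the
relative difference of each energy term between the two outputs. The Python
below is a rough sketch: the values are copied from the NSTEP=400 records
quoted above, while the function name relative_differences and the choice of
|a - b| / max(|a|, |b|) as the measure are assumptions made here.

```python
# Energy terms at NSTEP=400, copied from the two outputs quoted above.
own_computer = {
    "BOND": 2546.5837, "ANGLE": 162.1376, "DIHED": 994.4480,
    "VDWAALS": 4663.5487, "EEL": -54574.8114, "HBOND": 0.0000,
    "1-4 VDW": 364.2400, "1-4 EEL": 5962.6647, "RESTRAINT": 98.2829,
    "EAMBER": -39881.1885,
}
cluster = {
    "BOND": 2546.5825, "ANGLE": 162.1358, "DIHED": 994.4481,
    "VDWAALS": 4663.5314, "EEL": -54574.7675, "HBOND": 0.0000,
    "1-4 VDW": 364.2398, "1-4 EEL": 5962.6647, "RESTRAINT": 98.2828,
    "EAMBER": -39881.1652,
}


def relative_differences(a, b):
    """Per-term |a - b| / max(|a|, |b|); 0.0 when both values are zero."""
    result = {}
    for name in a:
        scale = max(abs(a[name]), abs(b[name]))
        result[name] = abs(a[name] - b[name]) / scale if scale else 0.0
    return result


diffs = relative_differences(own_computer, cluster)
for name, value in sorted(diffs.items(), key=lambda item: item[1],
                          reverse=True):
    print(f"{name:10s} {value:.1e}")

worst = max(diffs, key=diffs.get)  # term with the largest relative difference
```

For these numbers the largest relative difference is in the ANGLE term, on
the order of 1e-5, so the two minimizations still agree to roughly five
significant figures at step 400.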
>> > _______________________________________________
>> > AMBER mailing list
>> > AMBER.ambermd.org
>> > http://lists.ambermd.org/mailman/listinfo/amber
>> >
>>
>
>
>
> --
> ---------------------------------------
> Jason M. Swails
> Quantum Theory Project,
> University of Florida
> Ph.D. Graduate Student
> 352-392-4032
>
>



Received on Sat Oct 31 2009 - 16:00:02 PDT