Re: [AMBER] Amber2020 and Amber2022 comparability issue: alchemical free energy calc

From: Michael T Kim via AMBER <>
Date: Thu, 23 Feb 2023 10:51:26 -0800

Hi amber community,

Re-opening this thread. We performed additional troubleshooting experiments
(as Prof. Case suggested, we performed short simulations using either
amber20/amber22, starting from a common restart and using the same
explicitly defined ig value on the same GPU hardware). The experimental
design was as follows:
- we tested a TI system with mixed potentials (alchemical transformation
with softcore, lambda = 0.5) and a non-TI system.
- we simulated 1000 steps of each system on either amber20 or amber22 on
the same node, same GPU configuration
- each simulation was performed in duplicate to confirm deterministic
trajectories (e.g. to make sure amber20 vs amber20 yields same results)

We compared the .out files using vimdiff.

As expected, duplicate simulations of amber20 vs amber20 yielded exactly the
same results. Duplicate simulations of amber22 vs amber22 also yielded
exactly the same results.
For the non-TI system, amber20 and amber22 yielded identical results.
For the TI system, however, amber20 and amber22 did not yield identical
results.
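
In case it is useful to others, the side-by-side comparison of the .out
files could also be scripted. The following is only a rough sketch, not
something taken from the Amber tools: it assumes each printed step starts
with a line beginning with "NSTEP", that energies appear as "NAME = value"
pairs, and that the trailing summary starts with "A V E R A G E S". The
function names are our own. Records are kept in order of appearance because
a TI run may print more than one energy block per step.

```python
import re

# Matches "NAME = number" pairs such as "NSTEP =   1" or "Etot = -12345.6789".
PAIR = re.compile(r"\s*([^=]+?)\s*=\s*(-?\d+(?:\.\d+)?)")


def parse_energy_records(text):
    """Return one dict per NSTEP block, in order of appearance."""
    records = []
    current = None
    for line in text.splitlines():
        if "A V E R A G E S" in line:
            break  # assumed start of the trailing averages summary
        if line.lstrip().startswith("NSTEP"):
            current = {}
            records.append(current)
        elif not line.strip() or "=" not in line:
            current = None  # separator or blank line ends the block
        if current is not None:
            for name, value in PAIR.findall(line):
                current[name] = float(value)
    return records


def first_divergence(records_a, records_b, tol=0.0):
    """Return (block index, name, value_a, value_b) of the first mismatch."""
    for index, (rec_a, rec_b) in enumerate(zip(records_a, records_b)):
        for name in rec_a:
            if name in rec_b and abs(rec_a[name] - rec_b[name]) > tol:
                return index, name, rec_a[name], rec_b[name]
    return None
```

Reading the two .out files into strings and passing the parsed records to
first_divergence() reports the first printed block and energy term where
the two runs disagree, or None if every printed value matches.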

Upon closer examination, the only parameter that looked off to us was
"gti_syn_mass" in amber22. According to the Amber22 manual, this should
default to gti_syn_mass = 0; however, our amber22 simulations ran with
gti_syn_mass = 1 by default. It would be great to hear from the community
which gti_syn_mass value is correct to replicate amber20 behavior in amber22.
Of course we will test this ourselves by explicitly setting gti_syn_mass =
0 and seeing whether we get identical trajectories between amber20 and
amber22 for TI systems. Thank you for your expert advice,

mike and skanda
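
p.s. To spot settings such as gti_syn_mass without reading the headers by
eye, something along these lines might help. It is a sketch under
assumptions: it treats everything before the first "NSTEP" line as the
header and assumes settings are printed as "name = value" pairs, which we
have not checked against every section of the mdout. The helper names are
hypothetical.

```python
import re

# Matches "name = value" pairs such as "nstlim  =  1000" or "ig = 12345".
SETTING = re.compile(r"([A-Za-z_]\w*)\s*=\s*([-+]?[\w.]+)")


def _normalize(value):
    """Compare numbers numerically so that 0.002 and 0.00200 are equal."""
    try:
        return float(value)
    except ValueError:
        return value


def read_settings(text):
    """Collect name/value pairs printed before the first NSTEP line."""
    settings = {}
    for line in text.splitlines():
        if line.lstrip().startswith("NSTEP"):
            break  # assumed end of the header section
        for name, value in SETTING.findall(line):
            # Later occurrences overwrite earlier ones, so the values the
            # program reports win over the echoed input file.
            settings[name] = _normalize(value)
    return settings


def diff_settings(settings_a, settings_b):
    """Return {name: (value_a, value_b)} for settings that differ."""
    names = sorted(set(settings_a) | set(settings_b))
    return {
        name: (settings_a.get(name), settings_b.get(name))
        for name in names
        if settings_a.get(name) != settings_b.get(name)
    }
```

A setting that one version prints and the other does not shows up with
None on the side where it is missing.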

On Mon, Feb 6, 2023 at 1:59 PM Michael T Kim <> wrote:

> Thank you prof. Case and prof. Cheatham,
> In the following, we've included the standard errors (SE, expressed in
> kcal/mol) for the bound and unbound dG simulations (derived from 5
> replicates each)
> system1: 0.71 (amber20, bound SE: 0.46, unbound SE: 0.28), -0.23 (amber22,
> bound SE: 0.33, unbound SE: 0.39), 1.15 (empirical)
> system2: 0.55 (amber20, bound SE: 0.18, unbound SE: 0.03), -0.17 (amber22,
> bound SE: 0.24, unbound SE: 0.18), 0.83 (empirical)
> system3: 2.35 (amber20, bound SE: 0.29, unbound SE: 0.68), -0.05 (amber22,
> bound SE: 0.90, unbound SE: 0.43), 3.15 (empirical)
> system4: 3.96 (amber20, bound SE: 0.93, unbound SE: 1.01), 1.29 (amber22,
> bound SE: 1.16, unbound SE: 0.50), 5.59 (empirical)
> system5: 2.58 (amber20, bound SE: 1.10, unbound SE: 0.67), 3.39 (amber22,
> bound SE: 0.79, unbound SE: 1.13), 5.82 (empirical)
> As prof. Case suggested, our primary concern is whether amber20 and
> amber22 are yielding different results during alchemical free energy
> calculations. The differences between amber predictions and empirical
> benchmarks are secondary for us and can be discussed later.
> We will run a short simulation using either amber20/amber22, starting from
> a common restart, using the same ig value and on the same GPU node (we're
> not sure if the same GPU node is necessary for deterministic
> trajectories... but just in case). After reviewing the mdout's of amber20
> vs amber22 carefully, we will report back.
> mike and skanda
> p.s. good catch on nstlim of 0.7 ns. We probably should've chosen a better
> example script input file. this particular run had been pre-empted on our
> cluster after 4.3 ns and we were running the remaining 0.7 ns. Each of our
> 12 production lambdas are 5 ns.
> On Sun, Feb 5, 2023 at 9:02 AM David A Case via AMBER <>
> wrote:
>> On Fri, Feb 03, 2023, Thomas Cheatham via AMBER wrote:
>> >
>> >> We are struggling to achieve comparable results when performing the
>> >> same alchemical free energy protocols using amber2020 versus amber2022
>> >> (our
>> >
>> >...
>> >>
>> >> below results are ddG, expressed in kcal/mol
>> >> system1: 0.71 (amber20), -0.23 (amber22), 1.15 (empirical)
>> >> system2: 0.55 (amber20), -0.17 (amber22), 0.83 (empirical)
>> >> system3: 2.35 (amber20), -0.05 (amber22), 3.15 (empirical)
>> >> system4: 3.96 (amber20), 1.29 (amber22), 5.59 (empirical)
>> >> system5: 2.58 (amber20), 3.39 (amber22), 5.82 (empirical)
>> >
>> >
>> >Never trust a single run. You likely need multiple replicates
>> The original post already indicated that 5 replicates had been made. But
>> it might be useful to know how different the individual runs are from the
>> averages shown above. That would give a bit of a clue about how much the
>> sampling errors might be.
>> The input file had a bit of a funny comment:
>> nstlim=700000,!5ns production run
>> nstlim shows a 0.7 ns run, but the comment says 5 ns. We don't know if
>> you made many such runs for each lambda value or not.
>> The key point (I think) is that amber20 and amber22 seem to be giving
>> different results. (The empirical results are so far from either Amber
>> results that I would set them aside for the moment.)
>> One other test: with the same restart file, and fixing the ig value, run
>> parallel short runs with Amber20 vs Amber22, printing every step (ntpr=1).
>> Do you see any differences between the two codes?
>> I'm cc-ing this to Taisung in case he has thoughts about whether any
>> defaults in Amber22 are different from those in Amber20. But carefully
>> compare an mdout file from the two programs (say with vimdiff or some
>> other editor that can look at two files at once). Look at the output
>> before the first step with an eagle eye, to try to spot any relevant
>> difference between the two sets of runs.
>> ....regards...dac
>> _______________________________________________
>> AMBER mailing list
Received on Thu Feb 23 2023 - 11:00:02 PST