Re: [AMBER] CpHMD – Effect of explicit ions when calculating the pKa of model compounds

From: Jason Swails <>
Date: Wed, 8 Feb 2017 22:19:24 -0500

Hi Eric,

I don't check the Amber list nearly as often as I used to and only just saw
this question.

On Tue, Feb 7, 2017 at 8:22 AM, Eric Lang <> wrote:

> Hello,
> This post is partly a reply to my previous post but also extends on it.
> I decided to test more carefully the effect of having explicit Na+ and Cl-
> ions at a concentration of 0.1 M with the model compounds for Lys, His, Glu
> and Cys (shamelessly the titrable residues I am currently interested in).
> To do this, I prepared the systems ACE-X-NNE (where X is the titrable
> residue) and solvated them in a 20 A TIP3P solvent buffer and added 7 or 6
> explicit Na+ and Cl- ions (depending on the number of water molecules added
> to solvate the residues of interest). After minimisation, heating and
> equilibration, I ran a 10 ns CpHMD simulation at pH=pKa and then pH-REMD at
> pH = pKa ± 0.1 pKa ± 0.2 pKa ± 1.2 (as in the Jason’s paper on explicit
> solvent pH-REMD) for 20 ns. Results between CpHMD and pH-REMD are
> consistent and here I will just show the pH-REMD results.

​The first thing I would have suggested is precisely this :). In explicit
solvent especially, pH-REMD is absolutely critical to breaking out of local
minima and sampling phase space. This is discussed to some extent in the
original paper. Basically, the only way to get the ion to "let go" of the
titratable site is to let the configuration diffuse to a lower pH to force
protonation and then diffuse back up after the Na+ diffuses away (or vice
versa for Cl-). Another way of thinking about this is that large barriers
in configuration space sampling can be circumvented by traversing state
space (the thermodynamic state here being solution pH) in which the
barriers are much lower.

So this is consistent with what I've observed.

> Basically, I do not see significant differences in the calculated pKa
> whether I add explicit ions or not (_salt here means in the presence of 0.1
> M of explicit NaCl):
> Lys Lys_salt Cys Cys_salt Glu Glu_salt His
> His_salt
> pKa 10.35 10.38 8.35 8.34 4.82 4.82 6.31 6.31
> n 1.01 1.00 1.02 1.01 0.96 0.97 1.01
> 1.01
> I do find it a bit surprising as, after all, I would have expected that the
> presence of ions interacting with the titrable residues would make some
> differences as my early results suggested, nonetheless these much more
> thorough simulations suggest the effect is negligible.

​As Vinicius said, the only affect that the explicit ions could *possibly*
have on the calculated pKa is by modifying the configuration ensemble.
When you run short simulations that get stuck in a local minimum, these
changes can be pronounced for the model compounds (which is why you saw
large differences in your first simulations). When you sample more
completely, these differences largely disappear.

Please note that I did not include error bars by either dividing the
> pH-REMD trajectories into small chunks or by running multiple simulations.
> However, the good agreement between the predicted pKa values obtained from
> pH-REMD and from CpHMD (the values agree at a maximum of ± 0.03 pK units)
> suggests that dividing the pH-REMD into chunks would lead to similar
> variations.
> However, I am now a bit concerned by the shift between calculated and
> theoretical pKa, especially in the case of Glu (4.8 calculated instead of
> the 4.4 it has been parametrised for), and to a lesser extend His (6.3 vs.
> 6.5). Are those kinds of deviations within the accuracy of the method? I
> would have expected deviations of no more than ± 0.1 pK units.

A very good level of accuracy for CpHMD is to be within 0.8-0.9 pK units
(although for the model compounds it should be better). The glutamate
result is a bit worrisome, but no exceptionally so. The reference energy
may need some level of correcting (the second set of simulations you ran
are more thorough than the set I used to tweak the reference energies for
explicit solvent), but the correction should not be that large.

Histidine is another story. The 6.5 value stored in was one of
the two sites. The other site has a pKa of 7.1. So the pKa should be
something slightly less than 6.5 (which it is), since when the site with a
higher pKa is protonated, the other site can still be deprotonated with a
small probability.

In the case of Lys, Glu and His, the results are similar between the two
> force fileds, suggesting that ff14SB can be used for constant pH MD.

​This is good to know.

> However, the pKa of Cys is slightly higher with ff10 than with ff14SB,
> could it be somehow because the unprotonated Cys is not defined in the
> standard ff14SB?


> Nonetheless the pKa of Glu is still quite high compared with the
> theoretical one. Could it be a problem with the parametrisation of GL4 in
> amber 16?

​The reference energy may benefit from some adjustment.​

I would really appreciate if you could share your opinion on either or not
> these differences should be considered problematics and if some fine tuning
> of the parameters / reparametrisation would be useful.

​It sounds like fine-tuning the parameters would be helpful. It can
actually be done in a single step. Knowing that the total reference energy

Ref Energy = TI - R*T*ln(10)*pKa

If we want to solve for the "correct" TI energy, you can isolate that by

TI = Ref Energy + R*T*ln(10)*pKa

So in your case, take the Reference energy from the cpin file, add
0.00199*300*log(10)*4.82, and that value is what the TI energy *should*
have been. You can then insert that value back into
where refene2.solvent_energies() is set for the GL4 residue.

Recompile, regenerate your cpin file, and rerun your calculations and you
should see that the pKa comes out more or less perfect.

Hope this helped,

Jason M. Swails
AMBER mailing list
Received on Wed Feb 08 2017 - 19:30:03 PST
Custom Search