Re: [AMBER] QM method used for force constants derivation in GAFF2 ?

From: Marek Maly <marek.maly.ujep.cz>
Date: Fri, 27 Jan 2017 01:02:29 +0100

Hello Junmei,

first of all thanks a lot for complex answer !

Nevertheless I have some more comments/questions and I would be grateful
for your additional comments.

#1
As we agreed GAFF2 is a "bit less" compatible with the actual AMBER
protein, lipid, nucleid acid
force fields, than GAFF, which is clearly connected with the quite
different way of parameters (at least some) generation. So as this
situation is not ideal for simulations of mixed systems (drug/protein,
drug/lipid, drug/nucleid acid), would be probably good idea to
reparameterize in the future also the AMBER bio-force-fields using the
same strategy which was used for GAFF2, especially if extensive evaluation
of GAFF2 ff will in many concrete systems show better results (comparing
with experiment) than actual version of GAFF. Do you agree ?

#2
What is your opinion regarding mixing of GAFF2 with the actual AMBER
bio-force fields ? I think that here the most important is, how much are
given force fields compatible in nonbond terms parameterisation (vdw,
electrostatic) and this point is probably OK here.
I did not checked this in detail but for example vdw terms for c3 are
pretty similar in both ff.

GAFF
    c3 1.9080 0.1094 OPLS
GAFF2
    c3 1.9069 0.1078

You wrote that you are doing a set of evaluation for GAFF2 now, so I
assume that part of this evaluation was for sure devoted to simulation of
mixed systems drug/(biomolecule) where differences in using GAFF2 or GAFF
(for drug) was carefully analyzed (conformations of the drug on the
binding site, conformation of molecular complex, binding energy) and
eventually compared with experiment.

It is clear to me, that you cannot say here any details about those
evaluation results before they are published, but can you at least tell
us, just as far as you know now, if using of GAFF2 in combination with the
actual AMBER bio-force fields is really safe ? I mean if you, based on
your actual knowledge, can recommend it or if you rather recommend to use
the latest version
(1.8) of GAFF ?

#3
This change of parameterization strategy unfortunatelly creates some
incompatibility problem (I hope not so big) also when one would like to
add new parameters for new atoms/atom types to GAFF2 using paramfit
routine as it only allows to fit using energies or forces. I think that
more compatible will be now (to add new parameters to GAFF2) to use here
forces fitting which is unfortunately still available only for commercial
gaussian outputs.
What is your opinion here ? Which are plans here for the near future ?
Some extension to use QM vibration frequencies for fitting or at least to
make accessible force fitting for those who uses free GAMESS instead
GAUSSIAN ? BTW you wrote "The ab initio frequencies were obtained at
B3LYP/6-31G* level." To be frank I would expect a little more accurate
approach, but I am definitely not specialist here and you probably found
that this is quite enough comparing to experimental frequencies where
available. But my question is, if one uses paramfit to add new parameters
it is automatically the best choice also B3LYP/6-31G* or if I use a bit
more precise theory like MP2/6-31G** it is also OK or even better ?

#4
Regarding that c3-c3 bond I did some ff searching. Mostly I found K value
close to GAFF value (i.e. arround 300). Here I would like to emphasize
GLYCAM case:

  GLYCAM6j
Cg-Cg 310.0 1.520 Butane (gauche, and trans)

where the K is 310 kcal/mol/A*A (so similar to GAFF and different from
GAFF2 value) in spite
the fact that in AMBER16 manual page 41 we can read:


"The GLYCAM-06 force field has been validated against quantum mechanical
and
experimental properties, including: gas-phase conformational energies,
hydrogen bond energies, and vibrational frequencies; solution-phase
rotamer populations (from NMR data); and solid-phase vibrational
frequencies and crystallographic unit cell dimensions."

So this force field was clearly validated also using "vibrational
frequencies" and still we have here c3-c3 K = 310 which is so much
different from GAFF2 value (232.52) which was derived using vibrational
frequencies fitting. Do you have any explanation here ?

On the other hand I found that in CHARMM they are very close to GAFF2 K
value:

CHARMM36
-----------------------
CARBOHYDRATES
CC311 CC311 222.50 1.500 ! par22 CT1 CT1 ( CC311 generic acyclic
CH carbon)
CC312 CC312 222.50 1.485 ! adm 11/08, glycerol ( CC312 CH carbon
in linear polyols)
CC321 CC321 222.50 1.530 ! par22 CT2 CT2 ( generic acyclic CH2
carbon (hexopyranose C6) )


PROTEINS
CT2 CT2 222.500 1.5300 ! ALLOW ALI (CH2)
CT3 CT2 222.500 1.5280 ! ALLOW ALI (CH3)

NUCLEIC ACIDS
CN8 CN8 222.50 1.528 !Alkanes, sacred (Nucleic acid carbon
(equivalent to protein CT2))
CN8 CN9 222.50 1.528 !Alkanes, sacred (Nucleic acid carbon
(equivalent to protein CT3))

GENERAL FF
CG321 CG321 222.50 1.5300 ! PROT alkane update, adm jr., 3/2/92


--------------------------

And for example this sentence from the basic CHARMM article:

"Bond, valence angle, Urey-Bradley, improper and torsion
force constants are based on MP2/6-31G(d) vibrational
spectra."

clearly explain this similarity. To be frank after I found this I started
to trust a little more in GAFF2 than before as CHARMM is for sure not a
bad force field :))

But anyway here is nicely seen compatibility of the equivalent atomtypes
across the all CHARMM specialised ff (carbohydrates, proteins, nucleid
acids, general ff) which was so nicely seen also in AMBER before GAFF2
appeared :))


#5
It is "very nice" that in Amber manual (page 33) is recommendation to use
  GAFF2 for organic molecules/ligands, but to be frank I did not find any
information about this new
  force field in manual.

  In chapter "15.6.2. New Development of GAFF" there are clearly described
just some last corrections to GAFF.

  So in my opinion would be appropriate to add here at least some short
info similar to that which is
  written at the end of the gaff2.dat file, which I additionally found just
by accident.

So thank you in advance very much for your eventual additional comments.

   Best wishes,

       Marek





Dne Thu, 26 Jan 2017 19:04:30 +0100 Junmei Wang <junmwang.gmail.com>
napsal/-a:

> In GAFF, the bond stretching and bond angle bending force constants come
> from protein/nucleic acid force fields (FF94, FF99) whenever possible.
> For
> those not having any counterparts in FF94/FF99, we used a set of
> empirical
> formulae (please refer to the GAFF paper) to calculate the force
> constants. The parameters of those formulae were derived to reproduce the
> existing force constants in FF94/FF99.
>
> However, in GAFF2, we adopted a different philosophy to derive those
> parameters (26 in total). We optimized those parameters to minimize the
> differences between the calculated MM frequencies and the ab initio ones
> (the training set has 572 small molecules and 22407 frequencies in
> total).
> We also tested the best parameter set with a large test set (711
> molecules,
> 29000 frequencies). The ab initio frequencies were obtained at
> B3LYP/6-31G*
> level. We could reduce the RMS errors from 120 cm-1 for GAFF to 60 cm-1
> for
> both the training and test sets.
>
> We are now doing a set of evaluation for GAFF2, which will be a part of
> the
> GAFF2 paper. I agree with you that GAFF is more consistent with the
> current
> protein/nucleic acid force fields. And the difference between GAFF and
> GAFF2 for c3-c3 is very big.
>
> Hopefully this can help you to choose a proper version of GAFF force
> field
> in your study.
>
> Junmei
>
> On Tue, Jan 24, 2017 at 7:24 PM, Marek Maly <marek.maly.ujep.cz> wrote:
>
>> Thank you !
>>
>> BTW that difference in c3-c3 force constant GAFF vs GAFF2 is really big
>> 300.9 vs 232.52 I would say. To be frank I am a bit confused with such a
>> big change moreover in such basic/common atomtype bonding term.
>> This indicates that methodology and/or set of molecular fragments which
>> were used for fitting in GAFF2 case had to be quite different from
>> methodology or mol. fragments used in GAFF case.
>>
>> I am also a bit confused with the fact that the same term in actually
>> recommended protein force field ff14SB has still force constant close to
>> old GAFF value:
>>
>> CT-CT 310.0 1.526 JCC,7,(1986),230; AA, SUGARS
>>
>> So old GAFF seems to me to be a more compatible with ff14SB from this
>> point of view than with the new GAFF2, which is a bit strange or not ?
>>
>> BTW the same trend holds if we compare here GAFF/GAFF2 with LIPID14 ff.
>>
>> cA-cA 303.1 1.5350 Lipid11 v1.0 (GAFF c3-c3)
>> cD-cD 303.1 1.5350 Lipid14 v2.0 (GAFF c3-c3)
>>
>> again we are much closer here to GAFF values.
>>
>> The same trend holds for vdW terms:
>>
>> GAFF
>> c3 1.9080 0.1094 OPLS
>> GAFF2
>> c3 1.9069 0.1078
>> FF14SB (parm10)
>> CT 1.9080 0.1094 Spellmeyer
>> LIPID14
>> cA 1.9080 0.1094 OPLS
>> cD 1.9080 0.1094 OPLS
>>
>>
>> So it seems that if one is going to simulate some heterogenous system
>> (e.g. protein + drug) it should be more suitable to use still old GAFF
>> (not the new GAFF2) as GAFF is clearly more compatible
>> with bio-force-fields than GAFF2 or am I wrong here ?
>>
>>
>> Best wishes,
>>
>> Marek
>>
>>
>>
>> Dne Tue, 24 Jan 2017 19:44:34 +0100 David Case <david.case.rutgers.edu>
>> napsal/-a:
>>
>> > On Tue, Jan 24, 2017, Marek Maly wrote:
>> >>
>> >> I have technical question regarding QM methods used in GAFF2
>> >> development.
>> >
>> > cc-ing to Junmei Wang, who doesn't always follow the mailing list.
>> Note
>> > that
>> > you may need to wait for the paper on GAFF2 to be published...
>> >
>> > ...dac
>> >
>> >>
>> >> In Amber16 manual, page 286, there is written:
>> >>
>> >> "We have performed B3LYP/6-31G* optimization for 15 thousands
>> >> marketed or
>> >> experimental ..."
>> >>
>> >>
>> >> So B3LYP/6-31G* is here QM method used to obtain optimized molecular
>> >> structures (low energy
>> >> conformations) from which equilibrium geometrical terms (eq. bond
>> >> lengths,
>> >> eq. angles) were obtained
>> >> and so in some cases old GAFF values were updated.
>> >>
>> >>
>> >> But in GAFF2, comparing to GAFF, changed clearly also some force
>> >> constants.
>> >> As an example I can mention c3-c3 bond:
>> >>
>> >>
>> >> In GAFF
>> >> c3-c3 300.9 1.5375 SOURCE1_SOURCE5 88072 0.0058
>> >>
>> >> In GAFF2
>> >> c3-c3 232.52 1.538 SOURCE1_SOURCE5 88072 0.0058
>> >>
>> >>
>> >> My question is, which QM methodology was used to derive (perhaps
>> using
>> >> paramfit ?) new GAFF2 bond, angle or dihedral force constants ?
>> >>
>> >> I assume, that since here we need just to calculate for each
>> >> conformation
>> >> only it's
>> >> energy (no optimization) and since we also need to obtain reliable
>> >> force
>> >> constants (hence we need accurate QM energies) some more accurate
>> method
>> >> was probably used like MP2/6-31G** , MP3/6-31G** etc. to derive QM
>> >> energies. Am I right ?
>> >>
>> >> So thank you in advance for QM methodology specification used in
>> GAFF2
>> >> for
>> >> force constants
>> >> derivation.
>> >>
>> >> Best wishes,
>> >>
>> >> Marek
>> >>
>> >
>> > _______________________________________________
>> > AMBER mailing list
>> > AMBER.ambermd.org
>> > http://lists.ambermd.org/mailman/listinfo/amber
>> >
>> > __________ Informace od ESET NOD32 Antivirus, verze virove databaze
>> > 14822 (20170124) __________
>> >
>> > Tuto zpravu proveril ESET NOD32 Antivirus.
>> >
>> > http://www.eset.cz
>> >
>> >
>> >
>>
>>
>> --
>> Tato zpráva byla vytvořena převratným poštovním klientem Opery:
>> http://www.opera.com/mail/
>>
>> _______________________________________________
>> AMBER mailing list
>> AMBER.ambermd.org
>> http://lists.ambermd.org/mailman/listinfo/amber
>>
> _______________________________________________
> AMBER mailing list
> AMBER.ambermd.org
> http://lists.ambermd.org/mailman/listinfo/amber


-- 
Tato zpráva byla vytvořena převratným poštovním klientem Opery:  
http://www.opera.com/mail/
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Thu Jan 26 2017 - 16:30:02 PST
Custom Search