Re: [AMBER] Parametrization of Virtual sites in Halogens from David Cerutti on 2021-03-05 (Amber Archive Mar 2021)

From: David Cerutti <dscerutti.gmail.com>
Date: Fri, 5 Mar 2021 10:28:51 -0500

You're in luck! The next version of Amber is going to have all the virtual
site types that mdgx can deal with, printable into topology files that
pmemd can read. Probably by the time you are ready with a model, the new
Amber will be out and you will then be able to run your models in
pmemd.cuda at full speed.

So, with that, what you should do is find some way to make Gaussian
cubegen-format files with the molecular conformation and the detailed
electrostatic potential grid (if what you like is Gaussian). The mdgx
program can kind of help you do this, but I've kind of left the user to
their own devices unless they want IPolQ charges (again, the upcoming
version of Amber will fix some of this, but please don't wait for that just
to get going on your electrostatic potential calculations). You will need
a few different conformations of your molecule, unless it's just plain
rigid like some of those SAMPL biaryls.

With your Gaussian cubegen format files, you have the locations of actual
atoms and the electrostatic potential that they should produce. Now you
need an Amber topology file that describes your system. That's an
antechamber task, but I suspect you have that under control. If you want
to experiment with generating different conformations of your molecule,
perhaps to create additional electrostatic potential maps for other states
of the molecule, see this tutorial (this will work with the currently
distributed version of mdgx): Using mdgx to manipulate small molecules
(ambermd.org) <http://ambermd.org/tutorials/basic/tutorial6/index.php>

To add virtual sites to your molecule, you need to add a &rule file. The
&rule namelist WILL look like this in the new mdgx:

cat > BRRules.ept << EOF
&rule
  atom Br1
  residue LIG
  eptype NONE
  q -0.379221
&end

&rule
  parent Br1
  frame2 C4
  epname V1
  residue LIG
  eptype FD-2
  v1e -0.8
  q 0.2775
&end
EOF

For the current version of mdgx, you may actually need to look in the
VirtualSites.c code file to see what you specify. You still have "parent"
and "frame{1,2,3}" keywords. I think you still have eptype, or maybe it's
just "style," but you will need to specify it by a number rather than the
more intuitive strings like "FD-2" (fixed distance, 2-atom). FD-2 is in
fact a good option for those halogen sigma holes, but until you get the new
mdgx you will use "Flex-2" which we currently call type 1 (fixed distance
with v1e = -0.8 means that the virtual site is always -0.8Angstrom from Br
to C4, which places it out beyond the Bromine atom, whereas flexible
distance v1e = -0.5 would put it in about the same place, negative 1/2 the
Br-C4 bond distance out beyond the Bromine atom). Type 2 is an equivalent
three-atom frame that you could use to build TIP4P water, for example. You
will have frame geometries specified by keywords like v1e. You can look in
the Amber manual for a discussion of using mdgx to add virtual sites--it
will go through all of the keywords in the format that you are used to.
But you can also grep for these keywords in the codebase to see where they
are parsed and thus get an idea for what values it will accept. You can
also alter the non-bonded parameters of an existing atom by saying it's
eptype 0 (not "NONE", that is the intuitive string that you'll get in the
upcoming mdgx). You can thereby give it "q" and a value to alter its
charge. The &rule namelists above are designed to alter bromobenzene into
something with a sigma hole virtual site.

Now, how to get virtual sites onto your molecule? Well, think of this
&rule file as part of your topology for now (because in the current version
of mdgx, that's all it is--the topology will get read, then this rule file
will get read, and mdgx will go through applying each of the &rules to
whatever residues and named atoms match. It will modify the topology
inside the program's memory to include new particles and the appropriate
frames. This is done by including the file with the -xpt keyword on the
command line or, as I prefer, in the &files namelist. It's like "mdgx -O
-i mdin -c inpcrd -p prmtop -xpt YourRules.txt" if that seems familiar.
The &files namelist is like this:

cat > mdin << EOF
&files
  -p prmtop
  -c inpcrd
  -xpt YourRules.txt
&end

&cntrl
   cut = 9.0, temp0 = 300.0, ntt = 3,
   nstlim = 1000, ntpr = 50, ntwx = 500,
&end
EOF

Then you would just do mdgx -O -i mdin and it would read prmtop, dress it
up with virtual sites, and then do the dynamics (at an abysmally slow CPU
PME pace). When it prints the trajectory, the virtual sites won't even be
part of the trajectory, but they will be part of the dynamics. For
instance, you can turn TIP3P water into TIP4P-Ew and observe that the bulk
properties come out to the TIP4P-Ew results over a few ns of simulation
(the density of each water model is in fact slightly different, as are
things like heat capacity and diffusivity). The next release of Amber will
let you write topologies with mdgx that include the virtual sites, which
can then be read by pmemd and pmemd.cuda to run at full speed on GPUs. It
will be done with mdgx input files like this (refencing BRRules.ept from
above):

&files
  -p BrBenzene.prmtop
  -c BrBenzene.inpcrd
  -xpt BrRules.ept
&end

&parmedit
  newinpcrd BrBenzeneEP.prmtop
  newprmtop BrBenzeneEP.inpcrd
&end

The &parmedit namelist is very short, not many keywords, but as with all
things mdgx you can run mdgx -YOUR_NAMELIST_OF_INTEREST on the command line
(namelist in all caps) and it will print out information matching what's in
the manual on what keywords are available. The new inpcrd will provide
placements of your EPs so that you can see what you made and visually check
that it is what you want.

So, recapping, you can get started right now parameterizing your virtual
sites and you will be able to make topologies with them to run with the
full power of AmberMD with the release of Amber21. The only thing now is
to understand how to fit those virtual sites. You will make a &rules file
as before with the frame types and frame geometries that you want (the
frame style, value of v1e, for example). Give them charges of zero to
begin with. The important thing is that you construct the virtual sites.
Then, you will do some mdgx charge fitting exercises akin to what is
here: Implicitly
Polarized Charge Tutorial (ambermd.org)
<http://ambermd.org/tutorials/advanced/tutorial28/index.php>. Much of that
tutorial will be unnecessary for standard RESP charges. You will specify
resp rather than ipolq in the fitq namelist and give it only your one
Gaussian cubegen file of the molecule's electrostatic potential in vacuum.
But the rest is similar. With -ept YourRules.txt in the
&files namelist and a topology that fits with the atoms in your Gaussian
cubegen file, mdgx will extract the coordinates of atoms from the cubegen
file (it knows the various Bohr / Angstrom conversions in needs to do) and
dress it up with your extra points before doing the RESP fitting. You can
then read the optimized charges for your extra points out of the resulting
mdout / YourOutput.txt file. For years, what I did to run simulations like
you saw in that SAMPL paper was to take those fitted charges, copy them
into the YourRules.txt file, and then feed it into an mdgx simulation.
But, again, soon you'll be able to save your results in a more permanent
fashion and use pmemd!

As I've been implying, the syntax of your &rules file will be changing a
bit, but the new program has all of the old virtual site types and more so
you won't lose any work. I'm a big fan of virtual sites, and you can add
as many as you want (within reason, the new pmemd may crash if you try to
specify more than 2-3 per atom all over a big protein). But you should be
able to play a lot with the different frames once you get the hang of it,
and I'd expect your errors in reproducing the ESP (see the mdout after the
charge fitting is done) to go down about 25% with a single sigma hole and
up to 40% if you apply them more generously.

HTH,
Dave

On Fri, Mar 5, 2021 at 4:41 AM Stefano Motta <stefano.motta.unimib.it>
wrote:

> Dear Prof. Cerutti
>
> I read your work entitled "*Alchemical prediction of hydration free
> energies for SAMPL"* and I found it very inspiring. We are working with
> chlorinated organic molecules, and I would like to include virtual sites on
> halogens, as proposed in your work. Could I have more details about the
> process used to obtain the charges with RESP? I computed the electrostatic
> potential with Gaussian, but how can I insert virtual sites that are
> recognized as additional charge points with RESP?
> I first tried to contact Prof. Mobley, but he told me to contact you as
> you did this part of the work.
>
> Thank you in advance for your availability,
>
> Sincerely,
>
> Stefano Motta
> --
> ________________________________________________________________
>
> Stefano Motta PhD
>
> Email: stefano.motta.unimib.it <s.motta17.campus.unimib.it>
> *Field of study:* Molecular Modeling, and Proteins Molecular Dynamics
>
> Università degli Studi di Milano Bicocca
> Department of Earth and Environmental Sciences
>
> Piazza dell'Ateneo Nuovo, 1 - 20126, Milano (Italy).
> _________________________________________________________________
>
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Fri Mar 05 2021 - 07:30:02 PST