Re: [AMBER] SPAM: A Simple Approach for Profiling Bound Water Molecules

From: Jason Swails <jason.swails.gmail.com>
Date: Fri, 24 Jan 2014 08:47:17 -0500

On Fri, 2014-01-24 at 11:06 +0100, Sergey Samsonov wrote:
> Dear Jason,
>
> thanks a lot for giving this detailed answer and for providing the
> source code.
>
> I tried to use SPAM within cpptraj with AT13 for AMBER trajectory. Just
> for the purpose of testing I extracted 10 TIP3 water molecules around an
> ion (by closest command). Here is the command I'm using then after
> creating peak.xyz with volmap:
>
> spam peak.xyz solv WAT cut 12 site_size 2.5 bulk -30 summary summary.out
> out datafile.out
>
> The values 12, 2.5 and -30 I took from your paper (in the paper it was
> -30.3 kcal/mol for SPCE, so in my case this is just an arbitrary value,
> and I should check first what it would be for TIP3) .
>
> In general, it looks to be working properly, though I still have several
> questions:
>
> - as an output I get two files: spam.info (with the information about
> frames corresponding to the peaks being occupied) and datafile.out (with
> per frame energetic (?) info corresponding to the peaks). However, I
> don't get a file summary.out created. Does it mean that I wrongly
> specified something in my input or it means that something didn't work
> properly?

Alas no. I simply haven't yet implemented the calculation of the SPAM
free energies from the interaction energies of the individual frames,
although it's on the to-do list. (You are beginning to see why SPAM is
currently undocumented :)). This option is more of a space-holder for a
future implementation. Part of the reason this wasn't implemented
before is that I didn't have ready access to a kernel density estimate
which is what we used in the original paper (and helps make the data
considerably smoother). This is no longer an excuse, though, as the
next version of cpptraj has implemented a Gaussian kernel density
estimator. Consequently, the "bulk" value is also currently ignored
since the bulk value is only used in computing the SPAM free energies
(which, as you pointed out here, is not yet implemented).

> Can I simply use datafile.out to calculate the whole energy
> myself (would it be in this case correct doing this by summing up all
> the columns and lines and dividing by frames and peaks numbers)?

Not quite -- you have to omit the frames in which a water site was
either unoccupied or doubly-occupied. The spam.info file prints out
this information, giving positive (unoccupied) and negative
(doubly-occupied) frame numbers for each water site that should be
omitted from the final calculation. Again, this file is something
designed for SPAM.py to parse and understand (and subsequently use when
computing SPAM free energies).

>
> - what [name <name>] in the help stands for?

This is the name of the data set that is generated. Cpptraj will use
this name when printing out data files to label the generated data sets.

> - what is changed in the calculations (or simply in the output?) if
> 'reorder' is included in the command?

This affects the trajectory file, actually. The original implementation
described in the paper used NAMD to compute the energies of each of the
waters in the water sites, but because water readily exchanges during
the simulation there is no way to know a priori which water molecule to
compute the interaction energy for for each site. The solution we chose
was to use cpptraj to exchange water indexes so that the _same_ water
molecule was always present in the same site (specifically, the first
water site was always occupied by the first water, second site by the
second water, etc.). This made it incredibly simple to use NAMD to
compute all of the energies for site 1 with a simple loop over the
reordered trajectory. If you visualize the reordered trajectory, you
should be able to see that every frame has the first water molecule in
virtually the same spot (except for those frames which are not computed
because the site is not occupied by a single water). Ultimately when
cpptraj computes the interaction energies the reordering is unnecessary,
but there was still interest in keeping this functionality around so we
can fall back on the original implementation outlined in the paper.

SPAM will be significantly easier to run when it is fully implemented in
cpptraj (although I plan on keeping SPAM.py around and updating it to be
fully compatible with the cleaner implementation to provide a helpful
front-end to SPAM calculations).

You may find some of the Python code in SPAM.py (and the spam package)
generally useful if you are familiar with Python and plan to write a
script to compute the SPAM free energies based on the cpptraj output.
To aid you in using the Python code I wrote there, I provided a link to
doxygen-generated HTML docs for the package
[https://www.dropbox.com/s/ur28i346bvlu0s2/spamdocs.tgz].

Guanglei and I are quite curious to see if others find the method
useful, so any feedback would be appreciated.

Thanks!
Jason

-- 
Jason M. Swails
BioMaPS,
Rutgers University
Postdoctoral Researcher
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Fri Jan 24 2014 - 06:00:04 PST
Custom Search