Re: [AMBER] PCA_representation_in_Cpptraj from Daniel Roe on 2015-03-10 (Amber Archive Mar 2015)

From: Daniel Roe <daniel.r.roe.gmail.com>
Date: Tue, 10 Mar 2015 08:46:46 -0600

Hi,

On Tue, Mar 10, 2015 at 5:37 AM, Juan Eiros Zamora
<j.eiros-zamora14.imperial.ac.uk> wrote:
> parm *.prmtop
> trajin *.nc
>
> rms first
> average avg.nc
> run

So far so good, however I just realized there is a potential issue
with the next part:

> reference avg.nc [ref1]
> matrix covar name matrixdat .CA out covmat-ca.dat
> diagmatrix matrixdat out evecs-ca.dat vecs 10
> run

You load in your reference but you don't actually fit to it, so you're
just using the original frames from 'trajin'. This means that
rotational and translational movement will be included in your
covariance matrix which may not be what you want (they will likely
dominate your first 6 PCs). If all you are interested in is internal
motion you need an 'rms' command fitting to the reference prior to the
matrix calculation.

> projection modes evecs-ca.dat out pca-ca.dat beg 1 end 2 .CA
> run

Again, this is going to use the *original* trajectory data (from
'trajin'), which is OK in this case since you didn't rms-fit before
calculating the covariance matrix. However, if you do rms-fit before
calculating the matrix you will also need to either add an 'rms'
command before 'projection', or save the rms-fit frames as a COORDS
data set with 'createcrd'. If memory is not an issue I prefer the
latter. I would also do everything in memory, i.e. don't write files
out and read them back in (for 'projection' and 'hist'). So I would
use a script like:

parm *.prmtop
trajin *.nc
# Fit to first frame, create average structure, save coordinates
rms first
average avg.nc
createcrd CRD1
run
# Load average as reference, fit saved coordinates to reference
reference avg.nc [ref1]
crdaction CRD1 rms .CA ref [ref1]
# Calculate covariance matrix from rms-fit coords
crdaction CRD1 matrix covar name matrixdat .CA out covmat-ca.dat
# Diagonalize matrix, save eigenvectors as MyEvecs
runanalysis diagmatrix matrixdat out evecs-ca.dat vecs 10 name MyEvecs
# Project fit and saved coordinates along eigenvectors
crdaction CRD1 projection PROJECT modes MyEvecs beg 1 end 2 .CA
# Create histogram
runanalysis hist PROJECT:1 PROJECT:2 free 300 out fhist-all.CA.gnu bins 400

> Additionally, I am not quite sure how the last line of code works. I do
> not understand how changing the default bin size affects the
> calculation, should I just leave it as default (erase it from my script)
> if I want to analyze classical MD trajectory?

Jason explained all of this quite well. I will only add that as an
alternative to 'hist' in one dimension, you can create 1D histograms
in cpptraj with a Gaussian kernel density estimator using the 'kde'
command.

Hope this helps,

-Dan

>
> Thanks in advance,
>
> Juan Eiros
>
>
> On 09/03/15 14:07, Daniel Roe wrote:
>> Hi,
>>
>> You will need to modify your original cpptraj script for generating
>> PCs somewhat.
>>
>> On Mon, Mar 9, 2015 at 3:14 AM, abdennour braka
>> <abdennour.braka.univ-orleans.fr> wrote:
>>> parm prot_OBC.top
>>> trajin 1_replic_traj.trj
>>> reference min.rst
>>> rms reference
>>>
>>> matrix covar name matrixdat .CA out covmat-ca.dat
>>> analyze matrix matrixdat out evecs-ca.dat vecs 174
>>> analyze matrix matrixdat name evecs-ca vecs 174
>> You don't need two 'analyze matrix' commands here, just one will do
>> (and 'diagmatrix' is the preferred name for this analysis now):
>>
>> diagmatrix matrixdat name evecs-ca out evecs-ca.dat vecs 174
>>
>>> analyze modes fluct out analyzemodesflu.dat name evecs-ca beg 1 end 174
>>> analyze modes displ out analyzemodesdis.dat name evecs-ca beg 1 end 174
>> Your 'run' command should come here, *before* you attempt your
>> projection calculation since at this point nothing has been
>> calculated. So the next two commands should be:
>>
>> run
>> projection evecs evecs-ca out pca12-ca.dat beg 1 end 3 .CA
>>
>> Does that make sense to you? Let me know if not.
>>
>> As far as creating a weighted histogram, cpptraj can do this as well
>> given that you have your boost energies in KT. Assuming you have them
>> in a file called e.g. amdboost.dat in a single column, you could read
>> that and your previously-calculated PC projections and calculate the
>> reweighted 2D histogram like so:
>>
>> readdata amdboost.dat name BOOST
>> readdata pca12-ca.dat name PC
>> runanalysis hist PC:2 PC:3 amdboost BOOST bins 200 free 300 out
>> hist.pca12-ca.gnu
>>
>> The '2' and '3' for PC are used since pca12-ca.dat has data for the
>> first and second PC projections in columns 2 and 3. You can use 'list
>> dataset' after 'readdata' to see currently loaded data. Here no number
>> is needed for BOOST since there is only 1 column of data. The '.gnu'
>> extension will write the resulting histogram in gnuplot-readable
>> format.
>>
>> Note that this kind of reweighting is very straightforward. There are
>> other ways to reweight aMD data out there that you may want to check
>> out, notably here:
>>
>> http://mccammon.ucsd.edu/computing/amdReweighting/
>>
>> Hope this helps,
>>
>> -Dan
>>
>> run
>>> ###########################################
>>> Thank you four any replies,
>>>
>>> Abdennour
>>> //
>>>
>>> //
>>> _______________________________________________
>>> AMBER mailing list
>>> AMBER.ambermd.org
>>> http://lists.ambermd.org/mailman/listinfo/amber
>>
>>
>
> _______________________________________________
> AMBER mailing list
> AMBER.ambermd.org
> http://lists.ambermd.org/mailman/listinfo/amber

-- 
-------------------------
Daniel R. Roe, PhD
Department of Medicinal Chemistry
University of Utah
30 South 2000 East, Room 307
Salt Lake City, UT 84112-5820
http://home.chpc.utah.edu/~cheatham/
(801) 587-9652
(801) 585-6208 (Fax)
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber

Received on Tue Mar 10 2015 - 08:00:04 PDT