Re: [AMBER] PCA analysis and histogram

From: Daniel Roe <>
Date: Mon, 14 Apr 2014 09:20:46 -0600


I assume what you want is to calculate histograms of the projection of
your coordinates along the first two PCs? If so, you just need to use
the 'projection' action followed by 'hist' analysis. This can be done
with the following commands:

rms first :1-18@CA
projection modes evecs-ca_10.dat beg 1 end 2 @CA myproj
hist myproj:1 bins 200 out PC.hist.agr
hist myproj:2 bins 200 out PC.hist.agr

Modify the histogram commands as needed (e.g. put both data sets in
the same 'hist' command to get a 2D histogram). Note that the
coordinates and mask you use for the 'projection' command should be
the same as when you generated the original covariance matrix. The
'center'/'image' commands are not needed, however, since the 'rms'
command changes the coordinates of interest anyway.
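
To make concrete what a projection followed by a histogram computes,
here is a minimal Python sketch (standard library only). It is not
cpptraj's implementation: the function names and the toy data are
made up for illustration, the binning convention is an assumption,
and any mass weighting is ignored.

```python
from collections import Counter

def project(frames, average, modes):
    """Project each frame (flat list of coordinates) onto each mode.

    Returns one list of projection values per mode.
    """
    projections = [[] for _ in modes]
    for frame in frames:
        # Displacement of this frame from the average structure.
        delta = [x - a for x, a in zip(frame, average)]
        for k, mode in enumerate(modes):
            projections[k].append(sum(d * m for d, m in zip(delta, mode)))
    return projections

def bin_index(value, lo, hi, nbins):
    """Index of the bin holding value; the maximum goes in the last bin."""
    if hi == lo:
        return 0
    i = int((value - lo) / (hi - lo) * nbins)
    return min(i, nbins - 1)

def histogram_1d(values, nbins):
    """Counts per bin between min(values) and max(values)."""
    lo, hi = min(values), max(values)
    counts = [0] * nbins
    for v in values:
        counts[bin_index(v, lo, hi, nbins)] += 1
    return counts

def histogram_2d(xs, ys, nbins):
    """Joint histogram of two projections, as {(ix, iy): count}."""
    xlo, xhi = min(xs), max(xs)
    ylo, yhi = min(ys), max(ys)
    return Counter(
        (bin_index(x, xlo, xhi, nbins), bin_index(y, ylo, yhi, nbins))
        for x, y in zip(xs, ys)
    )

# Toy data (hypothetical): 4 frames of a 2-coordinate "system" and
# two orthonormal modes.
frames = [[1.0, 0.0], [-1.0, 0.0], [0.0, 2.0], [0.0, -2.0]]
average = [0.0, 0.0]
modes = [[0.0, 1.0], [1.0, 0.0]]

pc1, pc2 = project(frames, average, modes)
hist1 = histogram_1d(pc1, 4)         # one data set: 1D histogram
hist12 = histogram_2d(pc1, pc2, 2)   # two data sets together: 2D histogram
```

Histogramming each projection separately gives two 1D distributions,
while passing both together gives the joint distribution over the
PC1/PC2 plane.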

This entire procedure can actually be done in a single cpptraj run by
saving the coordinates as a COORDS data set and making a few passes
over it with 'crdaction' like so:

rms first :1-18@CA
average avgStruct.rst7
createcrd crd1 # Creates COORDS data set crd1
run # Process the trajectory now so crd1 and the average structure exist
reference avgStruct.rst7.1
crdaction crd1 rms reference :1-18@CA
crdaction crd1 matrix out matrix.dat name mymatrix covar @CA
runanalysis diagmatrix mymatrix out evecs.dat vecs 10 name mymodes
crdaction crd1 projection modes evecs.dat beg 1 end 2 @CA myproj
hist myproj:1 bins 200 out PC.hist.agr
hist myproj:2 bins 200 out PC.hist.agr

Note that here I am RMS-fitting to an averaged structure, which should
do a better job of eliminating global rotations. (I assume you have a
reason for making the fitting mask and the matrix mask different, so I
kept that in my example.) Also, in the upcoming release of cpptraj you
will be able to use the 'modes' data set generated by the
'diagmatrix' command directly with the 'projection' action rather than
having to read it back in from a file.
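
The multi-pass idea behind the single-run approach (keep the frames in
memory, then compute the covariance matrix, diagonalize it, and
project) can be sketched in plain Python. This is only an
illustration under stated assumptions: the RMS fit is omitted, the
function names and toy frames are hypothetical, and power iteration
with deflation stands in for a real eigensolver. It is not how
cpptraj diagonalizes the matrix.

```python
import math

def covariance(frames):
    """Average structure and covariance matrix of flat coordinate frames."""
    n, dim = len(frames), len(frames[0])
    avg = [sum(f[i] for f in frames) / n for i in range(dim)]
    cov = [[sum((f[i] - avg[i]) * (f[j] - avg[j]) for f in frames) / n
            for j in range(dim)] for i in range(dim)]
    return avg, cov

def top_modes(cov, nvecs, iters=200):
    """Leading (eigenvalue, eigenvector) pairs by power iteration."""
    dim = len(cov)
    mat = [row[:] for row in cov]
    pairs = []
    for k in range(nvecs):
        vec = [1.0 / (i + 1 + k) for i in range(dim)]  # arbitrary start
        for _ in range(iters):
            new = [sum(mat[i][j] * vec[j] for j in range(dim))
                   for i in range(dim)]
            norm = math.sqrt(sum(x * x for x in new))
            if norm == 0.0:
                break
            vec = [x / norm for x in new]
        # Rayleigh quotient gives the eigenvalue estimate.
        val = sum(vec[i] * mat[i][j] * vec[j]
                  for i in range(dim) for j in range(dim))
        pairs.append((val, vec))
        # Deflate: remove the mode just found before finding the next.
        for i in range(dim):
            for j in range(dim):
                mat[i][j] -= val * vec[i] * vec[j]
    return pairs

# Toy "COORDS set" (hypothetical): frames are read once and kept.
crd = [[2.0, 0.0], [-2.0, 0.0], [0.0, 1.0], [0.0, -1.0]]

# Pass 1: average structure and covariance matrix.
avg, cov = covariance(crd)
# Diagonalize and keep the two leading modes.
modes = top_modes(cov, 2)
# Pass 2: project every stored frame onto each mode.
proj = [[sum((x - a) * m for x, a, m in zip(f, avg, vec)) for f in crd]
        for _, vec in modes]
```

The point of storing the frames once is that each later step (fit,
matrix, projection) is just another pass over the same in-memory data
instead of another read of the trajectory from disk.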

Hope this helps,


On Mon, Apr 14, 2014 at 5:45 AM, Neha Gandhi <> wrote:
> Hi List,
> I have run a PCA using the following commands, which write an output file containing 10 modes.
> center origin :1-18
> # now image the whole system about the centered origin
> image origin center
> rms first :1-18@CA
> matrix covar name covmat out covmat-ca_10.dat @CA
> analyze matrix covmat out evecs-ca_10.dat vecs 10
> This seems like a trivial question. How do I read these modes (PCs 1 and 2)
> to define data sets for the hist (histogram) command in cpptraj?
> Many thanks,
> Neha
Daniel R. Roe, PhD
Department of Medicinal Chemistry
University of Utah
30 South 2000 East, Room 201
Salt Lake City, UT 84112-5820
(801) 587-9652
(801) 585-6208 (Fax)
Received on Mon Apr 14 2014 - 08:30:04 PDT
