Re: [AMBER] handling long trajs for PCA and components plot

From: Daniel Roe <daniel.r.roe.gmail.com>
Date: Thu, 8 May 2014 09:04:14 -0600

Hi,

The projection of a coordinate frame along a certain eigenvector
essentially tells you how much that frame contributes to that
particular principal component. Each of the dots in the principal
component projection plots in Figs 3/4 represent the PC projection
value for PCs 1 and 2 for a single frame. In each frame you have N
atoms, but 3N coordinates, which matches the dimensions of the
covariance matrix.

Also note that you have to be careful when comparing eigenvectors
obtained from covariance matrices of different trajectories, as they
are not guaranteed to be the same. I didn't see any discussion of this
in my (admittedly brief) glance at the referenced paper. You can
either check if the eigenvectors are aligned manually, or generate a
covariance matrix from the combined trajectories, then calculate your
projections for each trajectory separately.

Hope this helps,

-Dan

On Wed, May 7, 2014 at 4:27 PM, newamber list <newamberlist.gmail.com> wrote:
> Hi
>
>>> If you are referring to figures 3/4 in the above reference
>
> Yes, e.g. Fig 3A and 3B. I can get 3A as:
>
> crdaction crd1 projection modes evecs.dat out myproj.txt beg 1 end 2
> :1-328.N. I have some confusion about calculation. Once you get
> Eigenvectors for Covar Matrix of 3Nx3N dimension then you calculate new
> data as e.g. just for EV1 and EV2 as:
>
> #Fx2 #2x3N
>
> |(xi1-Xi) (xk-Xk)| |EV1 in Row|
> : : | |EV2 in Row|
> : : |
> |(xiF-Xi) (xiF-xk)|
>
> where x represents coordinate and X average. Does xi and xk would
> necessarily mean the first and second dimension only (which happens to be
> first two coordinates?) I know its wrong how xi and xj is chosen?
>
> Still for Figure 3B I don't understand how its calculated as Eigenvector
> has a dimension of 3N but I have NxF atoms (where N is number of atoms and
> F is Number of frames)
>
> thanks
> Jiom
>
>
>
>
> On Wed, May 7, 2014 at 3:52 PM, Daniel Roe <daniel.r.roe.gmail.com> wrote:
>
>> On Tue, May 6, 2014 at 10:21 PM, newamber list <newamberlist.gmail.com>
>> wrote:
>> > Its working now but I am not sure if this is how people calculate/present
>> > their results. Usually in papers they show displacements along one
>> > eigenvector (e.g.
>> >
>> http://pubs.rsc.org/en/Content/ArticleLanding/2014/MB/c4mb00177j#!divAbstract
>>
>> If you are referring to figures 3/4 in the above reference, then what
>> you want to calculate is the projection of your coordinates along
>> certain eigenvectors - see the 'projection' command in CPPTRAJ. The
>> 'modes displ' just calculates total XYZ displacement for all
>> eigenvectors.
>>
>> -Dan
>>
>> > )
>> >
>> > I get displX, Y and Z but I was expecting some net displacement which
>> > simply I think should be sqrt(x^2+y^+z^2) and also for each individual
>> > eigenvector I should do something like this:
>> >
>> > modes name evecs.dat displ out dis1.dat beg 1 end 1
>> > modes name evecs.dat displ out dis2.dat beg 2 end 2
>> >
>> > 2)
>> >>> If the only goal is to get two structures that look the least like each
>> > other, one way is to calculate the 2D RMSD
>> >
>> > This is nice suggestion indeed. I was following some previous amber
>> archive
>> > mails to generate procupine plots which needs two extreme pdbs in
>> > discussion it was not mentioned how one can get two extremes.
>> >
>> > best regards
>> >
>> >
>> > On Wed, May 7, 2014 at 4:27 AM, Daniel Roe <daniel.r.roe.gmail.com>
>> wrote:
>> >
>> >> On Tue, May 6, 2014 at 8:13 PM, newamber list <newamberlist.gmail.com>
>> >> wrote:
>> >> > runanalysis diagmatrix myCovar out evecs.dat vecs 15
>> >> > readdata evecs.dat
>> >> > modes displ out dis.dat name myCovar beg 1 end 3 :1-328.N
>> >> ###
>> >>
>> >> The problem here is that you are trying to use 'myCovar' (which is a
>> >> matrix) as your modes data set, which by default will be called
>> >> 'evecs.dat' since you did not supply a name. Try using 'modes name
>> >> evecs.dat ...'.
>> >>
>> >> > 2) Also how one can obtain two extreme configurations of system for
>> whole
>> >> > traj? Just by using lowest and highest rmsd with respect to average
>> >> > structure or something one can do with PCA?
>> >>
>> >> If the only goal is to get two structures that look the least like
>> >> each other, one way is to calculate the 2D RMSD and find the pair of
>> >> frames with the largest RMSD. You will probably want to use the
>> >> 'nosquare2d' option with standard data output to make searching
>> >> easier, e.g.
>> >>
>> >> 2drms .CA out rms2d.dat
>> >> datafile rms2d.dat nosquare2d
>> >>
>> >> Hope this helps,
>> >>
>> >> -Dan
>> >>
>> >> >
>> >> > thanks
>> >> > JIom
>> >> >
>> >> >
>> >> >
>> >> > On Tue, May 6, 2014 at 6:24 PM, Daniel Roe <daniel.r.roe.gmail.com>
>> >> wrote:
>> >> >
>> >> >> Hi,
>> >> >>
>> >> >> On Mon, May 5, 2014 at 5:23 PM, newamber list <
>> newamberlist.gmail.com>
>> >> >> wrote:
>> >> >> > 2) Also in many papers there are Principal component plots. If am
>> not
>> >> >> > wrong these are same obtained from 'projection modes' as shown
>> below
>> >> >> > (myproj.txt). Or its something else?
>> >> >>
>> >> >> The plots you refer to are actually histograms generated using the
>> >> >> 'hist' command of the principal component projections obtained from
>> >> >> the 'projection' command, see the bottom of the script in
>> >> >>
>> >> >>
>> >>
>> http://pubs.acs.org/doi/suppl/10.1021/ct400862k/suppl_file/ct400862k_si_001.pdf
>> >> >> .
>> >> >>
>> >> >> > trajin 1trj.nc
>> >> >> > trajin 2trj.nc
>> >> >> > rms first :1-328.N
>> >> >> > average Avg.rst7 ncrestart
>> >> >> > createcrd crd1
>> >> >> > run
>> >> >> >
>> >> >> > reference Avg.rst7.1 [avg]
>> >> >> > crdaction crd1 rms ref [avg] :1-328.N
>> >> >> > crdaction crd1 matrix covar :1-328.N name myCovar
>> >> >> > runanalysis diagmatrix myCovar out evecs.dat vecs 15
>> >> >> >
>> >> >> > readdata evecs.dat
>> >> >> > crdaction crd1 projection modes evecs.dat out myproj.txt beg 1 end
>> 3
>> >> >> :1-328.N
>> >> >>
>> >> >> The last 'projection' command will create 3 data sets (corresponding
>> >> >> to projections for PCs 1, 2, and 3) which you can then histogram,
>> >> >> either with cpptraj or your favorite program. It will be probably be
>> >> >> easier to histogram within cpptraj if you name the 'projection' data
>> >> >> sets, e.g.:
>> >> >>
>> >> >> crdaction crd1 projection P1 modes evecs.dat out myproj.txt beg 1
>> end 3
>> >> >> :1-328.N
>> >> >> runanalysis hist P1:1,*,*,*,100 out pca.hist.agr norm name P1-1
>> >> >>
>> >> >> etc. See the manual for more details on the 'hist' command (or
>> >> >> alternatively the 'kde' command to use a kernel density estimator
>> >> >> instead).
>> >> >>
>> >> >> Hope this helps,
>> >> >>
>> >> >> -Dan
>> >> >>
>> >> >> >
>> >> >> > Thanks
>> >> >> > _______________________________________________
>> >> >> > AMBER mailing list
>> >> >> > AMBER.ambermd.org
>> >> >> > http://lists.ambermd.org/mailman/listinfo/amber
>> >> >>
>> >> >>
>> >> >>
>> >> >> --
>> >> >> -------------------------
>> >> >> Daniel R. Roe, PhD
>> >> >> Department of Medicinal Chemistry
>> >> >> University of Utah
>> >> >> 30 South 2000 East, Room 201
>> >> >> Salt Lake City, UT 84112-5820
>> >> >> http://home.chpc.utah.edu/~cheatham/
>> >> >> (801) 587-9652
>> >> >> (801) 585-6208 (Fax)
>> >> >>
>> >> >> _______________________________________________
>> >> >> AMBER mailing list
>> >> >> AMBER.ambermd.org
>> >> >> http://lists.ambermd.org/mailman/listinfo/amber
>> >> >>
>> >> > _______________________________________________
>> >> > AMBER mailing list
>> >> > AMBER.ambermd.org
>> >> > http://lists.ambermd.org/mailman/listinfo/amber
>> >>
>> >>
>> >>
>> >> --
>> >> -------------------------
>> >> Daniel R. Roe, PhD
>> >> Department of Medicinal Chemistry
>> >> University of Utah
>> >> 30 South 2000 East, Room 201
>> >> Salt Lake City, UT 84112-5820
>> >> http://home.chpc.utah.edu/~cheatham/
>> >> (801) 587-9652
>> >> (801) 585-6208 (Fax)
>> >>
>> >> _______________________________________________
>> >> AMBER mailing list
>> >> AMBER.ambermd.org
>> >> http://lists.ambermd.org/mailman/listinfo/amber
>> >>
>> > _______________________________________________
>> > AMBER mailing list
>> > AMBER.ambermd.org
>> > http://lists.ambermd.org/mailman/listinfo/amber
>>
>>
>>
>> --
>> -------------------------
>> Daniel R. Roe, PhD
>> Department of Medicinal Chemistry
>> University of Utah
>> 30 South 2000 East, Room 201
>> Salt Lake City, UT 84112-5820
>> http://home.chpc.utah.edu/~cheatham/
>> (801) 587-9652
>> (801) 585-6208 (Fax)
>>
>> _______________________________________________
>> AMBER mailing list
>> AMBER.ambermd.org
>> http://lists.ambermd.org/mailman/listinfo/amber
>>
> _______________________________________________
> AMBER mailing list
> AMBER.ambermd.org
> http://lists.ambermd.org/mailman/listinfo/amber



-- 
-------------------------
Daniel R. Roe, PhD
Department of Medicinal Chemistry
University of Utah
30 South 2000 East, Room 201
Salt Lake City, UT 84112-5820
http://home.chpc.utah.edu/~cheatham/
(801) 587-9652
(801) 585-6208 (Fax)
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Thu May 08 2014 - 08:30:03 PDT
Custom Search