Re: [AMBER] handling long trajs for PCA and components plot

From: Daniel Roe <daniel.r.roe.gmail.com>
Date: Mon, 12 May 2014 11:28:25 -0600

Hi,

Unfortunately the methods in the referenced paper do not describe how
they made this calculation. My guess (for Figure 3B etc) is that they
only projected the coordinates of atoms for a given residue along the
corresponding portion of the eigenvector for each frame, then took the
average over all frames. Again, this is all speculation on my part
since it is not described in the paper at all - you are probably
better off contacting the authors directly and asking how they did it.

If this is indeed what they did, unfortunately CPPTRAJ does not break
down eigenvectors in this way. One thing you could do that would be
somewhat equivalent would be to generate pseudo-trajectories for
desired modes with 'modes trajout' analysis in cpptraj, then
subsequently calculate RMSF for each residue in the pseudo
trajectories.

Good luck,

-Dan

On Sun, May 11, 2014 at 1:10 PM, newamber list <newamberlist.gmail.com> wrote:
> Dear Daniel
>
>>> Each of the dots in the principal component projection plots in Figs 3/4
> represent the PC projection value for PCs 1 and 2 for a single frame.
>
> Sorry if I did not explain my problem correctly. I have problem, rather
> could not understand, in obtaining a plot for EV1 vs Residue/Atom
> displacement (Figure 3B) and what should be the matrix dimension
> calculation/multiplication. (The thing is EV1 has 3N dimension and on x
> axis I have just N atoms somehow averaged on F frames)
>
> Thanks again
>
>
>
>
> On Thu, May 8, 2014 at 4:04 PM, Daniel Roe <daniel.r.roe.gmail.com> wrote:
>
>> Hi,
>>
>> The projection of a coordinate frame along a certain eigenvector
>> essentially tells you how much that frame contributes to that
>> particular principal component. Each of the dots in the principal
>> component projection plots in Figs 3/4 represent the PC projection
>> value for PCs 1 and 2 for a single frame. In each frame you have N
>> atoms, but 3N coordinates, which matches the dimensions of the
>> covariance matrix.
>>
>> Also note that you have to be careful when comparing eigenvectors
>> obtained from covariance matrices of different trajectories, as they
>> are not guaranteed to be the same. I didn't see any discussion of this
>> in my (admittedly brief) glance at the referenced paper. You can
>> either check if the eigenvectors are aligned manually, or generate a
>> covariance matrix from the combined trajectories, then calculate your
>> projections for each trajectory separately.
>>
>> Hope this helps,
>>
>> -Dan
>>
>> On Wed, May 7, 2014 at 4:27 PM, newamber list <newamberlist.gmail.com>
>> wrote:
>> > Hi
>> >
>> >>> If you are referring to figures 3/4 in the above reference
>> >
>> > Yes, e.g. Fig 3A and 3B. I can get 3A as:
>> >
>> > crdaction crd1 projection modes evecs.dat out myproj.txt beg 1 end 2
>> > :1-328.N. I have some confusion about calculation. Once you get
>> > Eigenvectors for Covar Matrix of 3Nx3N dimension then you calculate new
>> > data as e.g. just for EV1 and EV2 as:
>> >
>> > #Fx2 #2x3N
>> >
>> > |(xi1-Xi) (xk-Xk)| |EV1 in Row|
>> > : : | |EV2 in Row|
>> > : : |
>> > |(xiF-Xi) (xiF-xk)|
>> >
>> > where x represents coordinate and X average. Does xi and xk would
>> > necessarily mean the first and second dimension only (which happens to be
>> > first two coordinates?) I know its wrong how xi and xj is chosen?
>> >
>> > Still for Figure 3B I don't understand how its calculated as Eigenvector
>> > has a dimension of 3N but I have NxF atoms (where N is number of atoms
>> and
>> > F is Number of frames)
>> >
>> > thanks
>> > Jiom
>> >
>> >
>> >
>> >
>> > On Wed, May 7, 2014 at 3:52 PM, Daniel Roe <daniel.r.roe.gmail.com>
>> wrote:
>> >
>> >> On Tue, May 6, 2014 at 10:21 PM, newamber list <newamberlist.gmail.com>
>> >> wrote:
>> >> > Its working now but I am not sure if this is how people
>> calculate/present
>> >> > their results. Usually in papers they show displacements along one
>> >> > eigenvector (e.g.
>> >> >
>> >>
>> http://pubs.rsc.org/en/Content/ArticleLanding/2014/MB/c4mb00177j#!divAbstract
>> >>
>> >> If you are referring to figures 3/4 in the above reference, then what
>> >> you want to calculate is the projection of your coordinates along
>> >> certain eigenvectors - see the 'projection' command in CPPTRAJ. The
>> >> 'modes displ' just calculates total XYZ displacement for all
>> >> eigenvectors.
>> >>
>> >> -Dan
>> >>
>> >> > )
>> >> >
>> >> > I get displX, Y and Z but I was expecting some net displacement which
>> >> > simply I think should be sqrt(x^2+y^+z^2) and also for each individual
>> >> > eigenvector I should do something like this:
>> >> >
>> >> > modes name evecs.dat displ out dis1.dat beg 1 end 1
>> >> > modes name evecs.dat displ out dis2.dat beg 2 end 2
>> >> >
>> >> > 2)
>> >> >>> If the only goal is to get two structures that look the least like
>> each
>> >> > other, one way is to calculate the 2D RMSD
>> >> >
>> >> > This is nice suggestion indeed. I was following some previous amber
>> >> archive
>> >> > mails to generate procupine plots which needs two extreme pdbs in
>> >> > discussion it was not mentioned how one can get two extremes.
>> >> >
>> >> > best regards
>> >> >
>> >> >
>> >> > On Wed, May 7, 2014 at 4:27 AM, Daniel Roe <daniel.r.roe.gmail.com>
>> >> wrote:
>> >> >
>> >> >> On Tue, May 6, 2014 at 8:13 PM, newamber list <
>> newamberlist.gmail.com>
>> >> >> wrote:
>> >> >> > runanalysis diagmatrix myCovar out evecs.dat vecs 15
>> >> >> > readdata evecs.dat
>> >> >> > modes displ out dis.dat name myCovar beg 1 end 3 :1-328.N
>> >> >> ###
>> >> >>
>> >> >> The problem here is that you are trying to use 'myCovar' (which is a
>> >> >> matrix) as your modes data set, which by default will be called
>> >> >> 'evecs.dat' since you did not supply a name. Try using 'modes name
>> >> >> evecs.dat ...'.
>> >> >>
>> >> >> > 2) Also how one can obtain two extreme configurations of system for
>> >> whole
>> >> >> > traj? Just by using lowest and highest rmsd with respect to average
>> >> >> > structure or something one can do with PCA?
>> >> >>
>> >> >> If the only goal is to get two structures that look the least like
>> >> >> each other, one way is to calculate the 2D RMSD and find the pair of
>> >> >> frames with the largest RMSD. You will probably want to use the
>> >> >> 'nosquare2d' option with standard data output to make searching
>> >> >> easier, e.g.
>> >> >>
>> >> >> 2drms .CA out rms2d.dat
>> >> >> datafile rms2d.dat nosquare2d
>> >> >>
>> >> >> Hope this helps,
>> >> >>
>> >> >> -Dan
>> >> >>
>> >> >> >
>> >> >> > thanks
>> >> >> > JIom
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> > On Tue, May 6, 2014 at 6:24 PM, Daniel Roe <daniel.r.roe.gmail.com
>> >
>> >> >> wrote:
>> >> >> >
>> >> >> >> Hi,
>> >> >> >>
>> >> >> >> On Mon, May 5, 2014 at 5:23 PM, newamber list <
>> >> newamberlist.gmail.com>
>> >> >> >> wrote:
>> >> >> >> > 2) Also in many papers there are Principal component plots. If
>> am
>> >> not
>> >> >> >> > wrong these are same obtained from 'projection modes' as shown
>> >> below
>> >> >> >> > (myproj.txt). Or its something else?
>> >> >> >>
>> >> >> >> The plots you refer to are actually histograms generated using the
>> >> >> >> 'hist' command of the principal component projections obtained
>> from
>> >> >> >> the 'projection' command, see the bottom of the script in
>> >> >> >>
>> >> >> >>
>> >> >>
>> >>
>> http://pubs.acs.org/doi/suppl/10.1021/ct400862k/suppl_file/ct400862k_si_001.pdf
>> >> >> >> .
>> >> >> >>
>> >> >> >> > trajin 1trj.nc
>> >> >> >> > trajin 2trj.nc
>> >> >> >> > rms first :1-328.N
>> >> >> >> > average Avg.rst7 ncrestart
>> >> >> >> > createcrd crd1
>> >> >> >> > run
>> >> >> >> >
>> >> >> >> > reference Avg.rst7.1 [avg]
>> >> >> >> > crdaction crd1 rms ref [avg] :1-328.N
>> >> >> >> > crdaction crd1 matrix covar :1-328.N name myCovar
>> >> >> >> > runanalysis diagmatrix myCovar out evecs.dat vecs 15
>> >> >> >> >
>> >> >> >> > readdata evecs.dat
>> >> >> >> > crdaction crd1 projection modes evecs.dat out myproj.txt beg 1
>> end
>> >> 3
>> >> >> >> :1-328.N
>> >> >> >>
>> >> >> >> The last 'projection' command will create 3 data sets
>> (corresponding
>> >> >> >> to projections for PCs 1, 2, and 3) which you can then histogram,
>> >> >> >> either with cpptraj or your favorite program. It will be probably
>> be
>> >> >> >> easier to histogram within cpptraj if you name the 'projection'
>> data
>> >> >> >> sets, e.g.:
>> >> >> >>
>> >> >> >> crdaction crd1 projection P1 modes evecs.dat out myproj.txt beg 1
>> >> end 3
>> >> >> >> :1-328.N
>> >> >> >> runanalysis hist P1:1,*,*,*,100 out pca.hist.agr norm name P1-1
>> >> >> >>
>> >> >> >> etc. See the manual for more details on the 'hist' command (or
>> >> >> >> alternatively the 'kde' command to use a kernel density estimator
>> >> >> >> instead).
>> >> >> >>
>> >> >> >> Hope this helps,
>> >> >> >>
>> >> >> >> -Dan
>> >> >> >>
>> >> >> >> >
>> >> >> >> > Thanks
>> >> >> >> > _______________________________________________
>> >> >> >> > AMBER mailing list
>> >> >> >> > AMBER.ambermd.org
>> >> >> >> > http://lists.ambermd.org/mailman/listinfo/amber
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >> --
>> >> >> >> -------------------------
>> >> >> >> Daniel R. Roe, PhD
>> >> >> >> Department of Medicinal Chemistry
>> >> >> >> University of Utah
>> >> >> >> 30 South 2000 East, Room 201
>> >> >> >> Salt Lake City, UT 84112-5820
>> >> >> >> http://home.chpc.utah.edu/~cheatham/
>> >> >> >> (801) 587-9652
>> >> >> >> (801) 585-6208 (Fax)
>> >> >> >>
>> >> >> >> _______________________________________________
>> >> >> >> AMBER mailing list
>> >> >> >> AMBER.ambermd.org
>> >> >> >> http://lists.ambermd.org/mailman/listinfo/amber
>> >> >> >>
>> >> >> > _______________________________________________
>> >> >> > AMBER mailing list
>> >> >> > AMBER.ambermd.org
>> >> >> > http://lists.ambermd.org/mailman/listinfo/amber
>> >> >>
>> >> >>
>> >> >>
>> >> >> --
>> >> >> -------------------------
>> >> >> Daniel R. Roe, PhD
>> >> >> Department of Medicinal Chemistry
>> >> >> University of Utah
>> >> >> 30 South 2000 East, Room 201
>> >> >> Salt Lake City, UT 84112-5820
>> >> >> http://home.chpc.utah.edu/~cheatham/
>> >> >> (801) 587-9652
>> >> >> (801) 585-6208 (Fax)
>> >> >>
>> >> >> _______________________________________________
>> >> >> AMBER mailing list
>> >> >> AMBER.ambermd.org
>> >> >> http://lists.ambermd.org/mailman/listinfo/amber
>> >> >>
>> >> > _______________________________________________
>> >> > AMBER mailing list
>> >> > AMBER.ambermd.org
>> >> > http://lists.ambermd.org/mailman/listinfo/amber
>> >>
>> >>
>> >>
>> >> --
>> >> -------------------------
>> >> Daniel R. Roe, PhD
>> >> Department of Medicinal Chemistry
>> >> University of Utah
>> >> 30 South 2000 East, Room 201
>> >> Salt Lake City, UT 84112-5820
>> >> http://home.chpc.utah.edu/~cheatham/
>> >> (801) 587-9652
>> >> (801) 585-6208 (Fax)
>> >>
>> >> _______________________________________________
>> >> AMBER mailing list
>> >> AMBER.ambermd.org
>> >> http://lists.ambermd.org/mailman/listinfo/amber
>> >>
>> > _______________________________________________
>> > AMBER mailing list
>> > AMBER.ambermd.org
>> > http://lists.ambermd.org/mailman/listinfo/amber
>>
>>
>>
>> --
>> -------------------------
>> Daniel R. Roe, PhD
>> Department of Medicinal Chemistry
>> University of Utah
>> 30 South 2000 East, Room 201
>> Salt Lake City, UT 84112-5820
>> http://home.chpc.utah.edu/~cheatham/
>> (801) 587-9652
>> (801) 585-6208 (Fax)
>>
>> _______________________________________________
>> AMBER mailing list
>> AMBER.ambermd.org
>> http://lists.ambermd.org/mailman/listinfo/amber
>>
> _______________________________________________
> AMBER mailing list
> AMBER.ambermd.org
> http://lists.ambermd.org/mailman/listinfo/amber



-- 
-------------------------
Daniel R. Roe, PhD
Department of Medicinal Chemistry
University of Utah
30 South 2000 East, Room 201
Salt Lake City, UT 84112-5820
http://home.chpc.utah.edu/~cheatham/
(801) 587-9652
(801) 585-6208 (Fax)
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Mon May 12 2014 - 11:00:04 PDT
Custom Search