Re: [AMBER] PCA eigenvalues

From: Marcelo Andrade Chagas <andrade.mchagas.gmail.com>
Date: Mon, 27 Nov 2017 13:52:09 -0200

Dear,

follows a scripts attachment and below is a brief explanation that I am
using to
PCA analyzes.

It might help to see your problem better if it is not yet resolved.

I believe it could be done as a single script, but I prefer
do by stages. The last step is done because I do not want to use the
file by histograms, so I use the evecs.dat file from this step
in the last step.

Best regards

___________________________________________________________________________________________________________________________________________________________________________________________________________________________-

ETAPA I
Arquivo: media.cpptraj # in this file we take
the unique trajectory of the simulation and generate a pdb file of the
average of the trajectories

trajin proteina.mdcrd # cpptraj -i
media.cpptraj -p proteina.prmtop
autoimage
average media.pdb pdb nowrap

ETAPA II
Arquivo: tira.cpptraj # in this file we create a
new unique trajectory starting, for example in this case of structure 1 and
saving 1 frame every 60 frames of the whole simulation
                                                         # in this step we
used the reference media.pdb file created in the first step
trajin proteina.mdcrd 2500 last 60
reference media.pdb # cpptraj -i tira.cpptraj
-p proteina.prmtop
autoimage
trajout proteina-II.mdcrd netcdf

ETAPA III
Arquivo: pca-cpu-gpu.cpptraj # in this file we will
calculate the vectors for the PCA projection. We have to provide
information on the trajectory file, the mask on which type
                                                  # of atom will be
calculated -example .CA, on alpha carbon atoms

mudancas nas linhas:
trajin proteina-II.mdcrd parm
rms first .CA # cpptraj -i
pca-cpu-gpu.cpptraj
crdaction proteina-trajectories rms ref AVG .CA
crdaction proteina-trajectories matrix covar \
  name cpu-gpu-covar .CA
nmwiz nmwizvecs 3 nmwizfile dna.nmd nmwizmask .CA
beg 1 end 3 .CA crdframes 1,2500 # In this line,
the number of frames 2500 comes from the unique trajectory file created
when specifying 2500 last 60 - at the command line of the


         # file 1.2500 means the first frame to the last of this new single
path, so this number

         # must be checked in the output file when running the file
tira.cpptraj
hist CPU:1 bins 2500 out cpu-gpu-hist.agr norm name CPU-1
hist CPU:2 bins 2500 out cpu-gpu-hist.agr norm name CPU-2 # 2500
here comes from the trajectories selected in the previous command

readdata cpu-gpu-evecs.dat name Evecs # this
cpu-gpu-evecs.dat file is that they have the vectors that will be used for
the PCA projection in the next step

pcmin -2500 pcmax 2500 tmode 1 trajoutmask .CA trajoutfmt netcdf # once
again, the number of frames and the mask used are specified, in the case on
the .CA

ETAPA IV
Arquivo: PCA.cpptraj # in this file we
will use the file cpu-gpu-evecs.dat created in the previous step in which
we will do the projection of the modes of the vectors

trajin proteina-II.mdcrd
projection modes cpu-gpu-evecs.dat out pca12-ca beg 1 end 3 .CA #
cpptraj -i PCA.cpptraj -p proteina.prmtop
go



Mudança na Etapa II # The change in
this step - 2500 last 10 - means that we analyze the RMSD of the protein
and verify that the system stabilizes after the
                                                                      the
first 10 ns of simulation of the production stage, so we discarded the
first 2500 corresponding frames.

Arquivo: tira.cpptraj

trajin proteina.mdcrd 2500 last 10
reference media.pdb
autoimage
trajout proteina-II.mdcrd netcdf



Marcelo Andrade Chagas, MSc
(PhD student)
Laboratório de Química Computacional e Modelagem Molecular - LQC-MM
* http://lqcmm.qui.ufmg.br/
Departamento de Química da Universidade Federal de Minas Gerais - UFMG
Tel:(31)3409-5776

2017-11-27 10:20 GMT-02:00 Nisha Amarnath Jonniya <phd1601271002.iiti.ac.in>
:

> Thanks Elvis for the reply.
> Will do and let you know.
>
> On Mon, Nov 27, 2017 at 5:35 PM, Elvis Martis <elvis.martis.bcp.edu.in>
> wrote:
>
> > Hi,
> > Still not clear what is wrong.
> > I am attaching a script for PCA. Just copy it in the correct directory
> and
> > run "cpptraj -i pca.in" <without quotes.>
> > Just tell me if you get the same error.
> >
> > Best Regards
> > Elvis Martis
> > Mumbai, INDIA.
> >
> > ________________________________________
> > From: Nisha Amarnath Jonniya <phd1601271002.iiti.ac.in>
> > Sent: 27 November 2017 14:15
> > To: AMBER Mailing List
> > Subject: Re: [AMBER] PCA eigenvalues
> >
> > Dear Elvis
> >
> > Though I tried with your comment, but still I got the following error
> > after execution of hist command .
> > Is it the error with parm.top file??
> >
> >
> > can you help me.
> >
> >
> >
> >
> > ANALYSIS: Performing 3 analyses:
> > 0: [hist ex:1 bins 100 out ex_hist.gnu norm name ex_1]
> > Error: HistBin: Max (-16.4714) must be greater than min (-16.4714)
> > Error: Could not set up histogram dimension 'ex:1'
> > Error: In Analysis [hist]
> > 1: [hist ex:2 bins 100 out ex_hist.gnu norm name ex_2]
> > Error: HistBin: Max (17.4431) must be greater than min (17.4431)
> > Error: Could not set up histogram dimension 'ex:2'
> > Error: In Analysis [hist]
> > 2: [hist ex:3 bins 100 out ex_hist.gnu norm name ex_3]
> > Error: HistBin: Max (-21.5487) must be greater than min (-21.5487)
> > Error: Could not set up histogram dimension 'ex:3'
> > Error: In Analysis [hist]
> >
> > TIME: Analyses took 0.0161 seconds.
> >
> > RUN TIMING:
> > TIME: Init : 0.0545 s ( 0.00%)
> > TIME: Trajectory Process : 2025.7628 s (100.00%)
> > TIME: Action Post : 0.0000 s ( 0.00%)
> > TIME: Analysis : 0.0161 s ( 0.00%)
> > TIME: Data File Write : 0.0000 s ( 0.00%)
> > TIME: Other : 0.0142 s ( 0.00%)
> > TIME: Run Total 2025.8477 s
> > ---------- RUN END ---------------------------------------------------
> > [readdata evececs name Evecs]
> > Reading 'evececs' as Evecs file with name 'Evecs'
> > Reading modes from evececs
> > Warning: DataSet 'Evecs' already present.
> > Error: reading datafile evececs
> > Error: Could not read data file 'evececs'.
> > 1 errors encountered reading input.
> > Warning: Set 'ex_1' contains no data.
> > Warning: Set 'ex_2' contains no data.
> > Warning: Set 'ex_3' contains no data.
> > Warning: File 'ex_hist.gnu' has no sets containing data.
> > TIME: Total execution time: 8669.3131 seconds.
> > Error: Error(s) occurred during execution.
> >
> >
> >
> >
> >
> > On Fri, Nov 24, 2017 at 5:45 PM, Elvis Martis <elvis.martis.bcp.edu.in>
> > wrote:
> >
> > > Hi,
> > > Here is the problem
> > > "runanalysis diagmatrix ex_covar out evececs 20 name
> exEvecs:1-273&!.H="
> > > Seems like you missed a space there.
> > > Try to replace that line with this
> > > "runanalysis diagmatrix ex_covar out evececs 20 name ex Evecs
> > :1-273&!.H="
> > >
> > >
> > > Best Regards
> > > Elvis Martis
> > > Mumbai, INDIA.
> > >
> > > ________________________________________
> > > From: Nisha Amarnath Jonniya <phd1601271002.iiti.ac.in>
> > > Sent: 24 November 2017 17:09
> > > To: AMBER Mailing List
> > > Subject: Re: [AMBER] PCA eigenvalues
> > >
> > > Dear all,
> > >
> > > As I was also doing PCA analysis. This mail helped me too but I
> got
> > > error in execution of hist: command .
> > >
> > > These are my script.
> > >
> > > parm complex_solv.prmtop
> > > trajin trajout10-51.mdcrd
> > > rms first :1-273&!.H=
> > > average crdset ex_average
> > > createcrd ex_trajectories
> > > run
> > > crdaction ex_trajectories rms ref ex_average :1-273&!.H=
> > > crdaction ex_trajectories matrix covar name ex_covar :1-273&!.H=
> > > runanalysis diagmatrix ex_covar out evececs 20 name exEvecs:1-273&!.H=
> > > readdata evececs name Evecs
> > > runanalysis modes eigenval name Evecs out evalfrac.dat
> > >
> > > crdaction ex_trajectories projection ex modes Evecs beg 1 end 3
> > > :1-273&!.H= crdframes 84000
> > >
> > > hist ex:1 bins 100 out 5drb_hist.gnu norm name ex_1
> > > hist ex:2 bins 100 out 5drb_hist.gnu norm name ex_2
> > > hist ex:3 bins 100 out 5drb_hist.gnu norm name ex_3
> > > run
> > > readdata evececs name Evecs
> > > parm complex_solv.prmtop
> > > parmstrip !(:1-273&!.H=)
> > > parmwrite out ex_modes.prmtop
> > > runanalysis modes name Evecs trajout ex_mode1.nc pcmin -100 pcmax 100
> > > tmode
> > > 1 trajoutmask :1-273&!.H= trajoutfmt netcdf
> > > runanalysis modes name Evecs trajout ex_mode2.nc pcmin -100 pcmax 100
> > > tmode
> > > 2 trajoutmask :1-273&!.H= trajoutfmt netcdf
> > > runanalysis modes name Evecs trajout ex_mode3.nc pcmin -100 pcmax 100
> > > tmode
> > > 3 trajoutmask :1-273&!.H= trajoutfmt netcdf
> > >
> > >
> > >
> > > I am getting error in hist command with following error
> > >
> > >
> > > Data set 'ex:1' not found.
> > > Error: Dataset ex:1 not found.
> > > Error: Could not setup analysis [hist]
> > > 1 errors encountered reading input.
> > > Warning: File 'ex_hist.gnu' has no sets containing data.
> > >
> > >
> > >
> > >
> > > Can you please guide me further how to rectify this problem.
> > >
> > >
> > >
> > >
> > >
> > >
> > > On Fri, Nov 24, 2017 at 12:26 AM, Lizelle Lubbe <LBBLIZ002.myuct.ac.za
> >
> > > wrote:
> > >
> > > > Hi Dan,
> > > >
> > > > Thanks so much for the great advice you sent earlier.
> > > > It worked out well but I noticed the following warning in the output:
> > > > Warning: Frame 25 Coordinates out of bounds (100). What does it mean
> > and
> > > > should I be worried about it?
> > > >
> > > > The projection of PC1 vs PC2 was written as a gnu file like you
> > suggested
> > > > and displays a scatter of frames. However, the plot has a black
> > > background
> > > > and purple dots for each frame which is really not very clear.
> > > > Is there any way that I can change the colour or write it in another
> > > > format? I can't seem to edit the plot at all in gnuplot.
> > > >
> > > > Regards
> > > >
> > > > Lizelle
> > > >
> > > > ________________________________________
> > > > From: Daniel Roe <daniel.r.roe.gmail.com>
> > > > Sent: 20 November 2017 05:53:21 PM
> > > > To: AMBER Mailing List
> > > > Subject: Re: [AMBER] PCA eigenvalues
> > > >
> > > > s.dat vHi,
> > > >
> > > > On Sat, Nov 18, 2017 at 10:05 AM, Lizelle Lubbe <
> LBBLIZ002.myuct.ac.za
> > >
> > > > wrote:
> > > > > Hi Amber users,
> > > > >
> > > > > I have performed PCA analysis on my MD trajectories in cpptraj and
> > > would
> > > > like to plot the fraction of total variance vs eigenvalue as shown in
> > > > figure s9 of Roe et al 2014 (j phys chem b, 118, 3543-3552).
> > > >
> > > > To do this you'll want to calculate all principal components (i.e.
> > > > eigenvectors) from your covariance matrix, so 3 * the number of
> > > > selected atoms. Then you run the 'modes' analysis command with the
> > > > 'eigenval' keyword on the resulting modes data, e.g.
> > > >
> > > > runanalysis diagmatrix myCovar out evececs 246 name myEvecs :1-4&!.H=
> > > > runanalysis modes eigenval name myEvecs out evalfrac.dat
> > > >
> > > > > I also want to have a scatter plot of pc1 vs pc2 and pc2 vs pc3 to
> > > > visualize clusters in this space.
> > > > > Could someone please guide me in how this can be accomplished?
> > > >
> > > > For these, just give desired the principal component projection data
> > > > set names to the 'hist' command, e.g. for the projection of PC 1 vs
> 2:
> > > >
> > > > hist Ndom:1 Ndom:2 bins 100 out Ndom_hist.gnu norm name Ndom_1_2
> > > >
> > > > > Also, how should the pca histogram analysis be interpreted?
> > > >
> > > > This is a lot trickier. As you probably know principal components
> > > > represent axes that in this case describe variance in the selected
> > > > atoms in your system. So PCs can tell you about what the dominant
> > > > modes of motion in your system look like, but generally may not tell
> > > > you how your system actually moves. Personally I like to visualize
> the
> > > > modes of motion with the 'trajout' keyword of 'modes' (which it
> > > > appears you are doing). It can also help to look at the so-called
> > > > 'porcupine plots' - I use the nmwiz plugin for VMD (it appears you
> are
> > > > doing that as well).
> > > >
> > > > Hope this helps,
> > > >
> > > > -Dan
> > > >
> > > > >
> > > > > This is my pca script:
> > > > >
> > > > > trajin traj.nc
> > > > >
> > > > > rms first :1-720&!.H=
> > > > >
> > > > > average crdset Ndom_average
> > > > >
> > > > > createcrd Ndom_trajectories
> > > > >
> > > > > run
> > > > >
> > > > > crdaction Ndom_trajectories rms ref Ndom_average :1-720&!.H=
> > > > >
> > > > > crdaction Ndom_trajectories matrix covar name Ndom_covar
> :1-720&!.H=
> > > > >
> > > > > runanalysis diagmatrix Ndom_covar out Ndom_evecs.dat vecs 20 name
> > > > NdomEvecs nmwiz nmwizvecs 3 nmwizfile Ndom_pca.nmd nmwizmask
> > :1-720&!.H=
> > > > >
> > > > > crdaction Ndom_trajectories projection Ndom modes NdomEvecs beg 1
> > end 3
> > > > :1-720&!.H= crdframes 1,3000
> > > > >
> > > > > hist Ndom:1 bins 100 out Ndom_hist.agr norm name Ndom_1
> > > > >
> > > > > hist Ndom:2 bins 100 out Ndom_hist.agr norm name Ndom_2
> > > > >
> > > > > hist Ndom:3 bins 100 out Ndom_hist.agr norm name Ndom_3
> > > > >
> > > > > run
> > > > >
> > > > > clear all
> > > > >
> > > > > readdata Ndom_evecs.dat name Evecs
> > > > >
> > > > > parm NDOM_CPLX_rnb_stripped_box.prmtop
> > > > >
> > > > > parmstrip !(:1-720&!.H=)
> > > > >
> > > > > parmwrite out Ndom_modes.prmtop
> > > > >
> > > > > runanalysis modes name Evecs trajout Ndom_mode1.nc pcmin -100 pcmax
> > 100
> > > > tmode 1 trajoutmask :1-720&!.H= trajoutfmt netcdf
> > > > >
> > > > > runanalysis modes name Evecs trajout Ndom_mode2.nc pcmin -100 pcmax
> > 100
> > > > tmode 2 trajoutmask :1-720&!.H= trajoutfmt netcdf
> > > > >
> > > > > runanalysis modes name Evecs trajout Ndom_mode3.nc pcmin -100 pcmax
> > 100
> > > > tmode 3 trajoutmask :1-720&!.H= trajoutfmt netcdf
> > > > >
> > > > > Kind regards
> > > > >
> > > > > Lizelle Lubbe
> > > > > PhD Chemical Biology candidate
> > > > > University of Cape Town
> > > > > Disclaimer - University of Cape Town This e-mail is subject to UCT
> > > > policies and e-mail disclaimer published on our website at
> > > > http://www.uct.ac.za/about/policies/emaildisclaimer/ or obtainable
> > from
> > > > +27 21 650 9111. If this e-mail is not related to the business of
> UCT,
> > it
> > > > is sent by the sender in an individual capacity. Please report
> security
> > > > incidents or abuse via csirt.uct.ac.za
> > > > > _______________________________________________
> > > > > AMBER mailing list
> > > > > AMBER.ambermd.org
> > > > > http://lists.ambermd.org/mailman/listinfo/amber
> > > >
> > > >
> > > >
> > > > --
> > > > -------------------------
> > > > Daniel R. Roe
> > > > Laboratory of Computational Biology
> > > > National Institutes of Health, NHLBI
> > > > 5635 Fishers Ln, Rm T900
> > > > Rockville MD, 20852
> > > > https://www.lobos.nih.gov/lcb
> > > >
> > > > _______________________________________________
> > > > AMBER mailing list
> > > > AMBER.ambermd.org
> > > > http://lists.ambermd.org/mailman/listinfo/amber
> > > > Disclaimer - University of Cape Town This e-mail is subject to UCT
> > > > policies and e-mail disclaimer published on our website at
> > > > http://www.uct.ac.za/about/policies/emaildisclaimer/ or obtainable
> > from
> > > > +27 21 650 9111. If this e-mail is not related to the business of
> UCT,
> > it
> > > > is sent by the sender in an individual capacity. Please report
> security
> > > > incidents or abuse via csirt.uct.ac.za
> > > >
> > > > _______________________________________________
> > > > AMBER mailing list
> > > > AMBER.ambermd.org
> > > > http://lists.ambermd.org/mailman/listinfo/amber
> > > >
> > >
> > >
> > >
> > > --
> > >
> > > Nisha Amarnath Jonniya
> > > PhD Research Scholar
> > > Biosciences and Biomedical Engineering
> > > Indian Institute of Technology, Indore
> > > India
> > > _______________________________________________
> > > AMBER mailing list
> > > AMBER.ambermd.org
> > > http://lists.ambermd.org/mailman/listinfo/amber
> > >
> > > _______________________________________________
> > > AMBER mailing list
> > > AMBER.ambermd.org
> > > http://lists.ambermd.org/mailman/listinfo/amber
> > >
> >
> >
> >
> > --
> >
> > Nisha Amarnath Jonniya
> > PhD Research Scholar
> > Biosciences and Biomedical Engineering
> > Indian Institute of Technology, Indore
> > India
> > _______________________________________________
> > AMBER mailing list
> > AMBER.ambermd.org
> > http://lists.ambermd.org/mailman/listinfo/amber
> >
> > _______________________________________________
> > AMBER mailing list
> > AMBER.ambermd.org
> > http://lists.ambermd.org/mailman/listinfo/amber
> >
> >
>
>
> --
>
> Nisha Amarnath Jonniya
> PhD Research Scholar
> Biosciences and Biomedical Engineering
> Indian Institute of Technology, Indore
> India
> _______________________________________________
> AMBER mailing list
> AMBER.ambermd.org
> http://lists.ambermd.org/mailman/listinfo/amber
>


_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber

Received on Mon Nov 27 2017 - 08:00:03 PST
Custom Search