Hi,
That tutorial doesn't have the input correct for performing principal
component analysis (which is the intent I think). It's trying to do
everything at once. Specifically, the input is calculating the
covariance matrix ('matrix') and also project the coordinates on
eigenvectors from diagonalizing that matrix in the same pass. You
can't project until you get eigenvectors, and you can't get that until
you diagonalize the matrix, and you can't get the matrix until you run
a first pass!
If you want to do PCA, this is a better tutorial:
https://amberhub.chpc.utah.edu/introduction-to-principal-component-analysis/
Briefly, the steps for PCA are:
1) Create a covariance matrix of the coordinates of interest. You
typically want to RMS-fit to something prior to this in order to
remove global rotation/translation from your matrix.
2) Diagonalize the matrix to get eigenvalues/eigenvectors (sometimes
called 'modes' in cpptraj - maybe not the best name choice, but just
so you're aware).
3) Project the *exact same coordinates as in step 1* (i.e. if you
rms-fit the coordinates to generate the matrix, you need to rms fit
before projecting, and the atom selection must be the same) on the
eigenvectors from step 2.
4) Have fun with the resulting data.
Steps 1 and 2 are typically done in 1 pass, and 3 in a second pass.
Hope this helps,
-Dan
On Wed, Sep 15, 2021 at 8:33 AM Michael Shokhen
<michael.shokhen.biu.ac.il> wrote:
>
> Dear AMBER experts,
>
> Following the AMBER 20 tutorial “6.3 Using Accelerated Molecular Dynamics (aMD) to enhance sampling” I have conducted production MD simulation of my protein. Unfortunately, I have faced problems attempting to analyze the MD simulation results by the cpptraj input file presented in the chapter 3) Analyzing the aMD results
> https://ambermd.org/tutorials/advanced/tutorial22/section3.php
> of the tutorial.
>
> The tutorial used to produce the PCA vectors from the aMD trajectory and further project this trajectories onto the produced PCA vectors.
>
>
>
>
>
> I have adopted the original cpptraj input to the number of residues 748 in my protein.
> It seems that the cpptraj input file in the tutorial contains obsolete commands.
>
> See Warning reports in the OUT file below.
>
>
>
> I need your help to resolve the problem.
>
>
>
> Thank you,
>
> Michael
>
>
>
> The command line to run the aMD_post_process_ptraj.in file through cpptraj is the following:
>
> cpptraj < aMD_post_process_ptraj.in > aMD_post_process_ptraj.out
>
> aMD_post_process_ptraj.in
>
> parm ../mc.prmtop
>
> trajin ../7_/md_prod_amd0.mdcrd 1 last 1
>
> reference ../6_/eq.rst
>
> center origin :1-748
>
> image origin center
>
> strip :WAT,Cl-,Na+
>
> rms reference mass :1-748.CA,C,N out aMD_0_RMSD.out
>
> matrix covar name matrixdat .CA out covmat-ca.dat
>
> analyze matrix matrixdat out evecs-ca.dat vecs 2244
>
> analyze matrix matrixdat name evecs-ca vecs 2244
>
> analyze modes fluct out analyzemodesflu.dat stack evecs-ca beg 1 end 2244
>
> analyze modes displ out analyzemodesdis.dat stack evecs-ca beg 1 end 2244
>
> projection modes evecs-ca.dat out pca12-ca beg 1 end 3 @CA
>
> go
>
>
>
>
>
>
>
>
>
> aMD_post_process_ptraj.out
>
> CPPTRAJ: Trajectory Analysis. V4.25.6 (AmberTools V20.11)
>
> ___ ___ ___ ___
>
> | \/ | \/ | \/ |
>
> _|_/\_|_/\_|_/\_|_
>
>
>
> | Date/time: 09/15/21 15:20:08
>
> | Available memory: 2.005 GB
>
>
>
> INPUT: Reading input from 'STDIN'
>
> [parm ../mc.prmtop]
>
> Reading '../mc.prmtop' as Amber Topology
>
> Radius Set: modified Bondi radii (mbondi)
>
> [trajin ../7_/md_prod_amd0.mdcrd 1 last 1]
>
> Reading '../7_/md_prod_amd0.mdcrd' as Amber NetCDF
>
> [reference ../6_/eq.rst]
>
> Reading '../6_/eq.rst' as Amber NC Restart
>
> Setting active reference for distance-based masks: 'eq.rst'
>
> [center origin :1-748]
>
> CENTER: Centering coordinates using geometric center of atoms in mask (:1-748) to
>
> coordinate origin.
>
> [image origin center]
>
> IMAGE: By molecule to origin based on center of mass using all atoms
>
> [strip :WAT,Cl-,Na+]
>
> STRIP: Stripping atoms in mask [:WAT,Cl-,Na+]
>
> [rms reference mass :1-748.CA,C,N out aMD_0_RMSD.out]
>
> RMSD: (:1-748.CA,C,N), reference is "default_name" (:1-748.CA,C,N), mass-weighted.
>
> Best-fit RMSD will be calculated, coords will be rotated and translated.
>
> [matrix covar name matrixdat @CA out covmat-ca.dat]
>
> MATRIX: Calculating covariance matrix, output is by atom.
>
> Printing to file covmat-ca.dat
>
> Matrix data set is 'matrixdat'
>
> Start: 1 Stop: Final frame
>
> Mask1 is '.CA'
>
> [analyze matrix matrixdat out evecs-ca.dat vecs 2244]
>
> Warning: The 'analyze' prefix is no longer necessary and may be soon deprecated.
>
> Warning: NOTE: 'analyze matrix' is now 'diagmatrix'.
>
> Changed DataFile 'evecs-ca.dat' type to Evecs file for set Modes_00004
>
> DIAGMATRIX: Diagonalizing matrix matrixdat and writing modes to evecs-ca.dat
>
> Calculating 2244 eigenvectors.
>
> Storing modes with name: Modes_00004
>
> [analyze matrix matrixdat name evecs-ca vecs 2244]
>
> Warning: The 'analyze' prefix is no longer necessary and may be soon deprecated.
>
> Warning: NOTE: 'analyze matrix' is now 'diagmatrix'.
>
> DIAGMATRIX: Diagonalizing matrix matrixdat
>
> Calculating 2244 eigenvectors.
>
> Storing modes with name: evecs-ca
>
> [analyze modes fluct out analyzemodesflu.dat stack evecs-ca beg 1 end 2244]
>
> Warning: The 'analyze' prefix is no longer necessary and may be soon deprecated.
>
> Warning: To add an analysis command the the queue, only the command name needs
>
> Warning: to be specified, e.g. 'modes <args>'.
>
> Warning: Argument 'stack' is deprecated, use 'name <modes>' instead.
>
> ANALYZE MODES: Calculating rms fluctuations using modes from evecs-ca, modes 1 to 2244
>
> Results are written to 'analyzemodesflu.dat'
>
> Boltzmann statistics used.
>
> Eigenvectors associated with zero or negative eigenvalues will be skipped.
>
> [analyze modes displ out analyzemodesdis.dat stack evecs-ca beg 1 end 2244]
>
> Warning: The 'analyze' prefix is no longer necessary and may be soon deprecated.
>
> Warning: To add an analysis command the the queue, only the command name needs
>
> Warning: to be specified, e.g. 'modes <args>'.
>
> Warning: Argument 'stack' is deprecated, use 'name <modes>' instead.
>
> ANALYZE MODES: Calculating displacements using modes from evecs-ca, modes 1 to 2244
>
> Results are written to 'analyzemodesdis.dat'
>
> Boltzmann statistics used.
>
> Eigenvectors associated with zero or negative eigenvalues will be skipped.
>
> Factor for displacement: 1.000000
>
> [projection modes evecs-ca.dat out pca12-ca beg 1 end 3 @CA]
>
> Warning: Set 'RMSD_00002' contains no data.
>
> Warning: File 'aMD_0_RMSD.out' has no sets containing data.
>
> Warning: Set 'matrixdat' contains no data.
>
> Warning: File 'covmat-ca.dat' has no sets containing data.
>
> Warning: Set 'Modes_00004' contains no data.
>
> Warning: File 'evecs-ca.dat' has no sets containing data.
>
> Warning: Set 'FLUCT_00006[rmsX]' contains no data.
>
> Warning: Set 'FLUCT_00006[rmsY]' contains no data.
>
> Warning: Set 'FLUCT_00006[rmsZ]' contains no data.
>
> Warning: Set 'FLUCT_00006[rms]' contains no data.
>
> Warning: File 'analyzemodesflu.dat' has no sets containing data.
>
> Warning: Set 'DISPL_00010[displX]' contains no data.
>
> Warning: Set 'DISPL_00010[displY]' contains no data.
>
> Warning: Set 'DISPL_00010[displZ]' contains no data.
>
> Warning: File 'analyzemodesdis.dat' has no sets containing data.
>
> TIME: Total execution time: 0.4572 seconds.
>
>
>
>
>
>
>
> *****************************
> Michael Shokhen, PhD
> Associate Professor
> Department of Chemistry
> Bar Ilan University,
> Ramat Gan, 52900
> Israel
> email: shokhen.mail.biu.ac.il<https://webmail.biu.ac.il/owa/redir.aspx?C=a160ef9b9a6b4d06992402715d3ee465&URL=mailto%3ashokhen%40mail.biu.ac.il>
> _______________________________________________
> AMBER mailing list
> AMBER.ambermd.org
> http://lists.ambermd.org/mailman/listinfo/amber
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Fri Sep 17 2021 - 10:30:03 PDT