Re: [AMBER] Principal component analysis with CPPTRAJ - PCA

From: Daniel Roe <daniel.r.roe.gmail.com>
Date: Wed, 5 Jul 2017 08:45:00 -0400

Hi,

You are attempting to histogram a data set (GPU:1) that was never
created or previously read in, which is why cpptraj complains that it
is not found. Remove that histogram command (and any others that refer
to GPU data sets) and you will probably be OK.
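
As a rough sketch of what a single-trajectory version might look like,
here are the commands echoed in the quoted output below with the GPU
histogram dropped. The masks use cpptraj's `@` atom-name selector
(`!@H=` excludes atoms whose names begin with H). The closing `run` is
an assumption taken from the tutorial, and the sketch has not been
tested on this system.

```
# Step 1: fit to the first frame, build the average structure, save coordinates
parm proteina.prmtop [cpu]
trajin proteina.mdcrd parm [cpu]
rms first :1-262&!@H=
average crdset AVG
createcrd proteina-trajectories
run
# Step 2: refit to the average structure and build the covariance matrix
crdaction proteina-trajectories rms ref AVG :1-262&!@H=
crdaction proteina-trajectories matrix covar name cpu-gpu-covar :1-262&!@H=
# Step 3: diagonalize, keeping the first 3 eigenvectors
runanalysis diagmatrix cpu-gpu-covar out cpu-gpu-evecs.dat vecs 3 name myEvecs \
    nmwiz nmwizvecs 3 nmwizfile dna.nmd nmwizmask :1-262&!@H=
# Step 4: project all 534 frames onto eigenvectors 1-3 (creates CPU:1 to CPU:3)
crdaction proteina-trajectories projection CPU modes myEvecs beg 1 end 3 \
    :1-262&!@H= crdframes 1,534
# Step 5: histogram only data sets that exist; there is no GPU projection here
hist CPU:1 bins 100 out cpu-gpu-hist.agr norm name CPU-1
hist CPU:2 bins 100 out cpu-gpu-hist.agr norm name CPU-2
hist CPU:3 bins 100 out cpu-gpu-hist.agr norm name CPU-3
# Assumed from the tutorial: a final run executes the queued hist analyses
run
```

The only data sets available to `hist` are the ones named by an
earlier `projection` command, here CPU:1 through CPU:3.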

It seems like you may be directly applying the example in the tutorial
(in which two independent sets of trajectories are compared via
histograms of their projections onto the eigenvectors calculated from
the combined trajectories) to your case, but it appears you want to
perform PC analysis on a single trajectory.
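
For contrast, in a two-trajectory comparison the second data set comes
from a second projection over a different frame range of the combined
coordinates. A minimal sketch of just that step follows; it is a
fragment, not a complete input, and assumes the earlier steps have
already produced a combined COORDS set and the eigenvectors myEvecs.
The set name combined-trajectories and the frame ranges are
hypothetical, and the mask is the one from the quoted log.

```
# Hypothetical: frames 1-500 come from the first run, 501-1000 from the second
crdaction combined-trajectories projection CPU modes myEvecs beg 1 end 3 \
    :1-262&!@H= crdframes 1,500
crdaction combined-trajectories projection GPU modes myEvecs beg 1 end 3 \
    :1-262&!@H= crdframes 501,1000
# Both names now exist, so both can be histogrammed and overlaid
hist CPU:1 bins 100 out cpu-gpu-hist.agr norm name CPU-1
hist GPU:1 bins 100 out cpu-gpu-hist.agr norm name GPU-1
```

With a single trajectory there is no second frame range to project, so
no GPU data set is ever created.
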
My advice is to do some brief reading on PCA of protein systems (e.g.
Amadei et al. Proteins 1993 17:412-425, Balsera et al. J Phys Chem
1996 100:2567-2572, van Aalten et al. J. Comp. Chem. 1997 18:169-181,
etc.) and then make sure you understand what each command in your
cpptraj script is trying to do and why. If you have specific questions
about that I will be happy to answer them. Hope this helps,

-Dan


On Mon, Jul 3, 2017 at 12:45 PM, Marcelo Andrade Chagas
<andrade.mchagas.gmail.com> wrote:
> Dear
>
> I was able to get further with the command you suggested.
>
> However, the following error appeared in the second step, as
> shown below.
>
> Can you tell me what I should change?
>
> Best regards
>
> Marcelo
>
> marcelo.marcelo:~/TRABALHO-ADOLFO/TRABALHO-I-ADOLFO/ARQUIVOS-DM-PROT-II/DM-5GOZ/ANALISE/TESTE-PCA/TESTE-PCA-II/PCA-TESTE-II/PCA-TESTE-III/TSTE-PCA-IV$
> cpptraj -i pca-cpu-gpu.cpptraj
>
> CPPTRAJ: Trajectory Analysis. V16.16
> ___ ___ ___ ___
> | \/ | \/ | \/ |
> _|_/\_|_/\_|_/\_|_
>
> | Date/time: 07/03/17 13:35:47
> | Available memory: 3.952 GB
>
> INPUT: Reading input from 'pca-cpu-gpu.cpptraj'
> [parm proteina.prmtop [cpu]]
> Reading 'proteina.prmtop' as Amber Topology
> [trajin proteina.mdcrd parm [cpu]]
> Reading 'proteina.mdcrd' as Amber NetCDF
> [rms first :1-262&!@H=]
> RMSD: (:1-262&!@H*), reference is first frame (:1-262&!@H*).
> Best-fit RMSD will be calculated, coords will be rotated and translated.
> [average crdset AVG]
> Setting active reference for distance-based masks: 'AVG'
> AVERAGE: Averaging over coordinates in mask [*]
> Start: 1 Stop: Final frame
> Saving averaged coords to set 'AVG'
> [createcrd proteina-trajectories]
> CREATECRD: Saving coordinates from Top proteina.prmtop to
> "proteina-trajectories"
> [run]
> ---------- RUN BEGIN -------------------------------------------------
>
> PARAMETER FILES (1 total):
> 0: [cpu] proteina.prmtop, 4086 atoms, 262 res, box: Orthogonal, 1 mol
>
> INPUT TRAJECTORIES (1 total):
> 0: 'proteina.mdcrd' is a NetCDF AMBER trajectory, Parm proteina.prmtop
> (Orthogonal box) (reading 534 of 534)
> Coordinate processing will occur on 534 frames.
>
> REFERENCE FRAMES (1 total):
> 0: AVG
> Active reference frame for distance-based masks is 'AVG'
>
> BEGIN TRAJECTORY PROCESSING:
> .....................................................
> ACTION SETUP FOR PARM 'proteina.prmtop' (3 actions):
> 0: [rms first :1-262&!@H=]
> Target mask: [:1-262&!@H*](2039)
> Reference mask: [:1-262&!@H*](2039)
> Warning: Coordinates are being rotated and box coordinates are present.
> Warning: Unit cell vectors are NOT rotated; imaging will not be possible
> Warning: after the RMS-fit is performed.
> 1: [average crdset AVG]
> Mask [*] corresponds to 4086 atoms.
> Averaging over 4086 atoms.
> 2: [createcrd proteina-trajectories]
> Estimated memory usage (534 frames): 26.196 MB
> ----- proteina.mdcrd (1-534, 1) -----
> 0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% Complete.
>
> Read 534 frames and processed 534 frames.
> TIME: Avg. throughput= 6141.8844 frames / second.
>
> ACTION OUTPUT:
> AVERAGE: 534 frames, COORDS set 'AVG'
>
> DATASETS (2 total):
> RMSD_00001 "RMSD_00001" (double, rms), size is 534
> proteina-trajectories "proteina-trajectories" (coordinates), size is
> 534 (26.196 MB) Box Coords, 4086 atoms
>
> RUN TIMING:
> TIME: Init : 0.0000 s ( 0.05%)
> TIME: Trajectory Process : 0.0869 s ( 98.67%)
> TIME: Action Post : 0.0008 s ( 0.93%)
> TIME: Analysis : 0.0000 s ( 0.00%)
> TIME: Data File Write : 0.0001 s ( 0.06%)
> TIME: Other : 0.0003 s ( 0.00%)
> TIME: Run Total 0.0881 s
> ---------- RUN END ---------------------------------------------------
> [crdaction proteina-trajectories rms ref AVG :1-262&!@H=]
> Using set 'proteina-trajectories'
> ----- proteina-trajectories (1-534, 1) -----
> RMSD: (:1-262&!@H*), reference is "AVG" (:1-262&!@H*).
> Best-fit RMSD will be calculated, coords will be rotated and translated.
> Target mask: [:1-262&!@H*](2039)
> Reference mask: [:1-262&!@H*](2039)
> Warning: Coordinates are being rotated and box coordinates are present.
> Warning: Unit cell vectors are NOT rotated; imaging will not be possible
> Warning: after the RMS-fit is performed.
> 0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% Complete.
> TIME: Total action execution time: 0.0443 seconds.
> [crdaction proteina-trajectories matrix covar name cpu-gpu-covar
> :1-262&!@H=]
> Using set 'proteina-trajectories'
> ----- proteina-trajectories (1-534, 1) -----
> MATRIX: Calculating covariance matrix, output is by atom.
> Matrix data set is 'cpu-gpu-covar'
> Start: 1 Stop: Final frame
> Mask1 is ':1-262&!@H*'
> Mask [:1-262&!@H*] corresponds to 2039 atoms.
> 0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% Complete.
> TIME: Total action execution time: 12.7291 seconds.
> [runanalysis diagmatrix cpu-gpu-covar out cpu-gpu-evecs.dat vecs 3 name
> myEvecs nmwiz nmwizvecs 3 nmwizfile dna.nmd nmwizmask :1-262&!@H=]
> Mask [:1-262&!@H*] corresponds to 2039 atoms.
> nmwiz topology 2039 atoms, 262 res, box: Orthogonal, 0 mol
> Changed DataFile 'cpu-gpu-evecs.dat' type to Evecs file for set myEvecs
> DIAGMATRIX: Diagonalizing matrix cpu-gpu-covar and writing modes to
> cpu-gpu-evecs.dat
> Calculating 3 eigenvectors.
> Writing 3 modes to NMWiz file dna.nmd Storing modes with name:
> myEvecs
> Eigenmode calculation for 'cpu-gpu-covar'
> Warning: In matrix 'cpu-gpu-covar', # of frames 534 is less than # of
> columns 6117.
> Warning: The max # of non-zero eigenvalues will be 534
> Calculating eigenvectors and eigenvalues.
> Calculating first 3 eigenmodes.
> TIME: Total analysis execution time: 2.3395 seconds.
> [crdaction proteina-trajectories projection CPU modes myEvecs beg 1 end 3
> :1-262&!@H= crdframes 1,534]
> Using set 'proteina-trajectories'
> ----- proteina-trajectories (1-534, 1) -----
> PROJECTION: Calculating projection using eigenvectors 1 to 3 of myEvecs
> Start: 1 Stop: Final frame
> Atom Mask: [:1-262&!@H*]
> Mask [:1-262&!@H*] corresponds to 2039 atoms.
> 0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% Complete.
> TIME: Total action execution time: 0.0168 seconds.
> [hist CPU:1 bins 100 out cpu-gpu-hist.agr norm name CPU-1]
> Hist: cpu-gpu-hist.agr: Set up for 1 dimensions using the following
> datasets:
> [ Mode1 ]
> norm: Sum over bins will be normalized to 1.0.
> [hist CPU:2 bins 100 out cpu-gpu-hist.agr norm name CPU-2]
> Hist: cpu-gpu-hist.agr: Set up for 1 dimensions using the following
> datasets:
> [ Mode2 ]
> norm: Sum over bins will be normalized to 1.0.
> [hist CPU:3 bins 100 out cpu-gpu-hist.agr norm name CPU-3]
> Hist: cpu-gpu-hist.agr: Set up for 1 dimensions using the following
> datasets:
> [ Mode3 ]
> norm: Sum over bins will be normalized to 1.0.
> [hist GPU:1 bins 100 out cpu-gpu-hist.agr norm name GPU-1]
> Warning: Data set 'GPU:1' not found.
> Error: Dataset GPU:1 not found.
> Error: Could not setup analysis [hist]
> 1 errors encountered reading input.
> TIME: Total execution time: 15.3043 seconds.
> Error: Error(s) occurred during execution.
>
>
>
> Marcelo Andrade Chagas, MSc
> (PhD student)
> Laboratório de Química Computacional e Modelagem Molecular - LQC-MM
> * http://lqcmm.qui.ufmg.br/
> Departamento de Química da Universidade Federal de Minas Gerais - UFMG
> Tel:(31)3409-5776
>
> 2017-07-03 10:36 GMT-03:00 Daniel Roe <daniel.r.roe.gmail.com>:
>
>> Hi,
>>
>> I think the issue here is that you are loading your topology on the
>> command line, but then also loading it in your script and giving it a
>> tag ([cpu]). Try just running:
>>
>> cpptraj -i pca-cpu-gpu.cpptraj
>>
>> Does that help?
>>
>> -Dan
>>
>>
>> On Sat, Jul 1, 2017 at 1:40 PM, Marcelo Andrade Chagas
>> <andrade.mchagas.gmail.com> wrote:
>> > Dear Community Users of AMBER,
>> >
>> > I need to do a PCA analysis.
>> >
>> > I got the following error in cpptraj.
>> >
>> > From what I can tell, it is related to creating a file.
>> >
>> > Which simulation output files is this file created from,
>> > and how?
>> >
>> > Regards,
>> >
>> > Marcelo
>> >
>> > I am following this tutorial:
>> >
>> > http://www.amber.utah.edu/AMBER-workshop/London-2015/pca/
>> >
>> > marcelo.marcelo:~/TRABALHO-ADOLFO/TRABALHO-I-ADOLFO/ARQUIVOS-DM-PROT-II/DM-5GOZ/ANALISE/TESTE-PCA/TESTE-PCA-II/PCA-TESTE-II/PCA-TESTE-III$
>> > cpptraj -i pca-cpu-gpu.cpptraj -p proteina.prmtop
>> >
>> > CPPTRAJ: Trajectory Analysis. V16.16
>> > ___ ___ ___ ___
>> > | \/ | \/ | \/ |
>> > _|_/\_|_/\_|_/\_|_
>> >
>> > | Date/time: 07/01/17 14:34:04
>> > | Available memory: 3.810 GB
>> >
>> > Reading 'proteina.prmtop' as Amber Topology
>> > INPUT: Reading input from 'pca-cpu-gpu.cpptraj'
>> > [parm proteina.prmtop [cpu]]
>> > Reading 'proteina.prmtop' as Amber Topology
>> > [trajin proteina.mdcrd parm [cpu]]
>> > Reading 'proteina.mdcrd' as Amber NetCDF
>> > [rms first :1-262&!@H=]
>> > RMSD: (:1-262&!@H*), reference is first frame (:1-262&!@H*).
>> > Best-fit RMSD will be calculated, coords will be rotated and translated.
>> > [average crdset AVG]
>> > Setting active reference for distance-based masks: 'AVG'
>> > AVERAGE: Averaging over coordinates in mask [*]
>> > Start: 1 Stop: Final frame
>> > Saving averaged coords to set 'AVG'
>> > [createcrd proteina-trajectories]
>> > CREATECRD: Saving coordinates from Top proteina.prmtop to
>> > "proteina-trajectories"
>> > [run]
>> > ---------- RUN BEGIN -------------------------------------------------
>> >
>> > PARAMETER FILES (2 total):
>> > 0: proteina.prmtop, 4086 atoms, 262 res, box: Orthogonal, 1 mol
>> > 1: [cpu] proteina.prmtop, 4086 atoms, 262 res, box: Orthogonal, 1 mol
>> >
>> > INPUT TRAJECTORIES (1 total):
>> > 0: 'proteina.mdcrd' is a NetCDF AMBER trajectory, Parm proteina.prmtop
>> > (Orthogonal box) (reading 534 of 534)
>> > Coordinate processing will occur on 534 frames.
>> >
>> > REFERENCE FRAMES (1 total):
>> > 0: AVG
>> > Active reference frame for distance-based masks is 'AVG'
>> >
>> > BEGIN TRAJECTORY PROCESSING:
>> > .....................................................
>> > ACTION SETUP FOR PARM 'proteina.prmtop' (3 actions):
>> > 0: [rms first :1-262&!@H=]
>> > Target mask: [:1-262&!@H*](2039)
>> > Reference mask: [:1-262&!@H*](2039)
>> > Warning: Coordinates are being rotated and box coordinates are present.
>> > Warning: Unit cell vectors are NOT rotated; imaging will not be possible
>> > Warning: after the RMS-fit is performed.
>> > 1: [average crdset AVG]
>> > Mask [*] corresponds to 4086 atoms.
>> > Averaging over 4086 atoms.
>> > 2: [createcrd proteina-trajectories]
>> > Error: # atoms in current topology (4086) != # atoms in coords set
>> > "proteina-trajectories" (0)
>> > Error: Setup failed for [createcrd]
>> > Warning: Could not set up actions for proteina.prmtop: skipping.
>> > Read 0 frames and processed 0 frames.
>> > TIME: Avg. throughput= 0.0000 frames / second.
>> >
>> > ACTION OUTPUT:
>> >
>> > DATASETS (2 total):
>> > RMSD_00002 "RMSD_00002" (double, rms), size is 0
>> > proteina-trajectories "proteina-trajectories" (coordinates), size is 0 (0.024 kB) 0 atoms
>> >
>> > RUN TIMING:
>> > TIME: Init : 0.0000 s ( 1.96%)
>> > TIME: Trajectory Process : 0.0012 s ( 75.27%)
>> > TIME: Action Post : 0.0000 s ( 0.00%)
>> > TIME: Analysis : 0.0000 s ( 0.00%)
>> > TIME: Data File Write : 0.0000 s ( 2.28%)
>> > TIME: Other : 0.0003 s ( 0.20%)
>> > TIME: Run Total 0.0016 s
>> > ---------- RUN END ---------------------------------------------------
>> > [crdaction proteina-trajectories rms ref AVG :1-262&!@H=]
>> > Using set 'proteina-trajectories'
>> > Error: trajectory contains no frames.
>> > 1 errors encountered reading input.
>> > TIME: Total execution time: 0.0959 seconds.
>> > Error: Error(s) occurred during execution.
>> >
>> > Marcelo Andrade Chagas, MSc
>> > (PhD student)
>> > Laboratório de Química Computacional e Modelagem Molecular - LQC-MM
>> > * http://lqcmm.qui.ufmg.br/
>> > Departamento de Química da Universidade Federal de Minas Gerais - UFMG
>> > Tel:(31)3409-5776
>> > _______________________________________________
>> > AMBER mailing list
>> > AMBER.ambermd.org
>> > http://lists.ambermd.org/mailman/listinfo/amber
>>
>>
>>
>> --
>> -------------------------
>> Daniel R. Roe
>> Laboratory of Computational Biology
>> National Institutes of Health, NHLBI
>> 5635 Fishers Ln, Rm T900
>> Rockville MD, 20852
>> https://www.lobos.nih.gov/lcb
>>



-- 
-------------------------
Daniel R. Roe
Laboratory of Computational Biology
National Institutes of Health, NHLBI
5635 Fishers Ln, Rm T900
Rockville MD, 20852
https://www.lobos.nih.gov/lcb
Received on Wed Jul 05 2017 - 06:00:05 PDT