Re: [AMBER] Running PCA

From: Nicole Ippolito <ippolito.nicole89.gmail.com>
Date: Wed, 30 Jul 2014 08:46:55 -0500

So I thought it was working but it keeps crashing asking for a readdata
command for the evecs.dat file. I entered it but it did not change the
error. If I do need to enter this command where should I place it?

Here is the trajin file.

trajin ~/BRAF/MD/V600E505H/sor/V600E505H_sor_aMD_0_300ns.crd.gz
# Fit to first frame for averaging
rms first .CA
# Average the trajectory
average V600E505H_sor_aMD_avg_pdb.pdb pdb
# Save the rms-fit coords
createcrd crd1
# First run
run
# Load the generated average as reference
reference V600E505H_sor_aMD_avg_pdb.pdb
# Fit saved coords to the reference
crdaction crd1 rms reference .CA
# Calculate covariance matrix from fit coords
crdaction crd1 matrix out matrix.dat name mymatrix covar .CA
# Diagonalize matrix to get eigenvectors/eigenvalues
runanalysis diagmatrix mymatrix out evecs.dat vecs 20 name mymode
# Project fit coords along eigenvectors 1-3
crdaction crd1 projection mymodes evecs.dat out projection1-3.out start 1
stop 3 .CA myproj
# Set up histograms for PC projections
hist myproj:1 bins 200 out PC.hist.agr
hist myproj:2 bins 200 out PC.hist.agr
hist myproj:3 bins 200 out PC.hist.agr
# Run analyses
runanalysis






CPPTRAJ: Trajectory Analysis. V14.01
    ___ ___ ___ ___
     | \/ | \/ | \/ |
    _|_/\_|_/\_|_/\_|_
        Reading 'parm' as Amber Topology
INPUT: Reading Input from STDIN
  [trajin ~/BRAF/MD/V600E505H/sor/V600E505H_sor_aMD_0_300ns.crd.gz]
        Reading
'/home/aubndi/BRAF/MD/V600E505H/sor/V600E505H_sor_aMD_0_300ns.crd.gz' as
Amber Trajectory
  [rms first .CA]
    RMSD: (.CA), reference is first frame (.CA), with fitting.
  [average V600E505H_sor_aMD_avg_pdb.pdb pdb]
    AVERAGE: Averaging over coordinates in mask [*]
        Start: 1 Stop: Final frame
        Writing averaged coords to [V600E505H_sor_aMD_avg_pdb.pdb]
  [createcrd crd1]
    CREATECRD: Saving coordinates from Top parm to "crd1"
  [run]
---------- RUN BEGIN -------------------------------------------------

PARAMETER FILES:
 0: 'parm', 36610 atoms, 11002 res, box: Orthogonal, 10731 mol, 10722
solvent, 60000 frames

INPUT TRAJECTORIES:
 0: 'V600E505H_sor_aMD_0_300ns.crd.gz' is an AMBER trajectory, Parm parm
(Orthogonal box) (reading 60000 of 60000)
  Coordinate processing will occur on 60000 frames.
TIME: Run Initialization took 0.0000 seconds.

BEGIN TRAJECTORY PROCESSING:
.....................................................
ACTION SETUP FOR PARM 'parm' (3 actions):
  0: [rms first .CA]
        Mask [.CA] corresponds to 272 atoms.
        Mask [.CA] corresponds to 272 atoms.
Warning: Coordinates are being rotated and box coordinates are present.
Warning: Unit cell vectors are NOT rotated; imaging will not be possible
Warning: after the RMS-fit is performed.
  1: [average V600E505H_sor_aMD_avg_pdb.pdb pdb]
        Averaging over 36610 atoms.
  2: [createcrd crd1]
----- V600E505H_sor_aMD_0_300ns.crd.gz (1-60000, 1) -----
 0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% Complete.

Read 60000 frames and processed 60000 frames.
TIME: Trajectory processing: 2001.0162 s
TIME: Avg. throughput= 29.9848 frames / second.

ACTION OUTPUT:
    AVERAGE: [V600E505H_sor_aMD_avg_pdb.pdb pdb]
        Writing 'V600E505H_sor_aMD_avg_pdb.pdb' as PDB
  'V600E505H_sor_aMD_avg_pdb.pdb' is a PDB file, Parm parm: Writing 60000
frames (1-Last, 1)

DATASETS:
  2 data sets:
        RMSD_00000 "RMSD_00000" (double, rms), size is 60000
        crd1 "crd1" (coordinates), size is 60000 (25138 MB) Box Coords,
36610 atoms
---------- RUN END ---------------------------------------------------
  [reference V600E505H_sor_aMD_avg_pdb.pdb]
        Reading 'V600E505H_sor_aMD_avg_pdb.pdb' as PDB
        'V600E505H_sor_aMD_avg_pdb.pdb' is a PDB file, Parm parm (reading 1
of 1)
  [crdaction crd1 rms reference .CA]
        Mask [.CA] corresponds to 272 atoms.
    RMSD: (.CA), reference is reference frame V600E505H_sor_aMD_avg_pdb.pdb
(.CA), with fitting.
        Mask [.CA] corresponds to 272 atoms.
Warning: Coordinates are being rotated and box coordinates are present.
Warning: Unit cell vectors are NOT rotated; imaging will not be possible
Warning: after the RMS-fit is performed.
 0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% Complete.
TIME: Total action execution time: 69.4644 seconds.
  [crdaction crd1 matrix out matrix.dat name mymatrix covar .CA]
    MATRIX: Calculating covariance matrix, output is by atom
            Printing to file matrix.dat
            Storing matrix on internal stack with name: mymatrix
        Start: 1 Stop: Final frame
            Mask1: .CA
        Mask [.CA] corresponds to 272 atoms.
 0% 10% 20% 30% 40% 50% 60% 70% 80% 90% Error: No eigenvectors data set
specified ('evecs <name>'). To load
Error: eigenvectors from a file use 'readdata <file>' prior to this
command.
        1 errors encountered reading input.
100% Complete.
TIME: Total action execution time: 102.7419 seconds.
  [runanalysis diagmatrix mymatrix out evecs.dat vecs 20 name mymode]
        Changed DataFile 'evecs.dat' type to Evecs file for set mymode
    DIAGMATRIX: Diagonalizing matrix mymatrix and writing modes to evecs.dat
        Calculating 20 eigenvectors
        Storing modes with name: mymode
TIME: Total analysis execution time: 0.2025 seconds.
  [readdata evecs.dat]
        Reading 'evecs.dat' as Evecs file with name 'evecs.dat'
        Reading modes from evecs.dat
        File contains 20 modes.
Warning: # modes to read (50) > modes in file. Only reading 20 modes.
  [crdaction crd1 projection mymodes evecs.dat out projection1-3.out start
1 stop 3 .CA myproj]
TIME: Total execution time: 2174.0333 seconds.



Thank you for your help,

Nicole


On Tue, Jul 29, 2014 at 10:07 AM, Nicole Ippolito <
ippolito.nicole89.gmail.com> wrote:

> Thank you for the detailed explanation and the reference. I believe it is
> working.
>
> Nicole
>
>
> On Tue, Jul 29, 2014 at 9:31 AM, Daniel Roe <daniel.r.roe.gmail.com>
> wrote:
>
>> PS - For a concrete example of performing PCA with cpptraj see
>> http://pubs.acs.org/doi/abs/10.1021/jp4125099 and supporting material
>> therein.
>>
>> -Dan
>>
>>
>> On Tue, Jul 29, 2014 at 8:16 AM, Daniel Roe <daniel.r.roe.gmail.com>
>> wrote:
>>
>> > Hi,
>> >
>> > You're missing some elements from your script, and have a few things out
>> > of order. By default in cpptraj when you specify an action or an
>> analysis
>> > it goes into either the action list or analysis list respectively; it is
>> > not immediately executed. I will go through your input and try to
>> explain
>> > what's happening:
>> >
>> > On Fri, Jul 25, 2014 at 11:56 AM, Nicole Ippolito <
>> > ippolito.nicole89.gmail.com> wrote:
>> >
>> >> trajin ~/BRAF/MD/V600E505H/sor/V600E505H_sor_aMD_0_300ns.crd.gz
>> >> createcrd crd1
>> >> run
>> >>
>> >
>> > Here you are essentially loading a trajectory into the coords data set
>> > named 'crd1'. The 'run' statement means "process all currently loaded
>> > trajectories with any initialized actions", and 'createcrd' is the only
>> > action set up here.
>> >
>> >
>> >> rms first .CA
>> >> reference V600E505H_sor_aMD_avg_pdb.pdb
>> >>
>> >
>> > These statements seem out of order. They are placed after the run
>> > statement, and so they are not processed until the first 'run' is
>> > completed. Presumably your intent was to remove overall rotation and
>> > translation from your coordinates with the 'rms' statement. If that is
>> the
>> > case then it should come before the createcrd statement. Also, since you
>> > also have a later 'rms' command that fits to a reference, this makes me
>> > think this first 'rms' was for creating an average structure to use as a
>> > reference, and the 'average' command is missing.
>> >
>> > The 'reference' command is actually what kills the run: if you check
>> your
>> > output you will see the message "Error: reference: Could not set up
>> > trajectory". This indicates that "V600E505H_sor_aMD_avg_pdb.pdb" does
>> not
>> > exist; perhaps you meant to create an average?
>> >
>> >
>> >> crdaction crd1 rms reference .CA
>> >> crdaction crd1 matrix out matrix.dat name mymatrix covar .CA
>> >>
>> > runanalysis diagmatrix mymatrix out evecs.dat vecs 20 name mymodes
>> >>
>> >
>> > These statements seem fine as long as a reference is successfully loaded
>> > and your 'crd1' coordinates is properly loaded.
>> >
>> >
>> >> crdaction crd1 projection modes evecs.dat start 1 stop 3 .CA myproj
>> >>
>> >
>> > This will not work, since you named your modes internally as 'mymodes'
>> but
>> > you're trying to use the file name 'evecs.dat'. Change 'evecs.dat' to
>> > 'mymodes'.
>> >
>> >
>> >> hist myproj:1 bins 200 out PC.hist.agr
>> >> hist myproj:2 bins 200 out PC.hist.agr
>> >>
>> >
>> > This seems like it should work as well.
>> >
>> > So to summarize, your final script should look something like this:
>> >
>> > trajin ~/BRAF/MD/V600E505H/sor/V600E505H_sor_aMD_0_300ns.crd.gz
>> > # Fit to first frame for averaging
>> > rms first .CA
>> > # Average the trajectory
>> > average V600E505H_sor_aMD_avg_pdb.pdb pdb
>> > # Save the rms-fit coords
>> > createcrd crd1
>> > # First run
>> > run
>> > # Load the generated average as reference
>> > reference V600E505H_sor_aMD_avg_pdb.pdb
>> > # Fit saved coords to the reference
>> > crdaction crd1 rms reference .CA
>> > # Calculate covariance matrix from fit coords
>> > crdaction crd1 matrix out matrix.dat name mymatrix covar .CA
>> > # Diagonalize matrix to get eigenvectors/eigenvalues
>> > runanalysis diagmatrix mymatrix out evecs.dat vecs 20 name mymodes
>> > # Project fit coords along eigenvectors 1-3
>> > crdaction crd1 projection modes evecs.dat start 1 stop 3 .CA myproj
>> > # Set up histograms for PC projections
>> > hist myproj:1 bins 200 out PC.hist.agr
>> > hist myproj:2 bins 200 out PC.hist.agr
>> > hist myproj:3 bins 200 out PC.hist.agr
>> > # Run analyses
>> > runanalysis
>> >
>> > Note that the final 'runanalysis' is not strictly necessary as cpptraj
>> > will try to run any actions or analyses that are set up before quitting
>> > when reading a script, but I include it for completeness.
>> >
>> > Hope this helps,
>> >
>> > -Dan
>> >
>> >
>> >
>> >
>> >>
>> >>
>> >> CPPTRAJ: Trajectory Analysis. V14.01
>> >> ___ ___ ___ ___
>> >> | \/ | \/ | \/ |
>> >> _|_/\_|_/\_|_/\_|_
>> >> Reading 'parm' as Amber Topology
>> >> INPUT: Reading Input from STDIN
>> >> [trajin ~/BRAF/MD/V600E505H/sor/V600E505H_sor_aMD_0_300ns.crd.gz ]
>> >> Reading
>> >> '/home/aubndi/BRAF/MD/V600E505H/sor/V600E505H_sor_aMD_0_300ns.crd.gz'
>> as
>> >> Amber Trajectory
>> >> [createcrd crd1 ]
>> >> CREATECRD: Saving coordinates from Top parm to "crd1"
>> >> [run]
>> >> ---------- RUN BEGIN -------------------------------------------------
>> >>
>> >> PARAMETER FILES:
>> >> 0: 'parm', 36610 atoms, 11002 res, box: Orthogonal, 10731 mol, 10722
>> >> solvent, 60000 frames
>> >>
>> >> INPUT TRAJECTORIES:
>> >> 0: 'V600E505H_sor_aMD_0_300ns.crd.gz' is an AMBER trajectory, Parm
>> parm
>> >> (Orthogonal box) (reading 60000 of 60000)
>> >> Coordinate processing will occur on 60000 frames.
>> >> TIME: Run Initialization took 0.0000 seconds.
>> >>
>> >> BEGIN TRAJECTORY PROCESSING:
>> >> .....................................................
>> >> ACTION SETUP FOR PARM 'parm' (1 actions):
>> >> 0: [createcrd crd1 ]
>> >> ----- V600E505H_sor_aMD_0_300ns.crd.gz (1-60000, 1) -----
>> >> 0% 10% 20% 30% 40% 50% 60% 70% 80% 90% Error: reference: Could not
>> set up
>> >> trajectory.
>> >> 1 errors encountered reading input.
>> >> 100% Complete.
>> >>
>> >> Read 60000 frames and processed 60000 frames.
>> >> TIME: Trajectory processing: 1743.3814 s
>> >> TIME: Avg. throughput= 34.4159 frames / second.
>> >>
>> >> ACTION OUTPUT:
>> >>
>> >> DATASETS:
>> >> 1 data set:
>> >> crd1 "crd1" (coordinates), size is 60000 (25138 MB) Box Coords,
>> >> 36610 atoms
>> >> ---------- RUN END ---------------------------------------------------
>> >> [rms first .CA]
>> >> RMSD: (.CA), reference is first frame (.CA), with fitting.
>> >> [reference V600E505H_sor_aMD_avg_pdb.pdb]
>> >> TIME: Total execution time: 1743.8619 seconds.
>> >> _______________________________________________
>> >> AMBER mailing list
>> >> AMBER.ambermd.org
>> >> http://lists.ambermd.org/mailman/listinfo/amber
>> >>
>> >
>> >
>> >
>> > --
>> > -------------------------
>> > Daniel R. Roe, PhD
>> > Department of Medicinal Chemistry
>> > University of Utah
>> > 30 South 2000 East, Room 201
>> > Salt Lake City, UT 84112-5820
>> > http://home.chpc.utah.edu/~cheatham/
>> > (801) 587-9652
>> > (801) 585-6208 (Fax)
>> >
>>
>>
>>
>> --
>> -------------------------
>> Daniel R. Roe, PhD
>> Department of Medicinal Chemistry
>> University of Utah
>> 30 South 2000 East, Room 201
>> Salt Lake City, UT 84112-5820
>> http://home.chpc.utah.edu/~cheatham/
>> (801) 587-9652
>> (801) 585-6208 (Fax)
>> _______________________________________________
>> AMBER mailing list
>> AMBER.ambermd.org
>> http://lists.ambermd.org/mailman/listinfo/amber
>>
>
>
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Wed Jul 30 2014 - 07:00:02 PDT
Custom Search