Re: [AMBER] cpptraj: structures not aligned anymore after 'run' keyword

From: Daniel Roe <>
Date: Thu, 15 Jan 2015 09:24:30 -0700


Jason explained things really well. One other item of interest is that
you can also create COORDS data sets from a run (i.e. with 'trajin'
statements etc) using the createcrd command. This allows you do things
like create the average structure and save the coordinates for later
use in a single run, e.g.

# Rms fit to first and create average structure
parm myparm.parm7
rms :1-7&!.H= first
average avgstruct.rst7
createcrd MyCrd
# Now read average structure back in as reference.
# Note the '.1' suffix
reference avgstruct.rst7.1 [avg]
# Fit frames to average structure
crdaction MyCrd rms ref [avg] :1-7&!.H=
# Calculate and diagonalize covariance matrix
crdaction MyCrd matrix covar :1-7&!.H= name myCovar
runanalysis diagmatrix myCovar out evecs.dat vecs 5 name myEvecs
# Now create projections
crdaction MyCrd projection P1 modes myEvecs beg 1 end 5 :1-7&!.H=

Hope this helps as well,


PS - Your message (and Jason's response oddly enough) did indeed get
sent to my Gmail spam folder - I need to check that more often...

On Wed, Jan 14, 2015 at 8:03 PM, Jason Swails <> wrote:
> On Wed, Jan 14, 2015 at 8:06 AM, Guillaume Roellinger <>
> wrote:
>> Dear all,
>> During a PCA with cpptraj, for some reasons after the keyword 'run', the
>> structures from the MD trajectory are not aligned anymore even if I aligned
>> them with the keyword 'rms' before the 'run'. In case of PCA, this leads to
>> a wrong result for the projection: unaligned snapshots are projected onto
>> the PCs from evecs.dat.
>> I solved the problem by adding 'rms RMSD2 reference out RMSD2.dat .CA'
>> just after the 'run' and before 'readdata' but I wonder: Is this the normal
>> behavior of cpptraj to get back to the unaligned frames after the 'run'
>> keyword?
> Yes. I'll explain below. It helps to understand cpptraj's control flow
> here.
> -----------------------
>> -----------------------
>> parm myParm.prmtop
>> trajin myMDtraj.mdcrd parmindex 0
>> parm averageStructureCA.prmtop
>> reference averageStructureCA.pdb parmindex 1
>> rms RMSD reference out RMSD.dat .CA
>> matrix covar name cvmat out cvmat.dat .CA
>> diagmatrix cvmat out evecs.dat vecs 50 name mymodes
>> run
> OK. So at this point, what cpptraj does is read in each of the frames
> from the "trajin" structures one by one. It first passes each frame
> through the RMS command (which aligns it), and then "adds" it to the
> covariance matrix calculation, then that frame goes away (if you had made a
> "trajout" command, it would have been passed to that and been written to a
> trajectory file before going away). The aligned structure is never "saved"
> anywhere, because you never requested that it be saved anywhere. Once it
> is run, these actions have all been "performed", so they are discarded from
> the stack of actions that need to be applied to each frame.
>> readdata evecs.dat
>> projection modes evecs.dat out project.dat beg 1 end 50 .
>> CA
> OK. Here you read in the eigenvectors of the covariance matrix. Then you
> go through the trajectories specified by trajin once again, frame by frame,
> and feed them to the "projection" action. Note these are the original
> frames from the trajin file and the previous actions have already been
> performed and then discarded. In light of this control flow, the frames
> are quite clearly not aligned, so another RMSD action is required to align
> them.
> The only way that you could get *around* re-aligning every structure is if
> you somehow saved those structures directly in memory through the first
> pass (i.e., up until the first "run" statement), then used those frames in
> memory for your second pass. This would save having to do the RMSD
> calculation a second time, but would cost you the RAM required to save
> those coordinates in memory.
> It turns out, cpptraj actually allows you to do this. There is a "loadcrd"
> action that behaves like "trajin", except it loads all of the frames to
> memory as a data set (there is a "combinecrd" command that will allow you
> to combine the coordinates from multiple trajectory files into a *single*
> coordinate data set if you needed it). Then you can invoke every action
> (like rms and matrix) as a crdaction:
> parm myParm.prmtop
> loadcrd myMDtraj.mdcrd parmindex 0 crds
> parm averageStructureCA.prmtop
> reference averageStructureCA.pdb parmindex 1
> crdaction crds rms RMSD reference out RMSD.dat .CA
> crdaction crds matrix covar name cvmat out cvmat.dat .CA
> diagmatrix cvmat out evecs.dat vecs 50 name mymodes
> run
> readdata evecs.dat
> crdaction crds projection modes evecs.dat out project.dat beg 1 end 50 .CA
> In this case, "crdaction" is telling cpptraj to treat each action as an
> "analysis" over a set of coordinate frames stored in a dataset rather than
> an action over each frame pulled one at a time from the input
> trajectories. It uses a lot more memory (potentially way too much if you
> have *too* many frames), but significantly cuts down on the disk I/O (and
> RMSD calculations) that are needed for multiple passes through trajectories.
> Whew, long-winded, but hopefully it helps. I suspect Dan will chime in if
> I missed anything. I'm not sure why, but Gmail tried sending your message
> to my spam account (which may be the reason others haven't responded yet).
> Good luck,
> Jason
> --
> Jason M. Swails
> BioMaPS,
> Rutgers University
> Postdoctoral Researcher
> _______________________________________________
> AMBER mailing list

Daniel R. Roe, PhD
Department of Medicinal Chemistry
University of Utah
30 South 2000 East, Room 307
Salt Lake City, UT 84112-5820
(801) 587-9652
(801) 585-6208 (Fax)
AMBER mailing list
Received on Thu Jan 15 2015 - 08:30:03 PST
Custom Search