Re: [AMBER] cpptraj: structures not aligned anymore after 'run' keyword

From: Jason Swails <jason.swails.gmail.com>
Date: Wed, 14 Jan 2015 22:03:50 -0500

On Wed, Jan 14, 2015 at 8:06 AM, Guillaume Roellinger <roellinger.bio.mx>
wrote:

> Dear all,
>
> During a PCA with cpptraj, for some reasons after the keyword 'run', the
> structures from the MD trajectory are not aligned anymore even if I aligned
> them with the keyword 'rms' before the 'run'. In case of PCA, this leads to
> a wrong result for the projection: unaligned snapshots are projected onto
> the PCs from evecs.dat.
>
> I solved the problem by adding 'rms RMSD2 reference out RMSD2.dat @CA'
> just after the 'run' and before 'readdata' but I wonder: Is this the normal
> behavior of cpptraj to get back to the unaligned frames after the 'run'
> keyword?
>

​Yes. I'll explain below. It helps to understand cpptraj's control flow
here.

-----------------------
>
> PCA.in
>
> -----------------------
>
> parm myParm.prmtop
>
> trajin myMDtraj.mdcrd parmindex 0
> parm averageStructureCA.prmtop
> reference averageStructureCA.pdb parmindex 1
> rms RMSD reference out RMSD.dat @CA
> matrix covar name cvmat out cvmat.dat @CA
> diagmatrix cvmat out evecs.dat vecs 50 name mymodes
>
> run
>

​OK. So at this point, what cpptraj does is read in each of the frames
from the "trajin" structures one by one. It first passes each frame
through the RMS command (which aligns it), and then "adds" it to the
covariance matrix calculation, then that frame goes away (if you had made a
"trajout" command, it would have been passed to that and been written to a
trajectory file before going away). The aligned structure is never "saved"
anywhere, because you never requested that it be saved anywhere. Once it
is run, these actions have all been "performed", so they are discarded from
the stack of actions that need to be applied to each frame.



> ​​
> readdata evecs.dat
> ​​
> projection modes evecs.dat out project.dat beg 1 end 50 @
> ​​
> CA
>

​OK. Here you read in the eigenvectors of the covariance matrix. Then you
go through the trajectories specified by trajin once again, frame by frame,
and feed them to the "projection" action. Note these are the original
frames from the trajin file and the previous actions have already been
performed and then discarded. In light of this control flow, the frames
are quite clearly not aligned, so another RMSD action is required to align
them.

The only way that you could get *around* re-aligning every structure is if
you somehow saved those structures directly in memory through the first
pass (i.e., up until the first "run" statement), then used those frames in
memory for your second pass. This would save having to do the RMSD
calculation a second time, but would cost you the RAM required to save
those coordinates in memory.

It turns out, cpptraj actually allows you to do this. There is a "loadcrd"
action that behaves like "trajin", except it loads all of the frames to
memory as a data set (there is a "combinecrd" command that will allow you
to combine the coordinates from multiple trajectory files into a *single*
coordinate data set if you needed it). Then you can invoke every action
(like rms and matrix) as a crdaction:

parm myParm.prmtop
​​
​loadcrd myMDtraj.mdcrd parmindex 0 crds​
parm averageStructureCA.prmtop
reference averageStructureCA.pdb parmindex 1
crdaction crds rms RMSD reference out RMSD.dat @CA
crdaction crds matrix covar name cvmat out cvmat.dat @CA
diagmatrix cvmat out evecs.dat vecs 50 name mymodes

run

readdata evecs.dat
crdaction crds projection modes evecs.dat out project.dat beg 1 end 50 @CA

In this case, "crdaction" is telling cpptraj to treat each action as an
"analysis" over a set of coordinate frames stored in a dataset rather than
an action over each frame pulled one at a time from the input
trajectories. It uses a lot more memory (potentially way too much if you
have *too* many frames), but significantly cuts down on the disk I/O (and
RMSD calculations) that are needed for multiple passes through trajectories.

Whew, long-winded, but hopefully it helps. I suspect Dan will chime in if
I missed anything. I'm not sure why, but Gmail tried sending your message
to my spam account (which may be the reason others haven't responded yet).

Good luck,
Jason

-- 
Jason M. Swails
BioMaPS,
Rutgers University
Postdoctoral Researcher
_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber
Received on Wed Jan 14 2015 - 19:30:02 PST
Custom Search