Re: [AMBER] PCA with a really giant trajectory

From: David Cerutti <dscerutti.gmail.com>
Date: Tue, 28 May 2019 13:49:15 -0400

Indeed, I may be pushed to sampling the trajectory less frequently, and
with the length scales I can now go to, saving every 1 ns isn't out of the
question. Given disk space limitations, I may need to do
that anyway. But I was hoping that there was some workaround, because in
principle the coordinates do not all need to be in RAM at once for PCA
computations.
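
A minimal sketch of that point, in Python with NumPy rather than cpptraj:
the mean and covariance matrix can be accumulated chunk by chunk, and the
projections computed in a second chunked pass, so only one chunk is in
memory at a time. The class and function names are hypothetical, the frames
are assumed to be already aligned and flattened to rows of length 3N, and
the covariance is normalized by the number of frames (an assumption; the
normalization does not change the eigenvectors).

```python
import numpy as np


class StreamingCovariance:
    """Accumulate mean and covariance from chunks of frames (hypothetical helper)."""

    def __init__(self, n_coords):
        self.n = 0
        self.sum_x = np.zeros(n_coords)
        self.sum_xxT = np.zeros((n_coords, n_coords))

    def update(self, chunk):
        # chunk: (n_frames_in_chunk, n_coords), already aligned to a reference
        chunk = np.asarray(chunk, dtype=np.float64)
        self.n += chunk.shape[0]
        self.sum_x += chunk.sum(axis=0)
        self.sum_xxT += chunk.T @ chunk

    def finalize(self):
        # cov = <x x^T> - <x><x>^T, normalized by N (assumption)
        mean = self.sum_x / self.n
        cov = self.sum_xxT / self.n - np.outer(mean, mean)
        return mean, cov


def top_modes(cov, n_modes):
    # eigh returns eigenvalues in ascending order; keep the largest n_modes
    evals, evecs = np.linalg.eigh(cov)
    order = np.argsort(evals)[::-1][:n_modes]
    return evals[order], evecs[:, order]


def project(chunk, mean, modes):
    # Projection of each frame's deviation from the mean onto each mode
    return (np.asarray(chunk, dtype=np.float64) - mean) @ modes


# Synthetic stand-in for a trajectory: 1000 frames, 6 coordinates
rng = np.random.default_rng(0)
data = rng.normal(size=(1000, 6)) * np.array([5.0, 3.0, 1.0, 1.0, 1.0, 1.0]) + 10.0
chunk_size = 100

# Pass 1: accumulate statistics one chunk at a time
acc = StreamingCovariance(data.shape[1])
for start in range(0, data.shape[0], chunk_size):
    acc.update(data[start:start + chunk_size])
mean, cov = acc.finalize()
evals, modes = top_modes(cov, 2)

# Pass 2: project each chunk onto the top two modes
proj = np.vstack([
    project(data[start:start + chunk_size], mean, modes)
    for start in range(0, data.shape[0], chunk_size)
])
```

The sum-of-products formula is simple but can lose precision when the mean
is large compared with the fluctuations; subtracting a fixed reference
structure from every frame before accumulating is one way to reduce that.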

Dave


On Mon, May 27, 2019 at 3:29 AM Chris Neale <candrewn.gmail.com> wrote:

> Dear David:
>
> I realize that I am not directly answering your question, but do you really
> need to run PCA on a trajectory that is saved every 50 ps? That seems
> awfully frequent for a long trajectory. In many situations, I suspect that
> the same correlated motions will be picked up with a larger step size
> between frames. You might test this equivalence by running PCA with a skip
> 1000, skip 100, and skip 10 version (or whatever you can afford) and then
> comparing the change in top-ranked eigenvectors as a function of dt. If
> your argument is that you absolutely must see the result when saved every
> 50 ps, then what is the argument that you do not need to run PCA on frames
> saved every 5 ps? Sorry that I do not have any direct suggestions for code
> modifications.
>
> Thank you,
> Chris.
>
> On Sat, May 25, 2019 at 1:32 PM David Cerutti <dscerutti.gmail.com> wrote:
>
> > Can anyone on the list help me do PCA with a really giant trajectory? I
> > have 35GB now and will have more than 200GB by the time this is all done.
> > The frames are being saved every 50 ps and this is an implicit solvent
> > trajectory, so it's not going to be much use to strip the coordinates or
> > reduce the output rate if I want to keep gathering the relevant data. What
> > I'm working with is a file that looks like this, taken from the cpptraj
> > manual:
> >
> > trajin ../Trajectory/md1_1.cdf
> > ( ... many more trajin commands ...)
> > trajin ../Trajectory/md112_22.cdf
> > rms first !@H=
> > average crdset AVG
> > run
> > rms ref AVG !@H=
> > matrix covar name MyMatrix !@H=
> > createcrd CRD1
> > run
> > runanalysis diagmatrix MyMatrix vecs 2 name MyEvecs
> > crdaction CRD1 projection evecs MyEvecs !@H= out project.dat beg 1 end 2
> > go
> >
> > The problem, it seems, is that createcrd CRD1 line. When doing that, it
> > commits all coords to memory, limiting the amount of trajectory you can
> > analyze to the amount of system RAM. Otherwise, it seems that I can
> > compute the average positions AVG and compose the covariance matrix while
> > reading each frame from disk, without storing the entire trajectory in RAM.
> >
> > I believe that if I had a way to store the matrix (which cpptraj provides)
> > and then READ IT BACK IN, I could compose the covariance matrix for the
> > entire trajectory and save the average coordinates. I could then read
> > segments of the trajectory, read back the averaged coordinates, load the
> > matrix and diagonalize it, align each frame from the trajectory segment to
> > the average from the complete trajectory, and calculate each frame's
> > projection onto the matrix eigenvectors.
> >
> > The only other alternative I could see here would be to use cpptraj to
> > compute the averaged coordinates and save them along with the covariance
> > matrix. Matlab could diagonalize the matrix and give me the eigenvectors.
> > I could then proceed segment by segment, using cpptraj to align the
> > trajectory coordinates to that average and write the aligned coordinates
> > back to disk. A secondary jiffy program could then read the aligned
> > coordinates and compare them to each eigenvector to calculate the
> > projections and thus give me the PCA. It would be a duct-tape solution,
> > but one that is possible if that's what I need to do.
> >
> > Cheers,
> > Dave
> > _______________________________________________
> > AMBER mailing list
> > AMBER.ambermd.org
> > http://lists.ambermd.org/mailman/listinfo/amber
> >
Received on Tue May 28 2019 - 11:00:04 PDT