- Contemporary messages sorted: [ by date ] [ by thread ] [ by subject ] [ by author ] [ by messages with attachments ]

From: Sruthi Sudhakar <sruthisudhakarraji.gmail.com>

Date: Wed, 16 Jun 2021 09:27:22 +0530

Could someone please suggest on the above issue, I tried rms fitting the 2

trajectories onto one. But it is being difficult and I get the error as

instead of having total frames (traj1+ traj2) it has only total number of

frames as trajectory 1. A help in refining the script would be really

helpful.

Regards,

Sruthi Sudhakar

On Sun, Jun 13, 2021 at 2:46 AM Sruthi Sudhakar <

sruthisudhakarraji.gmail.com> wrote:

*>
*

*> This method of breaking the analysis into stages actually worked and I was
*

*> able to complete the analysis overcoming the memory issues.
*

*> An additional query: Incase we are using 2 different trajectories with
*

*> different number of residues (1530 and 1526) to do a combined trajectory
*

*> analysis, would it be enough to load gamd2.nc (1530 residues) and gamd3.nc (1526
*

*> residues) together? Each has 250000 frames. And in that case do we need to
*

*> fit into a single trajectory (fit.nc) or separately create 2 fit
*

*> trajectory? I am doubtful about the third stage where we load the fit
*

*> trajectory, should we load both the parm files in that case? Also would the
*

*> difference in the total number of residues create any problem?
*

*>
*

*> My sample script following your guidelines would be
*

*> pca1.in
*

*> parm noionsa.prmtop [ag]
*

*> parm noionsb.prmtop [bg]
*

*> trajin gamd2.nc parm [ag]
*

*> trajin gamd3.nc parm [bg]
*

*> # Fit to the first frame to remove global rotation/translation
*

*> rms first :1-1362.CA
*

*> # Write out the restart
*

*> average gaccAvg.rst7 restart
*

*>
*

*> pca2.in
*

*> parm noionsa.prmtop [ag]
*

*> parm noionsb.prmtop [bg]
*

*> trajin gamd2.nc parm [ag]
*

*> trajin gamd3.nc parm [bg]
*

*> # Read in average structure, tag as [avg]
*

*> reference gaccAvg.rst7 [avg]
*

*> # RMS-fit coordinates to average
*

*> rms ref [avg] :1-1362.CA
*

*> # Calculate coordinate covariance matrix
*

*> matrix covar :1-1362.CA name gaccCovar
*

*> # Write out the fit trajectory
*

*> trajout fit.nc
*

*> # Diagonalize coordinate covariance matrix for eigenmodes (only 2)
*

*> diagmatrix gaccCovar out evecs.dat vecs 20 nmwiz nmwizvecs 5
*

*> nmwizfile prot.nmd nmwizmask :1-1362.CA
*

*>
*

*> pca3.in
*

*>
*

*> parm noionsa.prmtop
*

*> parm noionsb.prmtop
*

*> trajin fit.nc
*

*> # Read in modes
*

*> readdata evecs.dat name MyModes
*

*> # Now create separate PC projections for each trajectory
*

*> projection T1 modes MyModes beg 1 end 20 :1-1362.CA start 1 stop 250000
*

*> out T1.dat
*

*> projection T2 modes MyModes beg 1 end 20 :1-1362.CA start 250001 stop
*

*> 500000 out T2.dat
*

*> # Calculate Kullback-Leibler Divergence vs time for PC histograms
*

*> kde T1:1 kldiv T2:1 klout KL-PC1.dat bins 200 name SG-DG-1
*

*> kde T1:2 kldiv T2:2 klout KL-PC2.dat bins 200 name SG-DG-2
*

*> kde T1:1 out kde1-PC1.dat bins 200 name KDE1-1
*

*> kde T2:1 out kde2-PC1.dat bins 200 name KDE2-1
*

*> hist T1:1 bins 200 out T1-hist1.dat normint name HIST1-1
*

*> hist T1:2 bins 200 out T1-hist2.dat normint name HIST1-2
*

*> hist T2:1 bins 200 out T2-hist1.dat normint name HIST2-1
*

*> hist T2:2 bins 200 out T2-hist2.dat normint name HIST2-2
*

*>
*

*> Kindly suggest any corrections required.
*

*> Regards,
*

*> Sruthi Sudhakar
*

*>
*

*>
*

*> On Sat, Jun 12, 2021 at 12:11 PM Sruthi Sudhakar <
*

*> sruthisudhakarraji.gmail.com> wrote:
*

*>
*

*>> This method actually worked and I was able to complete the analysis. An
*

*>> additional query: Incase we are using 2 different trajectories to do a
*

*>> combined trajectory analysis, would it be enough to load gamd2.nc and
*

*>> gamd3.nc together? And in that case do we need to fit into a single
*

*>> trajectory (fit.nc) or separately create 2 fit trajectory?
*

*>>
*

*>> Regards,
*

*>> Sruthi Sudhakar
*

*>>
*

*>>
*

*>> On Wed, Jun 2, 2021 at 11:32 PM Daniel Roe <daniel.r.roe.gmail.com>
*

*>> wrote:
*

*>>
*

*>>> Hi,
*

*>>>
*

*>>> First, if you haven't already I really recommend that you look at some
*

*>>> literature to familiarize yourself with what principal component
*

*>>> analysis actually does. It's important to understand what each part of
*

*>>> the analysis is for. I like the old "essential dynamics" papers, but a
*

*>>> very rough explanation is given in the cpptraj PCA tutorial:
*

*>>>
*

*>>> https://amberhub.chpc.utah.edu/introduction-to-principal-component-analysis/
*

*>>>
*

*>>> Also, read the cpptraj manual so you know what each command is actually
*

*>>> doing.
*

*>>>
*

*>>> That said, I now realize my examples in the manual and the tutorial
*

*>>> both rely on having the trajectories stored in memory. So I will give
*

*>>> a rough outline of how to do it all on-disk.
*

*>>>
*

*>>> 1) Create average coordinates.
*

*>>>
*

*>>> parm noions.prmtop
*

*>>> trajin gamd2.nc
*

*>>> # Fit to the first frame to remove global rotation/translation
*

*>>> rms first :1-1530&!.H=
*

*>>> # Write out the restart
*

*>>> average gaccAvg.rst7 restart
*

*>>>
*

*>>> 2) Rms fit your trajectory to average coordinates, calculate the
*

*>>> covariance matrix, write out the fit trajectory, diagonalize the
*

*>>> matrix and write out the "modes" data (i.e. eigenvectors and
*

*>>> eigenvalues).
*

*>>>
*

*>>> parm noions.prmtop
*

*>>> trajin gamd2.nc
*

*>>> # Read in average structure, tag as [avg]
*

*>>> reference gaccAvg.rst7 [avg]
*

*>>> # RMS-fit coordinates to average
*

*>>> rms ref [avg] :1-1530&!.H=
*

*>>> # Calculate coordinate covariance matrix
*

*>>> matrix covar :1-1530&!.H= name gaccCovar
*

*>>> # Write out the fit trajectory
*

*>>> trajout fit.nc
*

*>>> # Diagonalize coordinate covariance matrix for eigenmodes (only 2)
*

*>>> diagmatrix gaccCovar out evecs.dat vecs 2 nmwiz nmwizvecs 2 nmwizfile
*

*>>> cas.nmd nmwizmask :1-1530&!.H=
*

*>>>
*

*>>>
*

*>>> 3) Read in the "modes" data, calculate principal component projections
*

*>>> from the fit trajectory, do the Kullback-Leibler divergence analysis.
*

*>>>
*

*>>> parm noions.prmtop
*

*>>> trajin fit.nc
*

*>>> # Read in modes
*

*>>> readdata evecs.dat name MyModes
*

*>>> # Now create separate PC projections for each trajectory
*

*>>> projection T1 modes MyModes beg 1 end 2 :1-1530&!.H= start 1 stop
*

*>>> 50000 out T1.dat
*

*>>> projection T2 modes MyModes beg 1 end 2 :1-1530&!.H= start 50001 stop
*

*>>> 100000 out T2.dat
*

*>>> <other projection commands>
*

*>>> # Calculate Kullback-Leibler Divergence vs time for PC histograms
*

*>>> kde T1:1 kldiv T2:1 klout KL-PC.agr bins 400 name AMD-MREMD-1
*

*>>> <other kde commands and analyses>
*

*>>>
*

*>>> Hope this helps,
*

*>>>
*

*>>> -Dan
*

*>>>
*

*>>> On Tue, Jun 1, 2021 at 3:23 PM Sruthi Sudhakar
*

*>>> <sruthisudhakarraji.gmail.com> wrote:
*

*>>> >
*

*>>> > The available memory in the beginning of cpptraj is shown as 32 Gb and
*

*>>> > estimated memory usage is 82gb. I am really confused about this memory
*

*>>> > allotment statistics. I am doing the process in a disk with more than
*

*>>> 3TB
*

*>>> > space. I am not well versed with this technicality. Could someone
*

*>>> please
*

*>>> > exaplain how to overcome the issue in this principal component analysis
*

*>>> > part? I did understand that we have to separate the analysis into 3
*

*>>> phases
*

*>>> > but not clear as to how the inputs should be changed. Kindly advise.
*

*>>> >
*

*>>> > On Tue, 1 Jun 2021 at 7:21 PM, Sruthi Sudhakar <
*

*>>> sruthisudhakarraji.gmail.com>
*

*>>> > wrote:
*

*>>> >
*

*>>> > >
*

*>>> > > Thank you for the reply. Since I am doing this for the first time, I
*

*>>> > > wanted to know if I am supposed to create 3 separate inputs to run
*

*>>> in
*

*>>> > > cpptraj to do the methodology you suggested.
*

*>>> > >
*

*>>> > > Regards,
*

*>>> > > Sruthi
*

*>>> > >
*

*>>> > > On Tue, 1 Jun 2021 at 6:33 PM, Daniel Roe <daniel.r.roe.gmail.com>
*

*>>> wrote:
*

*>>> > >
*

*>>> > >> Hi,
*

*>>> > >>
*

*>>> > >> You are likely running out of memory. This is why your problems
*

*>>> during
*

*>>> > >> clustering went away when you reduced the number of input frames by
*

*>>> a
*

*>>> > >> factor of 5. The solution is to do everything on disk. So instead of
*

*>>> > >> loading all coordinates into memory, separate the principal
*

*>>> component
*

*>>> > >> analysis into three separate phases:
*

*>>> > >>
*

*>>> > >> 1) Create average coordinates.
*

*>>> > >> 2) Rms fit your trajectory to average coordinates, calculate the
*

*>>> > >> covariance matrix, write out the fit trajectory, diagonalize the
*

*>>> > >> matrix and write out the "modes" data (i.e. eigenvectors and
*

*>>> > >> eigenvalues).
*

*>>> > >> 3) Read in the "modes" data, calculate principal component
*

*>>> projections
*

*>>> > >> from the fit trajectory, do the Kullback-Leibler divergence
*

*>>> analysis.
*

*>>> > >>
*

*>>> > >> This way, the most memory you need is to store the covariance
*

*>>> matrix,
*

*>>> > >> modes, and other data of that type. Hope this helps,
*

*>>> > >>
*

*>>> > >> -Dan
*

*>>> > >>
*

*>>> > >> PS - Note that there appears to be a small error in the input you
*

*>>> > >> posted (in pca.in). The 'nmwiz' keyword should be part of the
*

*>>> > >> diagmatrix command, not on a separate line.
*

*>>> > >>
*

*>>> > >>
*

*>>> > >> On Mon, May 31, 2021 at 1:22 PM Sruthi Sudhakar
*

*>>> > >> <sruthisudhakarraji.gmail.com> wrote:
*

*>>> > >> >
*

*>>> > >> > Dear all,
*

*>>> > >> >
*

*>>> > >> > I have been doing pca analysis on an accelerated MD trajectory of
*

*>>> 500ns
*

*>>> > >> > (250,000 frames). I have attached the input file I have used for
*

*>>> the
*

*>>> > >> study.
*

*>>> > >> > The job stops at the createcrd stage. Basically, the job gets
*

*>>> killed at
*

*>>> > >> > 30%. The same happened during the cluster analysis reading every
*

*>>> frames.
*

*>>> > >> > The clustering error was solved when I changed the input to read
*

*>>> every
*

*>>> > >> 5th
*

*>>> > >> > frame. Now since this is repeating in pca analysis, kindly help
*

*>>> > >> regarding
*

*>>> > >> > the same.
*

*>>> > >> >
*

*>>> > >> >
*

*>>> > >> > Regards,
*

*>>> > >> > Sruthi Sudhakar
*

*>>> > >> > _______________________________________________
*

*>>> > >> > AMBER mailing list
*

*>>> > >> > AMBER.ambermd.org
*

*>>> > >> > http://lists.ambermd.org/mailman/listinfo/amber
*

*>>> > >>
*

*>>> > >> _______________________________________________
*

*>>> > >> AMBER mailing list
*

*>>> > >> AMBER.ambermd.org
*

*>>> > >> http://lists.ambermd.org/mailman/listinfo/amber
*

*>>> > >>
*

*>>> > >
*

*>>> > _______________________________________________
*

*>>> > AMBER mailing list
*

*>>> > AMBER.ambermd.org
*

*>>> > http://lists.ambermd.org/mailman/listinfo/amber
*

*>>>
*

*>>> _______________________________________________
*

*>>> AMBER mailing list
*

*>>> AMBER.ambermd.org
*

*>>> http://lists.ambermd.org/mailman/listinfo/amber
*

*>>>
*

*>>
*

_______________________________________________

AMBER mailing list

AMBER.ambermd.org

http://lists.ambermd.org/mailman/listinfo/amber

Received on Tue Jun 15 2021 - 21:00:02 PDT

Date: Wed, 16 Jun 2021 09:27:22 +0530

Could someone please suggest on the above issue, I tried rms fitting the 2

trajectories onto one. But it is being difficult and I get the error as

instead of having total frames (traj1+ traj2) it has only total number of

frames as trajectory 1. A help in refining the script would be really

helpful.

Regards,

Sruthi Sudhakar

On Sun, Jun 13, 2021 at 2:46 AM Sruthi Sudhakar <

sruthisudhakarraji.gmail.com> wrote:

_______________________________________________

AMBER mailing list

AMBER.ambermd.org

http://lists.ambermd.org/mailman/listinfo/amber

Received on Tue Jun 15 2021 - 21:00:02 PDT

Custom Search