Re: [AMBER] Obtaining representative structures from clustering

From: Jonathan Gough <jonathan.d.gough.gmail.com>
Date: Fri, 29 Aug 2014 11:40:10 -0400

As always, very helpful. Thanks Dan!
On Aug 29, 2014 11:23 AM, "Daniel Roe" <daniel.r.roe.gmail.com> wrote:

> Hi,
>
> On Fri, Aug 29, 2014 at 8:56 AM, Jonathan Gough
> <jonathan.d.gough.gmail.com> wrote:
> > When running a cluster command, the output lists "representative
> > frames" for each cluster. I am guessing that I could use those numbers
> > and the original trajectories to generate a .rst file and therefore a
> > .pdb. If so, what is the best way to do that, especially if I have
> > more than one trajectory as input?
>
> Representative frame numbering is relative to the total number of
> frames read in. So if you have e.g. two trajectories of 100 frames
> each that you clustered on, and your representative frames are 90,
> 120, and 150, you can get them by using 'trajout' with the
> 'onlyframes' keyword, e.g.
>
> trajin traj1.nc
> trajin traj2.nc
> trajout rep.rst7 restart onlyframes 90,120,150
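
To see which input file a given representative frame came from, the global frame number can be mapped back to a trajectory and a local frame. A minimal Python sketch, assuming frame numbers are 1-based and trajectories are counted in 'trajin' order; the helper name locate_frame is hypothetical and not part of cpptraj:

```python
from bisect import bisect_left
from itertools import accumulate


def locate_frame(global_frame, frame_counts):
    """Map a 1-based global frame number to (trajectory index, local frame).

    frame_counts lists the number of frames in each trajectory, in the
    order the trajectories were read. The returned trajectory index is
    0-based; the returned local frame is 1-based (an assumption here).
    """
    if not frame_counts:
        raise ValueError("frame_counts must not be empty")
    # Global number of the last frame of each trajectory, e.g. [100, 200].
    ends = list(accumulate(frame_counts))
    if not 1 <= global_frame <= ends[-1]:
        raise ValueError(f"frame {global_frame} is outside 1..{ends[-1]}")
    # First trajectory whose last global frame is >= the requested frame.
    traj = bisect_left(ends, global_frame)
    first_minus_one = ends[traj] - frame_counts[traj]
    return traj, global_frame - first_minus_one


if __name__ == "__main__":
    # Two trajectories of 100 frames each, as in the example above.
    for frame in (90, 120, 150):
        print(frame, locate_frame(frame, [100, 100]))
```

With the two 100-frame trajectories from the example, frame 90 falls in traj1.nc, while frames 120 and 150 are frames 20 and 50 of traj2.nc.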
>
> > In looking through the manual (section 28.13.1) I now see two commands
> > that might have performed the task for me when I ran the calculation
> > (singlerepout and repout).
> > Can someone provide some insight on their use and the difference
> > between the two?
>
> I think the manual is fairly clear here:
>
> "repout <repprefix>: Write representative frames to separate files
> named <repprefix>.X.<ext>, where X is the cluster number and <ext> is
> a format-specific filename extension.
> singlerepout <trajfilename> Write all representative frames to single
> trajectory named <trajfilename>."
>
> So the only difference is that 'singlerepout' puts all of the
> representatives in a single file, while 'repout' puts them in separate
> files.
>
> > Also, can these commands be used after a clustering calculation has
> > been performed (e.g. saving the processing time of re-running the
> > clustering algorithm)?
>
> Not yet, although that support is planned. Your best bet is to use
> 'trajout onlyframes' as mentioned above. However, note that you can
> save some time when re-running the same clustering calculation
> multiple times by saving and reusing the pairwise distance matrix via
> the 'savepairdist' and 'loadpairdist' keywords. For most purposes it's
> enough to just add the 'loadpairdist' keyword; this will trigger
> saving the pairwise distance matrix file on the first run and
> re-reading it on subsequent runs. Be careful when using this, though,
> because if you change your distance metric (e.g. by changing your rms
> mask) the matrix will no longer be valid. Cpptraj can detect when the
> number of frames in the matrix changes and recalculate, but currently
> has no way to detect subtler changes to the matrix.
>
> Hope this helps,
>
> -Dan
>
> >
> >
> > If it's helpful - here are the cpptraj commands I used to generate my
> > clustering results.
> >
> > parm nowat.dimer.prmtop
> > trajin prod01-24-dimer.nc
> > trajin prod25-30-dimer.nc
> > cluster C1 :1-164 clusters 10 sieve 10 epsilon 4.0 out cnumvtime.dat
> > summary avg.summary.dat clusterout cl clusterfmt netcrd
> > run
> > quit
> > _______________________________________________
> > AMBER mailing list
> > AMBER.ambermd.org
> > http://lists.ambermd.org/mailman/listinfo/amber
>
>
>
> --
> -------------------------
> Daniel R. Roe, PhD
> Department of Medicinal Chemistry
> University of Utah
> 30 South 2000 East, Room 307
> Salt Lake City, UT 84112-5820
> http://home.chpc.utah.edu/~cheatham/
> (801) 587-9652
> (801) 585-6208 (Fax)
Received on Fri Aug 29 2014 - 09:00:03 PDT