Re: [AMBER] Obtaining representative structures from clustering

From: Daniel Roe <daniel.r.roe.gmail.com>
Date: Fri, 29 Aug 2014 09:22:51 -0600

Hi,

On Fri, Aug 29, 2014 at 8:56 AM, Jonathan Gough
<jonathan.d.gough.gmail.com> wrote:
> When running a cluster command, the output lists a "representative frame"
> for each cluster. I am guessing that I could use that number and the
> original trajectories to generate a .rst file and therefore a .pdb.
> If so, how does one (what is the best way to) do that (especially if I
> have more than one trajectory as the input)?

Representative frame numbering is relative to the total number of
frames read in. So if you have e.g. two trajectories of 100 frames
each that you clustered on, and your representative frames are 90,
120, and 150, you can get them by using 'trajout' with the
'onlyframes' keyword, e.g.

trajin traj1.nc
trajin traj2.nc
trajout rep.rst7 restart onlyframes 90,120,150
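
If you also want to know which input trajectory each representative
came from, the arithmetic can be sketched in a few lines of Python.
This is only an illustration, not part of cpptraj: the function name
'locate_frame' is hypothetical, and it assumes 1-based frame numbers
and that you supply the number of frames actually read from each
'trajin' line, in order.

```python
from bisect import bisect_left
from itertools import accumulate


def locate_frame(global_frame, frames_per_traj):
    """Map a 1-based global frame number to (trajectory index, local frame).

    frames_per_traj lists how many frames were read from each trajectory,
    in the order the trajectories were loaded (an assumption of this sketch).
    The returned trajectory index is 0-based; the local frame is 1-based.
    """
    # Cumulative frame count at the end of each trajectory, e.g. [100, 200].
    ends = list(accumulate(frames_per_traj))
    if not ends or not 1 <= global_frame <= ends[-1]:
        raise ValueError("global frame is outside the frames that were read")
    # First trajectory whose last global frame is >= the requested frame.
    idx = bisect_left(ends, global_frame)
    first_minus_one = ends[idx] - frames_per_traj[idx]
    return idx, global_frame - first_minus_one


if __name__ == "__main__":
    counts = [100, 100]  # two trajectories of 100 frames each
    for rep in (90, 120, 150):
        traj, local = locate_frame(rep, counts)
        print(f"global frame {rep}: trajectory {traj + 1}, frame {local}")
```

For the example above, frame 90 falls in the first trajectory (frame
90), while 120 and 150 fall in the second (frames 20 and 50).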

> In looking through the manual (section 28.13.1) I now see two commands that
> might have performed the task for me when I ran the calculation
> (singlerepout and repout).
> Can someone provide some insight on their use and the difference between
> the two?

I think the manual is fairly clear here:

"repout <repprefix>: Write representative frames to separate files
named <repprefix>.X.<ext>, where X is the cluster number and <ext> is
a format-specific filename extension.
singlerepout <trajfilename> Write all representative frames to single
trajectory named <trajfilename>."

So the only difference is that 'singlerepout' puts all of the
representatives in a single file, while 'repout' puts them in separate
files.

> Also, can these commands be used after a clustering calculation has been
> performed (e.g. to save the processing time of re-running the clustering
> algorithm)?

Not yet, although that support is planned. Your best bet is to use
'trajout onlyframes' as mentioned above. However, note that you can
save some time when re-running the same clustering calculation
multiple times by saving and reusing the pair-wise distance matrix via
the 'savepairdist' and 'loadpairdist' keywords. For most purposes it's
enough to just add the 'loadpairdist' keyword; this saves the pairwise
distance matrix to a file on the first run and re-reads it on
subsequent runs. Be careful when using this, though: if you change
your distance metric (e.g. by changing your rms mask), the saved
matrix will no longer be valid. Cpptraj can detect when the number of
frames in the matrix changes and recalculate, but it currently has no
way to detect subtler changes such as a different mask.

Hope this helps,

-Dan

>
>
> If it's helpful - here are the cpptraj commands I used to generate my
> clustering results.
>
> parm nowat.dimer.prmtop
> trajin prod01-24-dimer.nc
> trajin prod25-30-dimer.nc
> cluster C1 :1-164 clusters 10 sieve 10 epsilon 4.0 out cnumvtime.dat
> summary avg.summary.dat clusterout cl clusterfmt netcrd
> run
> quit
> _______________________________________________
> AMBER mailing list
> AMBER.ambermd.org
> http://lists.ambermd.org/mailman/listinfo/amber



-- 
-------------------------
Daniel R. Roe, PhD
Department of Medicinal Chemistry
University of Utah
30 South 2000 East, Room 307
Salt Lake City, UT 84112-5820
http://home.chpc.utah.edu/~cheatham/
(801) 587-9652
(801) 585-6208 (Fax)
Received on Fri Aug 29 2014 - 08:30:02 PDT