Re: [AMBER] Cluster RMSD

From: Daniel Roe <daniel.r.roe.gmail.com>
Date: Tue, 6 Jul 2021 09:14:45 -0400

Hi,

On Thu, Jul 1, 2021 at 1:36 AM Jenny 148 <jenny.rs140.gmail.com> wrote:
> The clusters are very randomly distributed (the major cluster being just
> 30% in one of the systems and around 17-20% in the others). But when I
> superimpose the representative structures of each of the systems in PyMOL,
> it seems like they are not very different from the major cluster. I would

This is entirely possible. You specified a target epsilon and number
of clusters, so clustering will stop whenever one or the other is
reached. If all of your structures are very similar but you tell
cpptraj to give you 10 clusters, then hierarchical agglomerative
clustering will stop when there are 10 clusters left.
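
To make the stopping rule concrete, here is a toy Python sketch. It is
not cpptraj's actual code: the function name 'hier_agglo', the choice
of average linkage, and the made-up 1D "frames" are all assumptions
for illustration only.

```python
def hier_agglo(dist, n_clusters, epsilon):
    """Toy average-linkage agglomerative clustering on a distance matrix.

    Merging stops as soon as EITHER target is reached: only `n_clusters`
    clusters are left, or the two closest clusters are more than
    `epsilon` apart.
    """
    clusters = [[i] for i in range(len(dist))]  # every frame starts alone

    def avg_link(a, b):
        # Average of all pairwise distances between members of a and b.
        return sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # Find the closest pair of clusters.
        d, i, j = min(
            (avg_link(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        if d > epsilon:  # epsilon was reached before the cluster count
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Ten made-up "frames" that all lie within 0.9 of each other.
coords = [0.1 * k for k in range(10)]
dist = [[abs(a - b) for b in coords] for a in coords]

# Generous epsilon, so the requested cluster count is what stops merging.
by_count = hier_agglo(dist, n_clusters=5, epsilon=3.0)
# Tiny epsilon: merging stops before any two clusters are joined.
by_epsilon = hier_agglo(dist, n_clusters=1, epsilon=0.05)
```

In the first call the frames are all close together, yet the result
still has as many clusters as were asked for, which is the situation
described above.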

Long-time readers of the mailing list are probably tired of me saying
this by now, but clustering is really more of an art form than an
exact science. By that, I mean that clustering input that "works" for
one simulation may not (and in fact will likely not) work on another
simulation - it all depends on how the conformations in your ensemble
are distributed. When you perform cluster analysis you need to cluster
multiple times with different settings and look at the resulting
metrics (the Davies-Bouldin index (DBI), pseudo-F, etc.) to gauge
whether your clustering settings are appropriate. As always, I
recommend reading (if you haven't already) what I consider to be the
definitive paper on clustering MD simulations, by Shao, Cheatham, et
al.: https://pubs.acs.org/doi/10.1021/ct700119m
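
For readers who have not met these two metrics, the following Python
sketch shows roughly what they measure. It uses the common textbook
definitions on made-up 2D points with Euclidean distances (an
assumption; for MD the distance would typically be RMSD, and the
values a clustering program reports may differ in detail). The
function names are hypothetical. A lower DBI and a higher pseudo-F
generally indicate more compact, better separated clusters.

```python
import math


def centroid(points):
    # Coordinate-wise mean of a list of points.
    dim = len(points[0])
    return [sum(p[k] for p in points) / len(points) for k in range(dim)]


def dbi(clusters):
    # Davies-Bouldin index: for each cluster, take the worst-case ratio
    # (spread_i + spread_j) / centroid_distance_ij, then average.
    cents = [centroid(c) for c in clusters]
    spread = [
        sum(math.dist(p, cen) for p in pts) / len(pts)
        for pts, cen in zip(clusters, cents)
    ]
    k = len(clusters)
    worst = [
        max(
            (spread[i] + spread[j]) / math.dist(cents[i], cents[j])
            for j in range(k)
            if j != i
        )
        for i in range(k)
    ]
    return sum(worst) / k


def pseudo_f(clusters):
    # Pseudo-F: between-cluster variance over within-cluster variance,
    # each divided by its degrees of freedom.
    all_pts = [p for c in clusters for p in c]
    n, k = len(all_pts), len(clusters)
    grand = centroid(all_pts)
    ssb = sum(len(c) * math.dist(centroid(c), grand) ** 2 for c in clusters)
    ssw = sum(math.dist(p, centroid(c)) ** 2 for c in clusters for p in c)
    return (ssb / (k - 1)) / (ssw / (n - k))


# Two well-separated made-up blobs.
blob_a = [(0, 0), (0, 1), (1, 0), (1, 1)]
blob_b = [(10, 10), (10, 11), (11, 10), (11, 11)]

sensible = [blob_a, blob_b]                                  # one cluster per blob
mixed = [blob_a[:2] + blob_b[:2], blob_a[2:] + blob_b[2:]]   # blobs split badly

scores = {
    "sensible": (dbi(sensible), pseudo_f(sensible)),
    "mixed": (dbi(mixed), pseudo_f(mixed)),
}
```

Comparing the two entries in 'scores' shows the sensible partition
scoring a lower DBI and a higher pseudo-F than the mixed one, which is
the kind of comparison to make between clustering runs.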

> like to know if it is possible to obtain the exact RMSD between the 9
> clusters with respect to the first cluster out of the total 10 clusters
> generated?

Sure - a good rough estimate of this is to do a 2D RMS on the
representative structures of each cluster. You can save a trajectory
consisting of only the cluster representative frames with the
'singlerepout' keyword (and associated keywords), then use the '2drms'
command to get the RMSD of each frame to each other frame.
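
If it helps to see what a 2D RMS matrix contains, here is a small
Python sketch. It is only an illustration, not what cpptraj runs
internally: the coordinates are made up, the function names are
hypothetical, and it assumes the structures are already superimposed
(no best-fit step is performed here).

```python
import math


def rmsd(a, b):
    # Root-mean-square deviation between two equally sized sets of
    # atom coordinates, with no superposition applied.
    return math.sqrt(sum(math.dist(p, q) ** 2 for p, q in zip(a, b)) / len(a))


def rmsd_matrix(frames):
    # Entry [i][j] is the RMSD of frame i to frame j.
    n = len(frames)
    return [[rmsd(frames[i], frames[j]) for j in range(n)] for i in range(n)]


# Three made-up "representative structures" with two atoms each.
reps = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)],  # shifted by 1 along z
    [(0.0, 3.0, 0.0), (1.0, 3.0, 0.0)],  # shifted by 3 along y
]

matrix = rmsd_matrix(reps)
# Row 0 holds the RMSD of every representative to the first one.
to_first = matrix[0]
```

Reading off a single row (or column) of such a matrix gives the RMSD
of every representative relative to one chosen representative.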

> If we could specify an RMS value of the mask we are using according to
> which the clustering is done?

Sorry, I'm not sure what you're asking here. Could you rephrase it or
give an example?

Hope some of this helps,

-Dan

> --
> Jenny R.S
> _______________________________________________
> AMBER mailing list
> AMBER.ambermd.org
> http://lists.ambermd.org/mailman/listinfo/amber

Received on Tue Jul 06 2021 - 06:30:02 PDT