Re: [AMBER] Trouble understanding DBSCAN clustering algorithm from Juan Eiros Zamora on 2015-05-07 (Amber Archive May 2015)

From: Juan Eiros Zamora <j.eiros-zamora14.imperial.ac.uk>
Date: Thu, 07 May 2015 10:29:34 +0100

Thanks Christina and Daniel for your comments,

I have one last question about the following comment:

On 06/05/15 17:49, Daniel Roe wrote:
> What I would do in your case is save the pairwise distance matrix with 'savepairdist' so
> I can use it over and over, cluster at least 3-4 times using different
> values of K and epsilon, then compare the results using metrics like
> DBI, pseudo-F, silhouette etc.
>

I have read the cluster documentation in the manual, but I'm failing to
understand how the pairwise distance matrix is used in the clustering. In
one of the examples, a distance between two residues is first calculated
with the "distance" command, and the clustering is then based on that
distance (if I'm not mistaken), like so:

Example: cluster on a specific distance:
distance endToEnd :1 :255
cluster data endToEnd clusters 10 epsilon 3.0 summary summary.dat info info.dat

If I want to cluster only a specific region of my system (residues 232
to 248, for instance), I should not follow the example above, right? So
the first time, the command would be something like this:

cluster dbscan minpoints 4 epsilon 3.5 rms savepairdist pairdist matrixfile

And then for the subsequent clustering trials I should use

cluster dbscan minpoints 4 epsilon X rms loadpairdist matrixfile   # change X 3 or 4 times and compare
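
To make the "compute the matrix once, then reuse it for every trial" idea concrete, here is a minimal sketch in Python. It is not cpptraj's implementation: the toy 2-D points, the Euclidean distance (standing in for pairwise RMSD between frames), and the helper names `pairwise_distances` and `dbscan` are all assumptions made for illustration. The point it tries to show is that this DBSCAN variant only ever reads the precomputed matrix, so changing epsilon does not require recomputing any distances.

```python
import math

def pairwise_distances(points):
    """Symmetric distance matrix; Euclidean here as a stand-in for RMSD."""
    n = len(points)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(points[i], points[j])
            dist[i][j] = dist[j][i] = d
    return dist

def dbscan(dist, epsilon, minpoints):
    """Return one label per point: 0, 1, ... for clusters, -1 for noise.

    Only the precomputed matrix `dist` is consulted, never the points.
    """
    n = len(dist)
    UNVISITED, NOISE = -2, -1
    labels = [UNVISITED] * n
    cluster = -1
    for i in range(n):
        if labels[i] != UNVISITED:
            continue
        neighbors = [j for j in range(n) if dist[i][j] <= epsilon]
        if len(neighbors) < minpoints:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point of the current cluster
            if labels[j] != UNVISITED:
                continue
            labels[j] = cluster
            j_neighbors = [k for k in range(n) if dist[j][k] <= epsilon]
            if len(j_neighbors) >= minpoints:
                queue.extend(j_neighbors)  # j is a core point: keep expanding
    return labels

# Toy data (hypothetical): two tight groups of four points and one outlier.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (50, 50)]

dist = pairwise_distances(points)  # computed once ("savepairdist")

# Reused for every trial ("loadpairdist"); only epsilon changes.
results = {eps: dbscan(dist, eps, 4) for eps in (1.5, 20.0)}
for eps, labels in results.items():
    print(eps, labels)
```

In this sketch the matrix depends only on the frames and the distance metric, which is why the same `dist` can be passed to each trial unchanged.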

Does the pairwise matrix that is saved vary depending on the clustering
algorithm that is used? If not, could I use the same pairwise distance
matrix to try out different algorithms?

Thank you for your time,

Juan

Received on Thu May 07 2015 - 02:30:03 PDT