Re: [AMBER] Davies Bouldin Index with DME in ptraj

From: Geoffrey Wood <gwood.MIT.EDU>
Date: Mon, 5 Jul 2010 13:15:26 -0400

Dear Jianyin,

Firstly, sorry that I took some time to reply to this. I have again gone through the recipe that you outline below, and again, by hand, I can reproduce the DBI for RMSDs but not for DMEs.
I tried a trajectory that I clustered into 2 clusters so as to circumvent the max in fred(X), and even with a simple test like this I am unable to reproduce the DME numbers.

To calculate the DBI by hand I used the two numbers from the clustering output, i.e.:

#Cluster 0 has average-distance-to-centroid 1.556713
#Cluster 1 has average-distance-to-centroid 1.454760

and then computed the DME, using the formula that you use, from centroid0 to centroid1. The DBI should then just be (1.55+1.45)/(the number I calculated); however, it is way off from what is reported in the clustering output: DBI: 2.34419 versus DBI: 8.6.
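
For reference, the hand calculation described above can be sketched in Python as follows. This is only an illustration of the DBI definition quoted further down in this thread, not ptraj's own code; the function names and the placeholder centroid distance d01 are assumptions made for the example.

```python
def davies_bouldin(spreads, centroid_dist):
    """DBI = mean over clusters X of max over Y != X of (C_X + C_Y) / d_XY.

    spreads[x]          : average distance of cluster x's members to its centroid
    centroid_dist[x][y] : distance between the centroids of clusters x and y
    """
    k = len(spreads)
    total = 0.0
    for x in range(k):
        total += max(
            (spreads[x] + spreads[y]) / centroid_dist[x][y]
            for y in range(k)
            if y != x
        )
    return total / k


def implied_distance(spreads, dbi_value):
    """Two clusters only: the centroid distance that a given DBI implies."""
    return (spreads[0] + spreads[1]) / dbi_value


# Average distance-to-centroid values from the clustering output above.
spreads = [1.556713, 1.454760]

# Hypothetical centroid-to-centroid distance; replace it with the
# hand-computed DME (or RMSD) between centroid0 and centroid1.
d01 = 1.0
dbi = davies_bouldin(spreads, [[0.0, d01], [d01, 0.0]])
```

With only two clusters the max drops out, so the DBI is simply (C0 + C1)/d01 and the relation can be inverted. Feeding the two DBI values quoted above into implied_distance gives centroid separations of about 1.28 and 0.35, which can be compared directly against the hand-computed DME.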

Again, I reiterate that when I do the same thing with RMSDs I have no problem. This suggests that perhaps my DME formula is incorrect. However, I assure you that I am using the one from the 1994 Torda paper, which is the one you quote below.
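
To make the comparison concrete, here is a minimal Python sketch of two of the DME normalisations mentioned in the reply quoted below: the squared differences divided by C(N,2), and the same sum taken over all N^2 ordered pairs. The function names are hypothetical and the coordinates are assumed to be plain lists of (x, y, z) tuples; this is not taken from ptraj.

```python
import math
from itertools import combinations


def pair_distances(coords):
    """Interatomic distances for all atom pairs i < j in one structure."""
    return [math.dist(coords[i], coords[j])
            for i, j in combinations(range(len(coords)), 2)]


def dme_pairs(x, y):
    """sqrt( sum_{i<j} (dX_ij - dY_ij)^2 / C(N,2) )."""
    dx, dy = pair_distances(x), pair_distances(y)
    sq = sum((a - b) ** 2 for a, b in zip(dx, dy))
    return math.sqrt(sq / len(dx))


def dme_all_pairs(x, y):
    """Same squared sum over all ordered pairs (i, j), divided by N^2."""
    n = len(x)
    dx, dy = pair_distances(x), pair_distances(y)
    sq = sum((a - b) ** 2 for a, b in zip(dx, dy))
    # Each i < j pair occurs twice among ordered pairs; the i == j terms are 0.
    return math.sqrt(2.0 * sq / n ** 2)
```

The two variants differ only by a constant factor of sqrt((N-1)/N), so for any realistic number of atoms the choice of normalisation alone changes the DME very little.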

Is there something obvious that I'm missing?

Thanks in advance,


On May 10, 2010, at 7:10 PM, Jianyin Shao wrote:

> Hi Geoffrey,
> The calculation of DBI when DME is used is the same as the calculation of DBI
> when rmsd is used. The DBI is defined as the average, for all clusters X, of
> fred, where fred(X) = max, across other clusters Y, of (Cx + Cy)/dXY. Here
> Cx is the average distance from points in X to the centroid, similarly Cy,
> and dXY is the distance between cluster centroids:
> DBI = average(fred(X))
> fred(X) = max((C(X)+C(Y))/distance(X,Y))
> C(X) = average(distance(x, x-centroid))
> C(Y) = average(distance(y, y-centroid))
> distance(X,Y) = distance(x-centroid, y-centroid)
> Since you can reproduce the DBI value with the rmsd metric, I guess the problem
> does not lie in the DBI calculation. Here are two possible reasons why the
> DME calculation might go wrong.
> 1. The centroid of a cluster is implicitly aligned when the rmsd metric is
> used. There is no alignment before calculating the centroid when the DME
> metric is used.
> 2. When calculating DME, we used the formula:
> DME = sqrt(sum((distance(X,i,j)-distance(Y,i,j))^2)/C(N,2)), in which
> distance(X,i,j) is the distance between atom i and atom j in structure X.
> C(N,2) is N*(N-1)/2. But I've also seen formulas like this:
> DME = sqrt(sum(distance(X,i,j)-distance(Y,i,j))/C(N,2)) (sum up the
> distance error, not the square of distance error) or
> DME = sqrt(sum((distance(X,i,j)-distance(Y,i,j))^2)/N^2) (sum up all the
> pairs, including (i,i), which should be 0, then divided by the number of
> total pairs, N^2)
> I don't know how you calculate the DME. Hope this explanation helps.
> Best,
> Jianyin Shao
> On Mon, May 3, 2010 at 2:48 PM, Geoffrey Wood wrote:
>> Dear Amber Users,
>> I have been looking carefully at the clustering tools in ptraj, in
>> particular the cluster validity testing using the Davies-Bouldin index
>> (DBI). If my
>> set measure is RMSD then I am able to reproduce the DBI when I calculate it
>> "by hand" using ptraj, i.e. taking the RMSD of each centroid with the
>> others and applying the DBI formula. However, when I take the set measure
>> to be the DME then I am unable to reproduce the values that ptraj
>> calculates. In this case ptraj doesn't have a DME command but it is easy
>> enough to use the distance command and calculate this matrix by hand. My
>> question is: how is ptraj calculating the DBI when the set measure is DME?
>> Thanks in advance, Geoffrey Wood
>> _______________________________________________
>> AMBER mailing list

Received on Mon Jul 05 2010 - 10:30:03 PDT