[AMBER] Kmeans-clustering : AvgDist query

From: Bala subramanian <bala.biophysics.gmail.com>
Date: Wed, 18 Jan 2017 11:51:04 +0100

Friends,

I tried K-means clustering on a small data set (10 frames, with RMSD as the
similarity measure). Since the cluster action produces a binary pairdist
file, I obtained the RMSD matrix with the rms2d action (attached).

The K-means run was set up as follows:

rms2d * out 2drms-DUM.txt

cluster kmeans clusters 2 maxit 1000 rms * savepairdist pairdist PAIRD-BIN.dat summary test.sum summarysplit test.sumspt info test.info out test.cnvtime

*The summary file*

#Cluster  Frames  Frac   *AvgDist*  Stdev  Centroid  AvgCDist
0         6       0.600  *2.756*    1.446  3         5.771
1         4       0.400  *1.205*    0.458  10        5.771

*The info file*

#Clustering: 2 clusters 10 frames
#Cluster 0 has average-distance-to-centroid 1.918335
#Cluster 1 has average-distance-to-centroid 0.711903
#DBI: 0.455787
#pSF: 23.924226
#Algorithm: Kmeans nclusters 2 maxit 1000


*The cnvtime file*

#Frame Cnum (frames 1 to 6 belong to cluster #0 and frames 7 to 10 to cluster #1)

1 0 , 2 0 , 3 0 , 4 0 , 5 0 , 6 0 , 7 1 , 8 1 , 9 1 , 10 1

-------------------------------------------------------------------

Q1) From the RMSD matrix I calculated the AvgDist value over all the points
in clusters #0 and #1, and I get 3.749 (for #0) and 3.918 (for #1).

But cpptraj reports values of 2.756 and 1.205 (see above). Looking at the
RMSD matrix, the AvgDist values seem likely to be greater than 2.7 and 1.2.
Am I missing something in understanding the cpptraj AvgDist?
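
The check described in Q1 can be sketched in Python as below. It is only a
sketch under one assumption that this message does not confirm: that the 45
values following the seven 0.000000 entries in the hexdump output further
down are the upper triangle (frame i < frame j) of the 10x10 pair-distance
matrix, stored row by row. The names values, dist and avg_pairwise are made
up for this illustration and are not part of cpptraj.

```python
from itertools import combinations

# The 45 non-zero values from the hexdump output in this message, in the
# order printed. ASSUMPTION: they are the upper triangle of the 10x10
# pair-distance matrix, row by row: (1,2), (1,3), ..., (1,10), (2,3), ...
values = [
    3.333752, 1.000338, 1.891641, 4.172873, 1.580285, 7.259061, 5.991429,
    7.105317, 6.895642,
    3.820035, 3.972430, 1.370469, 3.610308, 4.069707, 2.705386, 4.098798,
    3.704278,
    0.992006, 4.522664, 1.088604, 7.753875, 6.499359, 7.547778, 7.335218,
    4.575821, 0.940676, 7.760054, 6.560380, 7.492657, 7.288205,
    4.467124, 4.149098, 2.685882, 4.407049, 3.787850,
    7.268991, 6.144238, 6.942125, 6.818217,
    1.503415, 0.979665, 0.599390,
    1.988932, 1.306563,
    0.851417,
]

n_frames = 10
# combinations() yields the frame pairs in the same row-by-row order.
pairs = list(combinations(range(1, n_frames + 1), 2))
dist = dict(zip(pairs, values))


def avg_pairwise(frames):
    """Mean distance over all unique pairs of frames in one cluster."""
    d = [dist[p] for p in combinations(sorted(frames), 2)]
    return sum(d) / len(d)


# Cluster membership as listed in the cnvtime file.
print(f"cluster 0: {avg_pairwise(range(1, 7)):.3f}")
print(f"cluster 1: {avg_pairwise(range(7, 11)):.3f}")
```

With the values as listed, this gives 2.756 for frames 1 to 6 and 1.205 for
frames 7 to 10, which are the AvgDist values in the summary file. If the
assumption holds, AvgDist would be the mean over the unique frame pairs in
the saved pair distances, and the 3.749 and 3.918 obtained from the 2drms
matrix would point to that matrix differing from the pair distances used for
clustering.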

Q2) Is there a way (in cpptraj) to dump the pairdist file in ASCII format? I
converted the binary pairdist file to ASCII using

hexdump -v -e '10/4 "%06f "' -e '"\n"' PAIRD-BIN.dat > test.dat

and I get something like the output pasted below. What do these trailing
zeros mean?

0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 3.333752
1.000338 1.891641 4.172873 1.580285 7.259061 5.991429 7.105317 6.895642
3.820035 3.972430 1.370469 3.610308 4.069707 2.705386 4.098798 3.704278
0.992006 4.522664 1.088604 7.753875 6.499359 7.547778 7.335218 4.575821
0.940676 7.760054 6.560380 7.492657 7.288205 4.467124 4.149098 2.685882
4.407049 3.787850 7.268991 6.144238 6.942125 6.818217 1.503415 0.979665
0.599390 1.988932 1.306563
0.851417
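
For reference, the same conversion can be sketched in Python. This mirrors
what the hexdump command above does and nothing more: it treats the whole
file as native-byte-order 4-byte floats, which is an assumption about the
file and not a documented description of the cpptraj pairdist format. The
function name dump_floats is made up for this illustration.

```python
import struct


def dump_floats(path, per_line=10):
    """Return the file contents as text, read as native-order 4-byte floats.

    ASSUMPTION: every 4-byte word is interpreted as a float, exactly as the
    hexdump command does; any header in the file is not treated specially.
    """
    with open(path, "rb") as fh:
        raw = fh.read()
    n = len(raw) // 4  # number of complete 4-byte words
    vals = struct.unpack(f"={n}f", raw[: 4 * n])
    lines = [
        " ".join(f"{v:.6f}" for v in vals[i:i + per_line])
        for i in range(0, n, per_line)
    ]
    return "\n".join(lines)


# Usage (file name taken from the cluster command in this message):
#   print(dump_floats("PAIRD-BIN.dat"))
```

The dump above holds 52 words: seven 0.000000 entries followed by 45 values,
and 45 = 10*9/2 is the number of unique pairs among 10 frames. One possible
reading, not confirmed here, is that the zeros are header bytes that do not
represent floats.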

Thanks,
Bala






-- 
C. Balasubramanian



_______________________________________________
AMBER mailing list
AMBER.ambermd.org
http://lists.ambermd.org/mailman/listinfo/amber

Received on Wed Jan 18 2017 - 03:00:02 PST