I have been trying to compute the adjusted Rand index as defined by Hubert and Arabie, but could not work out how to define a partition object, as in my request yesterday.
I tried to work around the problem with the fpc package, using my original data:
mat <- matrix( c( 6, 7, 8,2, 3,4,12,14,14,
                 14,15,13,3, 1,2, 3, 4, 2,
                 15, 3,10,5,11,7,13, 6, 1,
                 15, 4,10,6,12,8,12, 7, 1), ncol=9, byrow=TRUE )
rownames(mat) <- paste("v", 1:4, sep="" )
## and the given partitions:
p1 <- c(1,1,1,2,2,2,3,3,3)
p2 <- c(1,1,1,3,2,2,3,3,2)
p3 <- c(1,2,1,3,1,3,1,3,2)
p4 <- c(1,2,1,3,1,3,1,3,2)
library(fpc)  # provides cluster.stats()
cluster.stats(d=dist(mat), clustering=p1, alt.clustering=p2)
## just gives
Error in as.dist(dmat[clustering == i, clustering == i]) :
(subscript) logical subscript too long
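One possible reading of this error, offered as an assumption rather than a confirmed diagnosis: dist() computes distances between the rows of its argument, so dist(mat) describes the 4 variables v1..v4, while p1 and p2 each label 9 objects. A minimal base R sketch that only compares the sizes involved (the names n_by_rows and n_by_cols are made up for illustration):

```r
# Data and first partition as given in the post
mat <- matrix(c( 6, 7, 8, 2, 3, 4, 12, 14, 14,
                14, 15, 13, 3, 1, 2, 3, 4, 2,
                15, 3, 10, 5, 11, 7, 13, 6, 1,
                15, 4, 10, 6, 12, 8, 12, 7, 1), ncol = 9, byrow = TRUE)
p1 <- c(1, 1, 1, 2, 2, 2, 3, 3, 3)

# dist() works row-wise: 4 rows give distances among 4 objects
n_by_rows <- attr(dist(mat), "Size")     # 4

# Transposing first gives distances among the 9 columns instead
n_by_cols <- attr(dist(t(mat)), "Size")  # 9

# The partition vector has one label per clustered object
length(p1)                               # 9
```

If that reading is right, passing dist(t(mat)) as 'd' would at least make the sizes agree; whether that is the intended distance for these data is a separate question.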
I think I don't understand the use of 'd' here. How can I calculate the corrected Rand matrix:
(  .000   .407  -.071  -.071)
(  .407   .000  -.071  -.071)
( -.071  -.071   .000  1.000)
( -.071  -.071  1.000   .000)
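Since the adjusted Rand index depends only on the two label vectors, the matrix above can also be sketched in base R without any distance object. The following is a minimal illustration, not the fpc or clue implementation; adj_rand, parts and ari are hypothetical names, and the formula is the usual contingency-table form of the Hubert and Arabie index.

```r
# Adjusted Rand index from the contingency table of two label vectors
adj_rand <- function(a, b) {
  tab      <- table(a, b)
  sum_ij   <- sum(choose(tab, 2))            # pairs together in both partitions
  sum_a    <- sum(choose(rowSums(tab), 2))   # pairs together in a
  sum_b    <- sum(choose(colSums(tab), 2))   # pairs together in b
  expected <- sum_a * sum_b / choose(length(a), 2)
  (sum_ij - expected) / ((sum_a + sum_b) / 2 - expected)
}

# The four partitions from the post
parts <- list(p1 = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
              p2 = c(1, 1, 1, 3, 2, 2, 3, 3, 2),
              p3 = c(1, 2, 1, 3, 1, 3, 1, 3, 2),
              p4 = c(1, 2, 1, 3, 1, 3, 1, 3, 2))

# Pairwise matrix of adjusted Rand indices
ari <- sapply(parts, function(x) sapply(parts, function(y) adj_rand(x, y)))
round(ari, 3)
```

The off-diagonal entries agree with the values listed above (.407, -.071 and 1.000). The diagonal comes out as 1 here, because each partition agrees perfectly with itself, whereas the matrix above shows .000 there.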
Does the clue package help me here?
Also, does anyone know whether a VS-KM algorithm (Variable Selection Heuristic for K-Means Clustering) is implemented in R? Unfortunately, my searches did not turn up any entries.
Thank you for your help and support.