[R] Clustering and Rand Index

From: Mark Hempelmann <neo27_at_t-online.de>
Date: Sat 07 Jan 2006 - 23:38:30 EST

Dear WizaRds,

I am trying to compute the (adjusted) Rand index in order to understand the variable selection heuristic (VS-KM) of Brusco and Cradit (Psychometrika 66, No. 2, pp. 249-270, 2001).
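For reference, the adjusted Rand index in the Hubert and Arabie form can be computed in base R from the contingency table of two label vectors. The following is only a minimal sketch: the function name `adjusted_rand` and the label vectors `u` and `v` are made up for illustration and are not part of clue or of the VS-KM paper.

```r
# Sketch of the adjusted Rand index (Hubert-Arabie form), base R only.
# `adjusted_rand`, `u` and `v` are hypothetical names used for illustration.
adjusted_rand <- function(a, b) {
  stopifnot(length(a) == length(b))
  tab <- table(a, b)                        # contingency table n_ij
  n <- sum(tab)
  sum_ij <- sum(choose(tab, 2))             # pairs placed together in both partitions
  sum_a  <- sum(choose(rowSums(tab), 2))    # pairs placed together in partition a
  sum_b  <- sum(choose(colSums(tab), 2))    # pairs placed together in partition b
  expected  <- sum_a * sum_b / choose(n, 2) # expected index under random labelling
  max_index <- (sum_a + sum_b) / 2
  # Note: the result is NaN when both partitions consist of a single cluster.
  (sum_ij - expected) / (max_index - expected)
}

u <- c(1, 1, 1, 2, 2, 2, 3, 3, 3)
v <- c(1, 1, 2, 2, 2, 3, 3, 3, 3)
adjusted_rand(u, u)   # identical partitions give 1
adjusted_rand(u, v)   # partial agreement gives a value below 1
```

The index depends only on which objects share a cluster, so relabelling the clusters of one partition does not change the result.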

Unfortunately, I am unable to correctly use cl_ensemble and cl_agreement (package: clue). Here is what I am trying to do:


## Let p1..p4 be four partitions of the kind


Each object within the partitions is assigned to cluster 1,2,3 respectively. Now I have to create a cl_ensemble object, so that I can calculate the Rand index:

ens <- cl_ensemble(list=c(p1,p2,p3,p4))

which only leads to
"Ensemble elements must be all partitions or all hierarchies."

Although I understand that p1..p4 are vectors in this example, they represent the partitions I want to use. I don't know how to create the necessary partition object in order to transform it into an ensemble object, so that I can run cl_agreement - so much transformation, so little time...
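One possible route, sketched here under assumptions, is to coerce each label vector to a partition first and only then build the ensemble. Note that `c(p1,p2,p3,p4)` concatenates the vectors into one long vector rather than a list of four. The sketch assumes an installed clue version that provides `as.cl_partition()` for vectors of class ids and the `"cRand"` method of `cl_agreement()`. The vectors `q1` to `q4` are hypothetical stand-ins, not the `p1..p4` of this post.

```r
# Hypothetical label vectors (NOT the p1..p4 from the post, which are not shown).
q1 <- c(1, 1, 1, 2, 2, 2, 3, 3, 3)
q2 <- c(1, 1, 2, 2, 2, 3, 3, 3, 3)
q3 <- c(3, 3, 3, 1, 1, 1, 2, 2, 2)
q4 <- c(1, 2, 3, 1, 2, 3, 1, 2, 3)

# Only run the clue part if the package is installed.
if (requireNamespace("clue", quietly = TRUE)) {
  library(clue)
  # Assumption: as.cl_partition() accepts raw class ids given as atomic vectors.
  parts <- lapply(list(q1, q2, q3, q4), as.cl_partition)
  # Pass a real list of partitions, not one concatenated vector.
  ens <- cl_ensemble(list = parts)
  # Pairwise corrected (adjusted) Rand agreement between ensemble members.
  agr <- cl_agreement(ens, method = "cRand")
  print(agr)
}
```

If the installed version behaves differently, the clue help pages for `cl_ensemble` and `cl_agreement` are the place to check the exact coercion function and method names.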

I have also tried to work around this problem by creating partitions via k-means, but I do not get the same partitions I need to validate. I am sure the following algorithm needs improvement, especially the way it puts matrices into a list through a for loop (ouch). I am very grateful for your comments on improving this terrible piece of R-work (is it easier to do something with apply?).

Thank you very much for your help and support,
Mark

mat <- matrix( c(6,7,8,2,3,4,12,14,14, 14,15,13,3,1,2,3,4,2, 15,3,10,5,11,7,13,6,1, 15,4,10,6,12,8,12,7,1), ncol=9, byrow=TRUE )
rownames(mat) <- paste("v", 1:4, sep="" )

clus.mat <- vector(mode="list", length=4)
for (i in 1:4){
        ## run kmeans on each row (clustering per single variable)
        clus.mat[[i]] <- kmeans(mat[i,], centers=3, nstart=1, algorithm="MacQueen")
}
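As a rough sketch of the apply-style alternative asked about above, the same per-row clustering can be written with `lapply`, which builds the list directly. The `set.seed()` call and the `labels` matrix are additions for illustration only. Because k-means starts from random centers, the resulting partitions may still differ from the ones to be validated.

```r
mat <- matrix(c(6, 7, 8, 2, 3, 4, 12, 14, 14,
                14, 15, 13, 3, 1, 2, 3, 4, 2,
                15, 3, 10, 5, 11, 7, 13, 6, 1,
                15, 4, 10, 6, 12, 8, 12, 7, 1),
              ncol = 9, byrow = TRUE)
rownames(mat) <- paste("v", 1:4, sep = "")

set.seed(1)  # assumption: fixed seed so the random starts are reproducible

# One kmeans fit per row (i.e. per single variable), collected in a list.
clus.mat <- lapply(seq_len(nrow(mat)), function(i) {
  kmeans(mat[i, ], centers = 3, nstart = 1, algorithm = "MacQueen")
})
names(clus.mat) <- rownames(mat)

# Cluster labels as a 9 x 4 matrix: one column per variable.
labels <- sapply(clus.mat, function(fit) fit$cluster)
```

Each column of `labels` is a plain vector of class ids, which is the form the partitions in the question take.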


Received on Sat Jan 07 23:46:09 2006
