From: Steven Wolf <wolfste4_at_msu.edu>
Date: Wed, 23 Mar 2011 12:40:55 -0400

I'm attempting to use the Adjusted Rand Index to compare different categorizations in my card-sorting experiment. Because I am attempting to replicate a prior study, I am allowing the reviewers to put a single card in multiple piles. In the original paper, however, it looks like Rand expects the cards to be placed into "disjoint" sets. I'm wondering if there is a workaround to this problem.

As an example, suppose that you have these two categorizations:

Reviewer 1:

Cat 1 - Item #s 1,3,5

Cat 2 - Item #s 2,4

Reviewer 2:

Cat 1 - Item #s 1,2,3

Cat 2 - Item #s 4,5

# You then convert these into two vectors, with one category label per item:

r1<-c(1,2,1,2,1)

r2<-c(1,1,1,2,2)

# Two packages provide functions that calculate the adjusted Rand index:

library(mclust)

adjustedRandIndex(r1, r2)

library(mcclust)

arandi(r1, r2)

# ...easy as pie
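To make the disjoint case concrete, here is a minimal base-R sketch of the adjusted Rand index computed directly from the contingency table of the two label vectors. The helper name ari_sketch is hypothetical (it is not a function from mclust or mcclust), and it assumes both vectors have one label per item and that at least one of the two partitions has more than one category.

```r
# Hypothetical helper, for illustration only: adjusted Rand index
# computed from the contingency table of two label vectors.
ari_sketch <- function(a, b) {
  stopifnot(length(a) == length(b))
  tab <- table(a, b)                       # counts n_ij of items shared by category pairs
  sum_ij <- sum(choose(tab, 2))            # item pairs grouped together in both vectors
  sum_a  <- sum(choose(rowSums(tab), 2))   # item pairs grouped together in a
  sum_b  <- sum(choose(colSums(tab), 2))   # item pairs grouped together in b
  expected  <- sum_a * sum_b / choose(length(a), 2)  # agreement expected by chance
  max_index <- (sum_a + sum_b) / 2                   # largest attainable agreement
  (sum_ij - expected) / (max_index - expected)
}

r1 <- c(1, 2, 1, 2, 1)
r2 <- c(1, 1, 1, 2, 2)
ari_sketch(r1, r2)  # -0.25
```

For these two vectors only one pair of items (1 and 3) is grouped together by both reviewers, which is fewer than the 1.6 pairs expected by chance, so the index comes out slightly negative.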

In my case, however, I have data that looks like this:

Reviewer 1:

Cat 1 - Item #s 1,3,5

Cat 2 - Item #s 2,4

Cat 3 - Item #s 1,4

Reviewer 2:

Cat 1 - Item #s 1,2,3

Cat 2 - Item #s 4,5

Because Reviewer 1 placed items 1 and 4 in two categories each, it is not trivial to create the vector for Reviewer 1. Simply doing:

r1<-c(c(1,3),2,1,c(2,3),1)

won't work because the input vectors need to be of the same length, and c() flattens the nested vectors into a single vector of length 7 (so that doesn't do what I want it to either).
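The length problem can be seen directly in a short sketch. The names r1_flat and r1_list are hypothetical, and the list is only an assumed way of storing the overlapping assignment; it does not by itself make the index computable.

```r
# Nested c() calls are flattened, so the per-item grouping is lost.
r1_flat <- c(c(1, 3), 2, 1, c(2, 3), 1)
r2 <- c(1, 1, 1, 2, 2)
length(r1_flat)  # 7
length(r2)       # 5

# A list keeps one element per item, each holding that item's categories
# (an assumed representation, for illustration only).
r1_list <- list(c(1, 3), 2, 1, c(2, 3), 1)
length(r1_list)   # 5
lengths(r1_list)  # 2 1 1 2 1
```

The flat vector no longer records which labels belong to which item, while the list does, at the cost of no longer being a plain label vector.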

Is there a way to implement this so that these algorithms will still work?

-Steve

