[Rd] (no subject)

From: stefano iacus <jago_at_mclink.it>
Date: Wed 01 Feb 2006 - 17:25:10 GMT

Suppose X is a data.frame with n observations and k variables, all of which are factors.

tab <- table(X)

contains a k-dim array.

I would like to get a list from tab such that each element contains the indexes of the observations that fall in the same cell of this k-dim array, for non-empty cells only.


> set.seed(123)
> X <- as.data.frame(matrix(rnorm(500),100,5))
> X$V1 <- cut(X$V1, breaks=5)
> X$V2 <- cut(X$V2, breaks=5)
> X$V3 <- cut(X$V3, breaks=5)
> X$V4 <- cut(X$V4, breaks=5)
> X$V5 <- cut(X$V5, breaks=5)
> tab <- table(X)
> which(tab>0) -> cells
> length(cells)

[1] 94

Thus 94 of the 5^5 = 3125 cells are non-empty. I would like a smart way (without reimplementing table/tabulate) to get the list of length 94 that contains the indexes of the observations in each cell.
Or, vice versa, a vector of length n that tells, observation by observation, which cell (out of the 3125) the observation is in.

stefano
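
A minimal sketch of one way both results might be obtained, using interaction() and split() from base R. It assumes that interaction()'s default level ordering (first factor varies fastest) lines up with the linear index into the array returned by table(X); the names cell and obs_by_cell are illustrative only.

```r
# Example data as in the message: 100 observations, 5 factors of 5 levels each
set.seed(123)
X <- as.data.frame(matrix(rnorm(500), 100, 5))
X[] <- lapply(X, cut, breaks = 5)
tab <- table(X)

# interaction() builds a single factor whose levels enumerate every cell,
# with the first variable varying fastest (lex.order = FALSE is the default).
# Its integer codes are assumed here to match linear indices into tab.
cell <- as.integer(interaction(X))   # length n, values in 1..prod(dim(tab))

# Group observation indices by cell; only cells that occur get an element
obs_by_cell <- split(seq_len(nrow(X)), cell)

# Sanity check against table(): the occupied cells should coincide
all(sort(unique(cell)) == which(tab > 0))
```

Here cell is the vector of length n giving the cell of each observation, and obs_by_cell is the list with one element per non-empty cell, named by the cell's linear index.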

