Date: Thu 09 Feb 2006 - 08:34:07 EST

*> From: Luís Torgo <ltorgo@liacc.up.pt>
**> Date: Wed, 8 Feb 2006 18:08:40 +0000
> Dear list,

*> I recently came across a problem that I think I've solved and that I wanted
**> to share with you for two reasons:
**> - Maybe others come across the same problem.
**> - Maybe someone has a much simpler solution that they want to share with me ;-)
**> The problem is as follows: expand.grid() allows you to generate a data.frame
**> with all combinations of a set of values, e.g.:
**> > expand.grid(par1=-1:1,par2=c('a','b'))
**>   par1 par2
**> 1   -1    a
**> 2    0    a
**> 3    1    a
**> 4   -1    b
**> 5    0    b
**> 6    1    b
**> There is nothing wrong with this nice function except when you have too many
**> combinations to fit in your computer memory, and that was my problem: I
**> wanted to do something for each combination of a set of variants, but this
**> set was too large for storing in memory in a data.frame generated by
**> expand.grid. A possible solution would be to have a set of nested for()
**> cycles but I preferred a solution that involved a single for() cycle going
**> from 1 to the number of combinations and then at each iteration having some
**> form of generating the combination "i". And this was the "real problem": how
**> to generate a function that picks the same style of arguments as
**> expand.grid() and provides me with the values corresponding to line "i" of
**> the data frame that would have been created by expand.grid(). For instance,
**> if I wanted the line 4 of the above call to expand.grid() I should get the
**> same as doing:
**> > expand.grid(par1=-1:1,par2=c('a','b'))[4,]
**>   par1 par2
**> 4   -1    b
**> but obviously without having to use expand.grid() as that involves generating
**> a data frame that in my case wouldn't fit in the memory of my computer.
**> Now, the function I've created was the following:
**> --------------------------------------------
**> getVariant <- function(id,vars) {
**>   if (!is.list(vars)) stop('vars needs to be a list!')
**>   nv <- length(vars)
**>   lims <- sapply(vars,length)
**>   if (id > prod(lims)) stop('id above the number of combinations!')
**>   res <- vector("list",nv)
**>   for(i in nv:2) {
**>     f <- prod(lims[1:(i-1)])
**>     res[[i]] <- vars[[i]][ceiling(id / f)]
**>     id <- id - (ceiling(id/f)-1)*f
**>   }
**>   res[[1]] <- vars[[1]][id]
**>   names(res) <- names(vars)
**>   res
**> }
**> --------------------------------------
**> > expand.grid(par1=-1:1,par2=c('a','b'))[4,]
**>   par1 par2
**> 4   -1    b
**> > getVariant(4,list(par1=-1:1,par2=c('a','b')))
**> $par1
**> [1] -1
**> $par2
**> [1] "b"
**> I would be glad to know if somebody came across the same problem and has a
**> better suggestion on how to solve this.
A few minor improvements:

1) let id be a vector of indices
2) use %% and %/% instead of ceiling (perhaps debatable)
3) return a data frame, as expand.grid does
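
As a quick sanity check on point 2, the sketch below compares the ceiling() forms with the %/% and %% forms over a range of indices. The block size f = 6 and the range of id are arbitrary values chosen for illustration only.

```r
# Compare the ceiling() arithmetic with the %/% and %% arithmetic
# for a block size f and every id from 1 to 4 * f.
f  <- 6                 # illustrative block size (an assumption)
id <- 1:(4 * f)

# Which block (level of the slower-varying variable) does id fall in?
idx_ceiling <- ceiling(id / f)
idx_intdiv  <- (id - 1) %/% f + 1

# Position of id within its block, counted from 1.
rem_ceiling <- id - (ceiling(id / f) - 1) * f
rem_mod     <- (id - 1) %% f + 1

# Both pairs should agree for every id.
stopifnot(all(idx_ceiling == idx_intdiv), all(rem_ceiling == rem_mod))
```

The integer-division form stays in whole-number arithmetic, which is the main argument for it; for small indices the two give the same answers.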

So your function now looks like:

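getVariant <- function(id, vars) {
  if (!is.list(vars)) stop('vars needs to be a list!')
  nv <- length(vars)
  lims <- sapply(vars, length)
  if (any(id > prod(lims))) stop('id above the number of combinations!')
  res <- vector("list", nv)
  # Peel off the slowest-varying variable first. The nv > 1 guard avoids
  # the descending sequence 1:2 when vars holds a single variable.
  if (nv > 1) for(i in nv:2) {
    f <- prod(lims[1:(i-1)])                   # rows per level of variable i
    res[[i]] <- vars[[i]][(id - 1)%/%f + 1]    # level of variable i
    id <- (id - 1)%%f + 1                      # position within that block
  }
  res[[1]] <- vars[[1]][id]
  names(res) <- names(vars)
  return(as.data.frame(res))
}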

Now, for example, you get:

> expand.grid(par1=-1:1,par2=c('a','b'),par3=c('w','x','y','z'))[12:15,]
   par1 par2 par3
12    1    b    x
13   -1    a    y
14    0    a    y
15    1    a    y

> getVariant(12:15,list(par1=-1:1,par2=c('a','b'), par3=c('w','x','y','z')))
  par1 par2 par3
1    1    b    x
2   -1    a    y
3    0    a    y
4    1    a    y
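
Since id can be a vector, one way to use this for the original problem is to walk through the combinations a block at a time rather than one row at a time. The following is only a sketch: the chunk size, the counter seen and the placeholder loop body are assumptions for illustration, and the function is repeated in compact form so that the example runs on its own.

```r
# Compact copy of getVariant(), repeated so this sketch is self-contained.
getVariant <- function(id, vars) {
  if (!is.list(vars)) stop('vars needs to be a list!')
  nv <- length(vars)
  lims <- sapply(vars, length)
  if (any(id > prod(lims))) stop('id above the number of combinations!')
  res <- vector("list", nv)
  if (nv > 1) for (i in nv:2) {
    f <- prod(lims[1:(i - 1)])
    res[[i]] <- vars[[i]][(id - 1) %/% f + 1]
    id <- (id - 1) %% f + 1
  }
  res[[1]] <- vars[[1]][id]
  names(res) <- names(vars)
  as.data.frame(res)
}

vars  <- list(par1 = -1:1, par2 = c('a', 'b'), par3 = c('w', 'x', 'y', 'z'))
total <- prod(sapply(vars, length))   # number of combinations
chunk <- 5                            # hypothetical chunk size

seen <- 0                             # rows processed so far
for (start in seq(1, total, by = chunk)) {
  ids   <- start:min(start + chunk - 1, total)
  block <- getVariant(ids, vars)      # only this block is held in memory
  # Placeholder: the real per-combination work would go here.
  seen <- seen + nrow(block)
}
```

Only one block of rows exists at any time, so memory use is set by the chunk size rather than by the total number of combinations.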

Note that you will run into trouble when the product of the lengths is greater than the largest representable integer on your system.
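
A possible guard against that case is sketched below. The helper name nCombinations is hypothetical, and the choice of limit is an assumption: prod() returns a double, and doubles represent whole numbers exactly only up to 2^.Machine$double.digits, so the sketch refuses anything larger. If your loop index has to be an R integer, .Machine$integer.max would be the stricter limit to test against.

```r
# Hypothetical helper: count the combinations and refuse sizes that
# cannot be indexed exactly with double-precision arithmetic.
nCombinations <- function(vars) {
  if (!is.list(vars)) stop('vars needs to be a list!')
  n <- prod(sapply(vars, length))       # prod() returns a double
  limit <- 2^.Machine$double.digits     # assumed limit for exact whole numbers
  if (n > limit) stop('too many combinations to index exactly!')
  n
}

# Small case: 3 * 2 * 4 combinations.
n_small <- nCombinations(list(par1 = -1:1, par2 = c('a', 'b'),
                              par3 = c('w', 'x', 'y', 'z')))
```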

Hope this helps,

Ray Brownrigg

