Re: [R] aggregate vs tapply; is there a middle ground?

From: hadley wickham <h.wickham_at_gmail.com>
Date: Sun 12 Feb 2006 - 09:44:53 EST

> I faced a similar problem. Here's what I did
>
> tmp <-
> data.frame(A=sample(LETTERS[1:5],10,replace=T),B=sample(letters[1:5],10,replace=T),C=rnorm(10))
> tmp1 <- with(tmp,aggregate(C,list(A=A,B=B),sum))
> tmp2 <- expand.grid(A=sort(unique(tmp$A)),B=sort(unique(tmp$B)))
> merge(tmp2,tmp1,all.x=T)
>
> At least fewer than 10 extra lines of code. Anyone with a simpler solution?

Well, you can almost do this with the reshape package:

tmp <- data.frame(A=sample(LETTERS[1:5],10,replace=T),B=sample(letters[1:5],10,replace=T),C=rnorm(10))
a <- recast(tmp, A + B ~ ., sum)
# see also recast(tmp, A ~ B, sum)
add.all.combinations(a, row="A", cols = "B")

Where add.all.combinations basically does what you outlined above -- it would be easy enough to generalise to multiple dimensions.
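
For anyone without reshape installed, here is a rough base-R sketch of the same idea, extended to any number of grouping columns. The name fill_combinations is made up for illustration and is not the reshape implementation of add.all.combinations; it just wraps the expand.grid + merge approach from the quoted message.

```r
# Hypothetical helper (an assumption, not part of reshape): pad an aggregated
# data frame so that every combination of the grouping columns appears once.
fill_combinations <- function(df, vars) {
  # Sorted unique values of each grouping column, as a named list
  values <- lapply(df[vars], function(x) sort(unique(x)))
  # Full grid of all combinations of those values
  grid <- expand.grid(values, stringsAsFactors = FALSE)
  # Left join: combinations absent from df get NA in the remaining columns
  merge(grid, df, by = vars, all.x = TRUE)
}

set.seed(1)  # only so the example is reproducible
tmp <- data.frame(
  A = sample(LETTERS[1:5], 10, replace = TRUE),
  B = sample(letters[1:5], 10, replace = TRUE),
  C = rnorm(10),
  stringsAsFactors = FALSE
)

# Aggregate as in the quoted message, then fill in the missing cells
agg  <- aggregate(C ~ A + B, data = tmp, FUN = sum)
full <- fill_combinations(agg, c("A", "B"))
```

Passing more column names in vars (for example c("A", "B", "D")) should give the multi-dimensional version, since expand.grid accepts a list of any length. The empty cells come back as NA rather than 0, matching what merge(..., all.x=T) does above.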

Hadley



R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Received on Sun Feb 12 10:01:04 2006
