Re: [R] aggregate vs tapply; is there a middle ground?

From: Hans Gardfjell <hans.gardfjell_at_emg.umu.se>
Date: Mon 13 Feb 2006 - 18:52:08 EST

Thanks Peter!

I had a "feeling" that there must be a simpler, better, more elegant solution.

/Hans

Peter Dalgaard wrote:
> hadley wickham <h.wickham@gmail.com> writes:
>
>
>>> I faced a similar problem. Here's what I did
>>>
>>> tmp <-
>>> data.frame(A=sample(LETTERS[1:5],10,replace=T),B=sample(letters[1:5],10,replace=T),C=rnorm(10))
>>> tmp1 <- with(tmp,aggregate(C,list(A=A,B=B),sum))
>>> tmp2 <- expand.grid(A=sort(unique(tmp$A)),B=sort(unique(tmp$B)))
>>> merge(tmp2,tmp1,all.x=T)
>>>
>>> At least fewer than 10 extra lines of code. Anyone with a simpler solution?
>>>
>> Well, you can almost do this with the reshape package:
>>
>> tmp <-
>> data.frame(A=sample(LETTERS[1:5],10,replace=T),B=sample(letters[1:5],10,replace=T),C=rnorm(10))
>> a <- recast(tmp, A + B ~ ., sum)
>> # see also recast(tmp, A ~ B, sum)
>> add.all.combinations(a, row="A", cols = "B")
>>
>> Where add.all.combinations basically does what you outlined above --
>> it would be easy enough to generalise to multiple dimensions.
>>
>
> Anything wrong with
>
>
> > as.data.frame(with(tmp,as.table(tapply(C,list(A=A,B=B),sum))))
>
>    A B       Freq
> 1  A a         NA
> 2  B a -0.2524320
> 3  C a  3.8539264
> 4  D a         NA
> 5  A c  0.7227294
> 6  B c -0.2694669
> 7  C c  0.4760957
> 8  D c         NA
> 9  A e         NA
> 10 B e  0.1800500
> 11 C e         NA
> 12 D e -1.0350928
>
> (except the silly colname, responseName="sum" should fix that).
>
>
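
For reference, Peter's suggestion with the responseName fix applied might look roughly like the sketch below. The seed and the object names tab and res are my own assumptions for illustration, not part of the original posts, and the values will differ from the output quoted above.

```r
# Minimal sketch; set.seed() and the names tab/res are illustrative assumptions.
set.seed(1)
tmp <- data.frame(A = sample(LETTERS[1:5], 10, replace = TRUE),
                  B = sample(letters[1:5], 10, replace = TRUE),
                  C = rnorm(10))

# tapply() returns a matrix with one cell per observed A x B level pair;
# pairs that never occur together are filled with NA.
tab <- with(tmp, tapply(C, list(A = A, B = B), sum))

# as.table() makes as.data.frame() use its table method, whose
# responseName argument replaces the default "Freq" column name.
res <- as.data.frame(as.table(tab), responseName = "sum")
res
```

This keeps every combination of the observed levels of A and B as a row, like the expand.grid() plus merge() approach, but without the intermediate data frames.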

-- 

*********************************
Hans Gardfjell
Ecology and Environmental Science
Umeå University
90187 Umeå, Sweden
email: hans.gardfjell@emg.umu.se
phone:  +46 907865267
mobile: +46 705984464

Received on Mon Feb 13 18:59:19 2006