From: Gabor Grothendieck <ggrothendieck_at_gmail.com>

Date: Fri 15 Sep 2006 - 00:55:29 GMT

Here are three different ways to do it:

# base R

fb <- function(x)
  c(V1 = x$V1[1], V4 = x$V4[1], V2.mean = mean(x$V2),
    V3.mean = mean(x$V3), n = length(x$V1))
do.call(rbind, by(DF, DF[c(1,4)], fb))

# package doBy

library(doBy)

summaryBy(V2 + V3 ~ V1 + V4, DF, FUN = c(mean, length))[,-5]

# package reshape

library(reshape)

f <- function(x) c(mean = mean(x), n = length(x))
cast(melt(DF, id = c(1,4)), V1 + V4 ~ variable, fun.aggregate = f)[,-6]
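
The data frame DF is not shown in this message. As a rough sketch only, the following builds a hypothetical DF whose values are made up so that the group means and counts agree with the output further down, and then computes the same summary with aggregate() and merge() from base R. This is an extra variant, not one of the three approaches above, and the names means, counts and res are just illustrative.

```r
# Hypothetical data: the values are invented, chosen only so that the
# per-group means and counts agree with the output in this message.
DF <- data.frame(
  V1 = c("A", "A", "A", "B", "B", "C"),
  V2 = c(1, 2, 3, 0.5, 0.9, 5),
  V3 = c(300, 400, 500, 30, 40, 70),
  V4 = c("ID1", "ID1", "ID1", "ID2", "ID2", "ID1")
)

# Mean of V2 and V3 within each V1/V4 combination.
means <- aggregate(cbind(V2, V3) ~ V1 + V4, data = DF, FUN = mean)
names(means)[3:4] <- c("V2.mean", "V3.mean")

# Number of rows within each V1/V4 combination.
counts <- aggregate(V2 ~ V1 + V4, data = DF, FUN = length)
names(counts)[3] <- "n"

# Join the two summaries on the grouping columns.
res <- merge(means, counts, by = c("V1", "V4"))
print(res)
```

Unlike the by() version, this keeps V1 and V4 as labels instead of turning them into integer codes, at the cost of two passes over the data.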

> # base R
> fb <- function(x)
+   c(V1 = x$V1[1], V4 = x$V4[1], V2.mean = mean(x$V2),
+     V3.mean = mean(x$V3), n = length(x$V1))
> do.call(rbind, by(DF, DF[c(1,4)], fb))
     V1 V4 V2.mean V3.mean n
[1,]  1  1     2.0     400 3
[2,]  3  1     5.0      70 1
[3,]  2  2     0.7      35 2

  V1  V4 mean.V2 mean.V3 length.V3
1  A ID1     2.0     400         3
2  C ID1     5.0      70         1
3  B ID2     0.7      35         2

  V1  V4 V2_mean V2_n V3_mean
1  A ID1     2.0    3     400
2  B ID2     0.7    2      35
3  C ID1     5.0    1      70

> Thanks Gabor, that is much faster than using a loop!

