[R] aggregate slow with many rows - alternative?

From: Hans-Peter <gchappi_at_gmail.com>
Date: Fri 14 Oct 2005 - 05:14:24 EST


Hi,

I use the code below to aggregate and count my test data. It works fine there, but on my real data (33'000 rows) the function is really slow: nothing had happened after half an hour.

Does anybody know of other functions that I could use?

Thanks,
Hans-Peter



dat <- data.frame( Datum = c( 32586, 32587, 32587, 32625, 32656, 32656, 32656, 32672, 32672, 32699 ),
                   FischerID = c( 58395, 58395, 58395, 88434, 89953, 89953, 89953, 64395, 62896, 62870 ),
                   Anzahl = c( 2, 2, 1, 1, 2, 1, 7, 1, 1, 2 ) )

f <- function(x) data.frame( Datum = x[1,1], FischerID = x[1,2],
                             Anzahl = sum( x[,3] ), Cnt = dim( x )[1] )

t.a <- do.call("rbind", by(dat, dat[,1:2], f))  # slow for 33'000 rows
t.a <- t.a[order( t.a[,1], t.a[,2] ),]

  # show data
dat
t.a
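
For comparison, here is a minimal sketch of one possible alternative in base R. It is not part of the code above, and it has not been timed on the 33'000-row data. It assumes a version of R in which aggregate() has a formula interface. The idea is to add a helper column of ones (called Cnt here to match the output above), so that a single sum per group gives both the total of Anzahl and the row count, without building one data frame per group and rbind-ing them. The result name agg is made up for this example.

```r
# Same test data as in the post
dat <- data.frame(
  Datum     = c(32586, 32587, 32587, 32625, 32656, 32656, 32656, 32672, 32672, 32699),
  FischerID = c(58395, 58395, 58395, 88434, 89953, 89953, 89953, 64395, 62896, 62870),
  Anzahl    = c(2, 2, 1, 1, 2, 1, 7, 1, 1, 2)
)

# Helper column of ones: summing it per group yields the row count
dat1 <- transform(dat, Cnt = 1)

# One pass over the data: sum Anzahl and Cnt for every (Datum, FischerID)
# combination that actually occurs
agg <- aggregate(cbind(Anzahl, Cnt) ~ Datum + FischerID, data = dat1, FUN = sum)

# Same ordering as t.a: by Datum, then FischerID
agg <- agg[order(agg$Datum, agg$FischerID), ]
agg
```

Whether this is fast enough on the real data would have to be checked there; rowsum() on a pasted key would be another base-R candidate to try.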



R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Received on Fri Oct 14 05:31:13 2005
