[R] Faster alternative to by?

From: michael watson (IAH-C) <michael.watson_at_bbsrc.ac.uk>
Date: Wed 26 Jul 2006 - 22:41:31 EST


I have a data.frame with two columns and 12304 rows; both columns are factors. I want the equivalent of an SQL "group by" statement: a count of the rows in the data frame for each unique value of the second column.

I have:

countl <- by(mapped, mapped$col2, nrow)

Now, mapped$col2 has 10588 levels, so this statement takes a really long time to run. Is there a more efficient way of doing this in R?
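
One possible direction, offered as a sketch rather than a measured answer for this data: since the result wanted is only a row count per level, base R's table() (or tabulate() on the factor's integer codes) counts the levels directly, without splitting the data frame into one piece per level as by() does. The data below is a made-up stand-in of roughly the same size; only the names mapped and col2 come from the post, and col1, counts, counts2 and count_df are illustrative.

```r
# Hypothetical stand-in for the poster's data: two factor columns, 12304 rows.
set.seed(1)
mapped <- data.frame(
  col1 = factor(sample(letters, 12304, replace = TRUE)),
  col2 = factor(sample(sprintf("id%05d", 1:10588), 12304, replace = TRUE))
)

# table() counts the rows for each level of col2 in one vectorised call.
counts <- table(mapped$col2)

# The same counts via tabulate() on the integer codes of the factor.
counts2 <- tabulate(as.integer(mapped$col2), nbins = nlevels(mapped$col2))
names(counts2) <- levels(mapped$col2)

# As a data frame, closer in shape to an SQL "group by" result.
count_df <- as.data.frame(counts, stringsAsFactors = FALSE)
names(count_df) <- c("col2", "n")
```

The counts can then be looked up by level name, e.g. counts[["some_level"]], or count_df can be sorted and merged like any other data frame. Whether this is fast enough on the real data would need to be checked with system.time().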



R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Received on Wed Jul 26 22:52:54 2006

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.1.8, at Thu 27 Jul 2006 - 00:16:52 EST.