[R] aggregate vs tapply; is there a middle ground?

From: Joseph LeBouton <lebouton_at_msu.edu>
Date: Sun 12 Feb 2006 - 08:28:21 EST

Dear all,

I want to do a series of comparisons among 4 categorical variables:

a <- aggregate(y, list(var1, var2, var3, var4), sum)

This gives me a very nice 2-dimensional data frame with one column per variable, BUT, as the help for aggregate says, "empty subsets are removed". I don't see in help(aggregate) how I can change this.

In contrast,
a <- tapply(y, list(var1, var2, var3, var4), sum)

gives me results for everything including empty subsets, but in an awkward 4-dimensional array that takes me another 10 lines of inefficient code to turn into a 2D data.frame.

Is there a way to directly do this calculation INCLUDING results for empty subsets, and still obtain a 2D array, matrix, or data.frame? OR alternatively is there a simple way to mush the 4D result from the tapply into a 2D matrix/data.frame?
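For the second alternative, one possible route is to treat the tapply() result as a table and flatten it with as.data.frame(). The sketch below uses made-up toy data: the data frame d, its factor levels, and the column name sum_y are assumptions for illustration only; the names y and var1..var4 follow the calls above. Naming the elements of the INDEX list gives the array named dimnames, which then become the column names.

```r
# Hypothetical toy data: 4 two-level factors, only 3 of the 16 cells are observed
d <- data.frame(
  var1 = factor(c("a", "a", "b"), levels = c("a", "b")),
  var2 = factor(c("x", "y", "x"), levels = c("x", "y")),
  var3 = factor(c("p", "p", "q"), levels = c("p", "q")),
  var4 = factor(c("m", "n", "m"), levels = c("m", "n")),
  y    = c(1, 2, 3)
)

# 4-dimensional array of sums; empty factor combinations are NA
arr <- with(d, tapply(y,
                      list(var1 = var1, var2 = var2, var3 = var3, var4 = var4),
                      sum))

# Flatten to a 2D data frame: one row per factor combination,
# one column per factor plus one column holding the sums
long <- as.data.frame(as.table(arr), responseName = "sum_y")

# Optional: report empty subsets as 0 instead of NA
long0 <- long
long0$sum_y[is.na(long0$sum_y)] <- 0
```

Here long keeps NA for the empty subsets, so they stay distinguishable from true zero sums; long0 is only appropriate when a sum of 0 is the intended meaning of "no observations".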

thanks very much in advance for any help!



Joseph P. LeBouton
Forest Ecology PhD Candidate
Department of Forestry
Michigan State University
East Lansing, Michigan 48824

Office phone: 517-355-7744
email: lebouton@msu.edu

