[R] Sweep statistics

From: Doran, Harold <HDoran_at_air.org>
Date: Tue 10 May 2005 - 01:37:37 EST


Dear List:

I am wondering if there is a more efficient way to compute the following. For the example I am using the star data frame in the mlmRev package. This has 80 schools and includes grades K, 1, 2, and 3. First I compute the grade-level mean in each school using tapply:

library(mlmRev)  # provides the star data frame
tapply(star$math, list(star$sch, star$gr), mean, na.rm = TRUE)

This results in a table of means by school for each grade. Now, I want to add additional columns that contain, for each grade, the grand mean of all schools excluding the school in that row. So, row 1 (school 1) would contain, for each grade, the grand mean across the 79 other schools, that is, all schools except school 1. The second row would contain the means for all schools except school 2, and so on.

Now, I have this working via a loop that splits the data into chunks and returns some vectors, which I then add to the data frame. However, this takes a while, and I suspect it is not efficient.

The sweep function seems as though it might be one avenue, as it is designed to sweep out a statistic. Or is there another method that might be more effective than subsetting multiple data frames inside a loop?

I am having trouble conceiving how this might be accomplished without having to construct a loop.
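
For what it is worth, here is a minimal sketch of one loop-free possibility, not a tested solution for the star data. It assumes that "grand mean excluding a school" means the pooled mean over all students in the other schools, and it uses a small made-up data frame (toy, with hypothetical values) in place of star. The idea is that a leave-one-out mean equals (grade total minus the school's sum) divided by (grade count minus the school's count), which sweep can compute column by column.

```r
# Toy stand-in for the star data frame (hypothetical values, not from mlmRev)
toy <- data.frame(
  sch  = factor(c(1, 1, 2, 2, 3, 3, 1, 2, 3)),
  gr   = factor(c("K", "K", "K", "K", "K", "K", "1", "1", "1")),
  math = c(10, 20, 30, NA, 50, 70, 5, 15, 25)
)

# Per-school, per-grade sums and counts of non-missing scores
sums   <- tapply(toy$math, list(toy$sch, toy$gr), sum, na.rm = TRUE)
counts <- tapply(!is.na(toy$math), list(toy$sch, toy$gr), sum)

# Empty school-by-grade cells come back as NA; treat them as contributing nothing
sums[is.na(sums)]     <- 0
counts[is.na(counts)] <- 0

# Sweep the grade totals (column sums) over each column:
# total minus the school's own contribution leaves "all other schools"
loo_sum <- sweep(sums,   2, colSums(sums),   FUN = function(x, tot) tot - x)
loo_n   <- sweep(counts, 2, colSums(counts), FUN = function(x, tot) tot - x)

# Leave-one-out mean: rows are schools, columns are grades
loo_mean <- loo_sum / loo_n
loo_mean
```

The result has the same shape as the tapply table of means, so it could be attached with cbind. If the unweighted mean of the other schools' means is wanted instead, the same pattern should apply to the table of means, with the counts replaced by the number of schools that have a non-missing mean in each grade.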

Thank you for any thoughts,
Harold




R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Received on Tue May 10 01:48:15 2005
