[R] Efficient computation of trimmed stats?

From: Benilton Carvalho <bcarvalh_at_jhsph.edu>
Date: Mon, 14 May 2007 12:58:42 -0400

Hi everyone,

I was wondering if there is anything already implemented for efficient ("row-wise") computation of group-specific trimmed stats (mean and sd on the trimmed vector) on large matrices.

For example:

nc = 300
nr = 250000
x = matrix(rnorm(nc*nr), ncol=nc)
g = matrix(sample(1:3, nr*nc, replace=TRUE), ncol=nc)

trimmedMeanByGroup <- function(y, grp, trim=.05)
   tapply(y, factor(grp, levels=1:3), mean, trim=trim)
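The function above covers only the trimmed mean. For the sd on the trimmed vector, here is a minimal sketch, assuming the sd should be taken over exactly the values that mean(y, trim=) keeps, that trim < 0.5, and that there are no NAs. The names trimmedSd and trimmedSdByGroup are hypothetical, not from any package:

```r
# Hypothetical helper: sd of the values left after dropping floor(n * trim)
# observations from each end, i.e. the same bounds mean.default() uses.
trimmedSd <- function(y, trim = 0.05) {
  n <- length(y)
  if (n == 0L) return(NA_real_)
  lo <- floor(n * trim) + 1
  hi <- n + 1 - lo
  sd(sort(y)[lo:hi])
}

# Group-wise version, mirroring trimmedMeanByGroup
trimmedSdByGroup <- function(y, grp, trim = 0.05)
  tapply(y, factor(grp, levels = 1:3), trimmedSd, trim = trim)

# Small example: the two extreme values are dropped before sd is taken
y <- c(1:18, 100, -100)
trimmedSd(y)
```

This is no faster than the mean version, since it still goes through tapply once per row.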

sapply(1:10, function(i) trimmedMeanByGroup(x[i,], g[i,]))

works fine... but:

> system.time(sapply(1:nr, function(i) trimmedMeanByGroup(x[i,], g

    user system elapsed
399.928 0.019 399.988

is too slow to be practical for me.

Maybe some package has some implementation of the above?
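To make the request concrete, one possible direction is to avoid the per-row tapply() call altogether: sort the whole matrix once by (row, group) cell, rank the values within each cell, and sum the values that survive trimming with rowsum(). What follows is a rough base-R sketch, not a tested solution at the matrix size above. It assumes no NAs, integer group codes in 1..ngroups, trim < 0.5, and enough memory for several full-length index vectors. The name trimmedStatsByGroup is made up for illustration:

```r
# Hypothetical helper, not from any package: trimmed mean and sd for every
# (row, group) cell of x in one vectorised pass.
trimmedStatsByGroup <- function(x, g, ngroups = 3L, trim = 0.05) {
  stopifnot(identical(dim(x), dim(g)), trim >= 0, trim < 0.5)
  ngroups <- as.integer(ngroups)
  ncell <- nrow(x) * ngroups
  # Cell id: the groups of row 1 come first, then those of row 2, and so on
  cell <- (as.vector(row(x)) - 1L) * ngroups + as.integer(g)
  xv <- as.vector(x)
  o <- order(cell, xv)                 # sort values within each cell
  cell <- cell[o]
  xv <- xv[o]
  n <- tabulate(cell, nbins = ncell)   # observations per cell
  rnk <- seq_along(xv) - (cumsum(n) - n)[cell]  # rank of a value in its cell
  lo <- floor(n * trim) + 1            # same bounds as mean.default()
  hi <- n + 1 - lo
  keep <- rnk >= lo[cell] & rnk <= hi[cell]
  m <- hi - lo + 1                     # values kept per cell (0 if cell empty)
  kv <- xv[keep]
  kc <- cell[keep]
  nonempty <- n > 0
  mu <- rep(NA_real_, ncell)
  mu[nonempty] <- rowsum(kv, kc)[, 1] / m[nonempty]
  # Two-pass sum of squares around the trimmed mean; a cell with a single
  # kept value yields NaN here
  s <- rep(NA_real_, ncell)
  s[nonempty] <- sqrt(rowsum((kv - mu[kc])^2, kc)[, 1] / (m[nonempty] - 1))
  # ngroups x nrow(x) matrices, the same layout as the sapply() result
  list(mean = matrix(mu, nrow = ngroups), sd = matrix(s, nrow = ngroups))
}

# Small usage example (sizes reduced so it runs instantly)
xs <- matrix(rnorm(10 * 300), ncol = 300)
gs <- matrix(sample(1:3, 10 * 300, replace = TRUE), ncol = 300)
str(trimmedStatsByGroup(xs, gs))
```

The single order() call replaces one sort per row and group, which is where the speedup would come from. The price is memory, so on the full matrix it may need to be run over blocks of rows.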

Thank you very much,

Benilton Carvalho
PhD Candidate
Department of Biostatistics
Bloomberg School of Public Health
Johns Hopkins University

Received on Mon 14 May 2007 - 17:06:09 GMT