[R] Efficient computation of trimmed stats?

From: Benilton Carvalho <bcarvalh_at_jhsph.edu>
Date: Mon, 14 May 2007 12:58:42 -0400


Hi everyone,

I was wondering if there is anything already implemented for efficient ("row-wise") computation of group-specific trimmed stats (mean and sd on the trimmed vector) on large matrices.

For example:

set.seed(1)
nc = 300
nr = 250000
x = matrix(rnorm(nc*nr), ncol=nc)
g = matrix(sample(1:3, nr*nc, replace=TRUE), ncol=nc)

trimmedMeanByGroup <- function(y, grp, trim=.05)
   tapply(y, factor(grp, levels=1:3), mean, trim=trim)

sapply(1:10, function(i) trimmedMeanByGroup(x[i,], g[i,]))

works fine... but:

> system.time(sapply(1:nr, function(i) trimmedMeanByGroup(x[i,], g[i,])))

    user system elapsed
399.928 0.019 399.988

is too slow to be practical for me.

Maybe some package has some implementation of the above?
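
To make the target concrete, here is a minimal sketch of the per-row computation written without tapply() and factor(), returning both the trimmed mean and the sd of the trimmed vector. The names trimmedStats and trimmedStatsByGroup are hypothetical (not from any package), the trimming rule is assumed to be the one used by mean(x, trim=...), and the data are assumed to contain no missing values. It has not been timed on the full matrix, so any speed gain would need to be measured.

```r
# Trimmed mean and sd of one vector. Assumes 0 <= trim < 0.5 and no NAs.
# Drops floor(n * trim) values from each end, as mean(x, trim = ...) does.
trimmedStats <- function(v, trim = 0.05) {
  stopifnot(trim >= 0, trim < 0.5)
  n <- length(v)
  if (n == 0L) return(c(mean = NA_real_, sd = NA_real_))
  k <- floor(n * trim)
  kept <- sort.int(v)[(k + 1):(n - k)]
  # sd() returns NA when only one value is kept
  c(mean = mean(kept), sd = sd(kept))
}

# Apply trimmedStats to each group of one row.
# Returns a 2 x length(levels) matrix: row 1 = mean, row 2 = sd.
trimmedStatsByGroup <- function(y, grp, trim = 0.05, levels = 1:3) {
  vapply(levels,
         function(l) trimmedStats(y[grp == l], trim),
         c(mean = 0, sd = 0))
}

# Small example: 5 rows instead of 250000.
set.seed(1)
xs <- matrix(rnorm(5 * 300), ncol = 300)
gs <- matrix(sample(1:3, 5 * 300, replace = TRUE), ncol = 300)
res <- lapply(seq_len(nrow(xs)),
              function(i) trimmedStatsByGroup(xs[i, ], gs[i, ]))
```

Each element of res is a 2 x 3 matrix for one row of xs, with the trimmed means in the first row and the sds of the trimmed vectors in the second.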

Thank you very much,
-b

--
Benilton Carvalho
PhD Candidate
Department of Biostatistics
Bloomberg School of Public Health
Johns Hopkins University
bcarvalh_at_jhsph.edu

______________________________________________
R-help_at_stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Received on Mon 14 May 2007 - 17:06:09 GMT

