Re: [R] Weighted variance function?

From: Gavin Simpson <gavin.simpson_at_ucl.ac.uk>
Date: Thu, 24 Jul 2008 15:27:15 +0100

On Thu, 2008-07-24 at 02:25 +0530, Arun Kumar Saha wrote:
> There is a R function to calculate weighted mean : weighted.mean() under
> stats package. Is there any direct R function for calculating weighted
> variance as well?

Here are two ways; weighted.var() is via the usual formula and weighted.var2() uses a running sums approach. The formulae for which are both on the weighted mean entry page on wikipedia for example.

The removal of NA is as per weighted.mean, but I have not included any of the sanity checks that that functions contains.

weighted.var <- function(x, w, na.rm = FALSE) {

    if (na.rm) {

        w <- w[i <- !is.na(x)]
        x <- x[i]

    }
    sum.w <- sum(w)
    sum.w2 <- sum(w^2)
    mean.w <- sum(x * w) / sum(w)
    (sum.w / (sum.w^2 - sum.w2)) * sum(w * (x - mean.w)^2, na.rm = na.rm)
}

weighted.var2 <- function(x, w, na.rm = FALSE) {

    if (na.rm) {

        w <- w[i <- !is.na(x)]
        x <- x[i]

    }
    sum.w <- sum(w)
    (sum(w*x^2) * sum.w - sum(w*x)^2) / (sum.w^2 - sum(w^2)) }

## from example section in ?weighted.mean ## GPA from Siegel 1994
wt <- c(5, 5, 4, 1)/15
x <- c(3.7,3.3,3.5,2.8)
weighted.mean(x,wt)

weighted.var(x, wt)

weighted.var2(x, wt)

And some timings:

> system.time(replicate(100000, weighted.var(x, wt)))

   user system elapsed
  2.679 0.014 2.820
> system.time(replicate(100000, weighted.var2(x, wt)))

   user system elapsed
  2.224 0.010 2.315

HTH G

-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Dr. Gavin Simpson             [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
 Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%

______________________________________________
R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Received on Thu 24 Jul 2008 - 14:58:38 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Sun 27 Jul 2008 - 08:32:37 GMT.

Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-help. Please read the posting guide before posting to the list.

list of date sections of archive