Re: [R] Problem with Weighted Variance in Hmisc

From: Frank E Harrell Jr <f.harrell_at_vanderbilt.edu>
Date: Fri, 01 Jun 2007 07:11:50 -0500

jiho wrote:
> On 2007-June-01 , at 01:03 , Tom La Bone wrote:
>
>> The function wtd.var(x,w) in Hmisc calculates the weighted variance
>> of x, where w are the weights.  It seems to me that wtd.var(x,w)
>> should equal var(x) if all of the weights are equal, but this does
>> not appear to be the case. Can someone point out to me where I am
>> going wrong here?  Thanks.
>
> The usual formula for the weighted variance is this one:
> http://www.itl.nist.gov/div898/software/dataplot/refman2/ch2/weighvar.pdf
> But for computation purposes, wtd.var uses another definition which
> considers the weights as repeats instead of true weights. However, if
> the weights are normalized (rescaled to sum to the number of
> observations, which is what normwt=TRUE does), the two formulas are
> equal. If you consider weights as real weights instead of repeats, I
> would recommend using this option.
> With normwt=T, your issue is solved:
>
> > a=1:10
> > b=a
> > b[]=2
> > b
> [1] 2 2 2 2 2 2 2 2 2 2
> > wtd.var(a,b)
> [1] 8.68421
> # all weights equal 2 <=> there are two repeats of each element of a
> > var(c(a,a))
> [1] 8.68421
> > wtd.var(a,b,normwt=T)
> [1] 9.166667
> > var(a)
> [1] 9.166667
>
> Cheers,
>
> JiHO

The issue is what is being assumed for N in the denominator of the variance formula, since the unbiased estimator subtracts one. Using normwt=TRUE means you are in effect assuming N is the number of elements in the data vector, ignoring the weights.
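
To make the role of N concrete, here is a minimal base-R sketch. It is not the Hmisc source: weighted_var and its normalize argument are hypothetical names used only for illustration, and the sketch assumes the denominator is the sum of the weights minus one, with normalization rescaling the weights to sum to length(x).

```r
# Hypothetical helper (not Hmisc code): weighted variance whose
# denominator is (sum of weights - 1).
weighted_var <- function(x, w, normalize = FALSE) {
  if (normalize) {
    # Rescale the weights so that they sum to the number of observations
    w <- w * length(x) / sum(w)
  }
  sw <- sum(w)                      # plays the role of N
  xbar <- sum(w * x) / sw           # weighted mean
  sum(w * (x - xbar)^2) / (sw - 1)  # weighted sum of squares over N - 1
}

a <- 1:10
b <- rep(2, 10)

# Weights treated as repeat counts: N = sum(b) = 20
v_repeats <- weighted_var(a, b)

# Weights normalized: N = length(a) = 10
v_normalized <- weighted_var(a, b, normalize = TRUE)

# Compare with the unweighted variances from the example quoted above
all.equal(v_repeats, var(c(a, a)))
all.equal(v_normalized, var(a))
```

Under these assumptions the only difference between the two results is the value used for N: the numerator scales with the weights in both cases, but subtracting one from 20 rather than from 10 changes the ratio.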

Frank Harrell


-- 
Frank E Harrell Jr   Professor and Chair           School of Medicine
                      Department of Biostatistics   Vanderbilt University

______________________________________________
R-help_at_stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Received on Fri 01 Jun 2007 - 12:26:00 GMT
