# Re: [R] computing the variance

From: Kristel Joossens <kristel.joossens_at_econ.kuleuven.be>
Date: Mon 05 Dec 2005 - 20:32:20 EST

You can simply define your own variance as sum((x-mean(x))^2)/length(x), or, more directly, rescale the built-in one: var(x)*(1-1/length(x)).
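A minimal sketch of both options in R. The name pop_var is only an illustrative helper, not a function that ships with R:

```r
# Hypothetical helper (not part of base R): variance with divisor n
# instead of the n - 1 used by var().
pop_var <- function(x) {
  n <- length(x)
  sum((x - mean(x))^2) / n
}

x <- c(1, 2, 3)
pop_var(x)                     # divisor n: 2/3
var(x) * (1 - 1 / length(x))   # same value, obtained by rescaling var()

y <- c(1, 2, 3, 4, 5)
pop_var(y)                     # 10/5 = 2
```

Both routes agree because (n-1)/n = 1 - 1/n.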

As you already mentioned, var(x) is defined as sum((x-mean(x))^2)/(length(x)-1), which is an *unbiased* estimator of the population variance (written COV below). In contrast, sum((x-mean(x))^2)/length(x) is a *biased* estimator, with bias = -(1/n) COV, where n = length(x).

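Denote the population mean by M.
Proof: E[sum (Xj-mean(X))^2] = E[sum Xj^2 - n mean(X)^2]

  = sum E[Xj^2] - n E[mean(X)^2]

  = sum (COV + M^2) - n (1/n COV + M^2)

  = (n-1) COV

So dividing the sum of squares by n-1 gives expectation COV, while dividing by n gives (n-1)/n COV.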
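The result can also be checked numerically. The following is only a rough simulation sketch: the normal distribution, the values n = 5, M = 3, COV = 4, the seed, and the number of replicates are all arbitrary choices made for illustration.

```r
set.seed(1)          # arbitrary seed, only for reproducibility

n    <- 5            # sample size (assumed for illustration)
M    <- 3            # population mean (assumed)
COV  <- 4            # population variance (assumed)
reps <- 100000       # number of simulated samples

# Sum of squared deviations from the sample mean, one value per sample
ss <- replicate(reps, {
  x <- rnorm(n, mean = M, sd = sqrt(COV))
  sum((x - mean(x))^2)
})

mean(ss)             # should be close to (n-1) * COV = 16
mean(ss) / (n - 1)   # divisor n-1: close to COV = 4
mean(ss) / n         # divisor n:   close to (n-1)/n * COV = 3.2
```

The averages will not match exactly because of simulation noise, but the gap between the last two lines should be near COV/n, which is the size of the bias.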

Best regards,
Kristel

Wang Tian Hua wrote:
> hi,
> when i was computing the variance of a simple vector, i found an unexpected
> result. not sure whether it is a bug.
> > var(c(1,2,3))
> [1] 1 #which should be 2/3.
> > var(c(1,2,3,4,5))
> [1] 2.5 #which should be 10/5=2
>
> it seems to me that the program uses (sample size -1) instead of sample
> size at the denominator. how can i rectify this?
>
> regards,
> tianhua
>
> ______________________________________________
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help

--
__________________________________________
Kristel Joossens        Ph.D. Student
Research Center ORSTAT  K.U. Leuven
Naamsestraat 69         Tel: +32 16 326929
3000 Leuven, Belgium    Fax: +32 16 326732
E-mail:  Kristel.Joossens@econ.kuleuven.be
http://www.econ.kuleuven.be/public/ndbae49

Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm
