From: Duncan Murdoch <murdoch_at_stats.uwo.ca>

Date: Mon 05 Dec 2005 - 19:37:50 GMT

> If people need to do this, such an option would be a convenience,
> but I don't see that it has much further merit than that.
>
> My view of how to calculate a "variance" is based, not directly
> on the "unbiased" issue, but on the following.
>
> Suppose you define a RV X as a single value sampled from a finite
> population of values X1,...,XN.
>
> The variance of X is (or damn well should be) defined as
>
> Var(X) = E(X^2) - (E(X))^2
>
> and this comes to (Sum(X^2) - (Sum(X)/N)^2))/(N-1).


On 12/5/2005 2:25 PM, (Ted Harding) wrote:

> On 05-Dec-05 Martin Maechler wrote:

>> UweL> x <- c(1,2,3,4,5)
>> UweL> n <- length(x)
>> UweL> var(x)*(n-1)/n
>>
>> UweL> if you really want it.
>>
>> It seems Insightful at some point in time have given in to
>> this user request, and S-plus nowadays has
>> an argument "unbiased = TRUE"
>> where the user can choose {to shoot (him/her)self in the leg and}
>> require 'unbiased = FALSE'.
>> {and there's also 'SumSquares = FALSE' which allows to not
>> require any division (by N or N-1)}
>>
>> Since in some ``schools of statistics'' people are really still
>> taught to use a 1/N variance, we could envisage to provide such an
>> argument to var() {and cov()} as well. Otherwise, people define
>> their own variance function such as
>> VAR <- function(x,....) .. N/(N-1)*var(x,...)
>> Should we?

> If people need to do this, such an option would be a convenience,

I don't follow this. I agree with the first line (though I prefer to write it differently), but I don't see how it leads to the second. For example, consider a distribution which is equally likely to be +/- 1, and a sample from it consisting of a single 1 and a single -1. The first formula gives 1 (which is the variance), the second gives 2.
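The two-point example can be checked directly in R. This is only a sketch: `pop_var` and `ss_form` are names made up here for the two formulas in the quoted text, not existing R functions.

```r
# Sample from the example: a single 1 and a single -1
x <- c(1, -1)
N <- length(x)

# First formula: Var(X) = E(X^2) - (E(X))^2,
# with X drawn with equal probability from the values in x
pop_var <- mean(x^2) - mean(x)^2

# Second formula from the quoted text: (Sum(X^2) - (Sum(X)/N)^2) / (N - 1)
ss_form <- (sum(x^2) - (sum(x) / N)^2) / (N - 1)

pop_var   # 1
ss_form   # 2
var(x)    # 2, since var() divides by N - 1
```

For this sample, R's `var()` agrees with the second formula, not with the first.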

The second formula is unbiased because in a random sample I am just as likely to get a 0 from the second formula, but I'm curious about what you mean by "this comes to".
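The unbiasedness point can be illustrated by enumerating every sample, assuming two independent draws from the +/- 1 distribution (an assumption made for this sketch; the message does not spell out the sampling scheme). The names `samples`, `v_unbiased` and `v_biased` are illustrative only.

```r
# All four equally likely samples of size 2 from a distribution on {-1, +1}
samples <- rbind(c(-1, -1), c(-1, 1), c(1, -1), c(1, 1))
n <- ncol(samples)

# Divide-by-(N-1) estimate for each sample (what var() computes):
# two samples give 0, the other two give 2
v_unbiased <- apply(samples, 1, var)

# Divide-by-N estimate for each sample
v_biased <- v_unbiased * (n - 1) / n

mean(v_unbiased)  # 1, equal to the true variance
mean(v_biased)    # 0.5, so the 1/N version underestimates on average
```

Averaged over all samples, the N-1 divisor recovers the true variance of 1 even though no single sample yields exactly 1.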

Duncan

R-devel@r-project.org mailing list

https://stat.ethz.ch/mailman/listinfo/r-devel Received on Tue Dec 06 06:42:31 2005
