Date: Tue, 08 May 2007 11:19:16 -0400

To complicate matters, this terminology isn't entirely standard.

**> I'm struggling with weighted least squares, where something
**> that I had assumed to be true appears not to be the case.
**> Take the following data set as an example:
**> df <- data.frame(x = runif(100, 0, 100))
**> df$y <- df$x + 1 + rnorm(100, sd=15)
**>
**> I had expected that:
**>
**> summary(lm(y ~ x, data=df, weights=rep(2, 100)))
**> summary(lm(y ~ x, data=rbind(df,df)))
**>
**> would be equivalent, but they are not. I suspect the
**> difference is in how the degrees of freedom are calculated - I
**> had expected them to be sum(weights), but they seem to be
**> sum(weights > 0). This seems unintuitive to me:
**>
**> summary(lm(y ~ x, data=df, weights=rep(c(0,2), each=50)))
**> summary(lm(y ~ x, data=df, weights=rep(c(0.01,2), each=50)))
**>
**> What am I missing? And what is the usual way to do a linear
**> regression when you have aggregated data?
**>
**> Hadley
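
The discrepancy described above can be checked directly. The following is only a sketch built on the data set from the question; the seed and the object names (fit_w, fit_dup, se_w, se_dup) are my own assumptions, added for reproducibility and readability. It suggests that the two fits agree on the coefficients and differ only in the residual degrees of freedom, which then feeds into the standard errors.

```r
set.seed(1)  # assumed seed, only so the sketch is reproducible

# Data set from the question
df <- data.frame(x = runif(100, 0, 100))
df$y <- df$x + 1 + rnorm(100, sd = 15)

# Constant weight of 2 versus literally duplicating every row
fit_w   <- lm(y ~ x, data = df, weights = rep(2, 100))
fit_dup <- lm(y ~ x, data = rbind(df, df))

# The point estimates agree
print(all.equal(coef(fit_w), coef(fit_dup)))

# The residual degrees of freedom do not:
# lm() counts observations with non-zero weight, not sum(weights)
print(df.residual(fit_w))    # 100 rows - 2 coefficients
print(df.residual(fit_dup))  # 200 rows - 2 coefficients

# The weighted residual sum of squares equals the duplicated-data RSS,
# so the standard errors differ only through the degrees of freedom
rss_w   <- sum(weights(fit_w) * residuals(fit_w)^2)
rss_dup <- sum(residuals(fit_dup)^2)

se_w   <- coef(summary(fit_w))[, "Std. Error"]
se_dup <- coef(summary(fit_dup))[, "Std. Error"]
print(se_w / se_dup)  # compare with sqrt(198 / 98)
```

In other words, lm() appears to treat the weights as precision weights on 100 observations rather than as counts of replicated cases, which is why doubling the weights does not reproduce the summary of the doubled data.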
