From: Berwin A Turlach <berwin_at_maths.uwa.edu.au>

Date: Fri, 8 Feb 2008 14:08:25 +0800

G'day Brian,

On Thu, 07 Feb 2008 17:56:07 -0500

Brian McGill <brian.mcgill_at_mcgill.ca> wrote:

> I am playing with a 1-way anova with and without the "-1" option.
>
> [...]
>
> From what I can tell:
> 1) the estimated means of the different levels are correctly
> estimated either way (although reported as means with the -1 and as
> contrasts without the -1 as expected)
> 2) the residuals are identical (in this contrived example they differ
> slightly due to numeric instability but in a more real-world example
> they truly are identical)
> 3) BUT the r2/F/p-value are different (in my real-world example they
> are drastically different)
>
> How can a model that gets the same parameter estimates on the same
> data leading to the same residuals get different r2/F/p-value?

In an R session, type "help(summary.lm)" and, under the section "Value", read about the way r.squared is calculated.

Note that if your model specifies a "-1" (or a "+0"), then the model is assumed to have no intercept. More precisely, the null model (i.e. the model when all covariates are removed) is then taken to be "0 + noise" rather than "mu + noise".

This should explain the differences that you see.
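To make the difference concrete, here is a minimal sketch. The data are hypothetical: the original data were elided from the quoted message, so y and x below were chosen only to be consistent with the summary output shown later in this message. The names fm_int, fm_noint, r2_mean_null and r2_zero_null are likewise made up for illustration.

```r
# Hypothetical data: two groups of three observations with means 1 and 2
y <- c(1.1, 1.0, 0.9, 2.0, 2.1, 1.9)
x <- factor(rep(1:2, each = 3))

fm_int   <- lm(y ~ x)      # formula with intercept
fm_noint <- lm(y ~ x - 1)  # same fit, cell-means parametrisation

# Both parametrisations give the same residual sum of squares
rss <- sum(residuals(fm_noint)^2)

# Null model "mu + noise": total sum of squares about the mean
r2_mean_null <- 1 - rss / sum((y - mean(y))^2)   # about 0.974

# Null model "0 + noise": total sum of squares about zero
r2_zero_null <- 1 - rss / sum(y^2)               # about 0.997

# These agree with what summary.lm reports for the two formulas
all.equal(r2_mean_null, summary(fm_int)$r.squared)
all.equal(r2_zero_null, summary(fm_noint)$r.squared)
```

The residuals are the same in both fits; only the baseline against which they are compared changes.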

R does not check whether a vector of 1s is in the column space of the design matrix and, consequently, does not base the decision on whether the model has an intercept (i.e. whether the null model should be "mu + noise" or "0 + noise") on such a check. It looks only at the formula.
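For what it is worth, such a check is easy to sketch by hand. The helper below, has_ones_in_colspace, is a hypothetical name (it is not part of R), and the data are made up for illustration. It compares the rank of the design matrix with and without an appended column of 1s, relying on the default tolerance of qr().

```r
# Hypothetical helper: TRUE if a vector of 1s lies in the column space of X
has_ones_in_colspace <- function(X) {
  # Appending a column of 1s leaves the rank unchanged exactly when
  # that column is already a linear combination of the columns of X
  qr(cbind(1, X))$rank == qr(X)$rank
}

# Hypothetical data
y <- c(1.1, 1.0, 0.9, 2.0, 2.1, 1.9)
x <- factor(rep(1:2, each = 3))   # two-level factor
z <- 1:6                          # numeric covariate

fm_factor  <- lm(y ~ x - 1)   # indicator columns sum to a vector of 1s
fm_numeric <- lm(y ~ z - 1)   # regression through the origin

in_factor  <- has_ones_in_colspace(model.matrix(fm_factor))    # TRUE
in_numeric <- has_ones_in_colspace(model.matrix(fm_numeric))   # FALSE

# What summary.lm actually consults: 0 for both fits
attr(terms(fm_factor), "intercept")
attr(terms(fm_numeric), "intercept")
```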

It is a design issue, presumably to be compatible with S and, obviously, it also makes implementation of summary.lm easier, I guess. :)

Occasionally, I use "-1" in the formula to construct a specific design matrix and, hence, get estimates and standard errors for quantities that are of interest. It is a bit annoying to have to refit the model without the "-1" to get the r2/F that I want, but one has to live with it. I would prefer that the decision on whether the null model has an intercept were based on whether a vector of 1s is in the column space of the design matrix, not on whether the formula has a "-1" in it. But I can also see the argument for the alternative preference.
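One way to get the "mu + noise" comparison without rewriting the formula of interest is to fit the mean-only model separately and compare the two with anova(). This is only a sketch on hypothetical data (chosen to match the summary output in this message); fm_null, cmp, f_stat and r2 are illustrative names.

```r
# Hypothetical data
y <- c(1.1, 1.0, 0.9, 2.0, 2.1, 1.9)
x <- factor(rep(1:2, each = 3))

fm1     <- lm(y ~ x - 1)  # model of interest, fitted with "-1"
fm_null <- lm(y ~ 1)      # null model "mu + noise"

# F test of the mean-only model against the cell-means model
cmp    <- anova(fm_null, fm1)
f_stat <- cmp$F[2]           # 150 on 1 and 4 DF for these data
p_val  <- cmp$`Pr(>F)`[2]

# R-squared relative to the mean-only model; deviance() on an lm fit
# returns the residual sum of squares
r2 <- 1 - deviance(fm1) / deviance(fm_null)   # about 0.974
```

This still fits a second model, but the second model is trivial and fm1 keeps the parametrisation one wanted in the first place.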

But I just noticed, if you so wish, you could be subversive:

> fm1 <- lm(y ~ x - 1)
> attr(fm1$terms, "intercept") <- 1
> summary(fm1)

Call:
lm(formula = y ~ x - 1)

Residuals:
         1          2          3          4          5          6
 1.000e-01  3.608e-16 -1.000e-01 -3.469e-18  1.000e-01 -1.000e-01

Coefficients:
   Estimate Std. Error t value Pr(>|t|)
x1  1.00000    0.05774   17.32 6.52e-05 ***
x2  2.00000    0.05774   34.64 4.14e-06 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.1 on 4 degrees of freedom
Multiple R-Squared: 0.974,     Adjusted R-squared: 0.9675
F-statistic:   150 on 1 and 4 DF,  p-value: 0.0002552

Hope this helps.

Cheers,

        Berwin

=========================== Full address =============================
Berwin A Turlach                            Tel.: +65 6516 4416 (secr)
Dept of Statistics and Applied Probability        +65 6516 6650 (self)
Faculty of Science                          FAX : +65 6872 3919
National University of Singapore
6 Science Drive 2, Blk S16, Level 7          e-mail: statba_at_nus.edu.sg
Singapore 117546                    http://www.stat.nus.edu.sg/~statba

______________________________________________
R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Received on Fri 08 Feb 2008 - 06:20:41 GMT
