# Re: [R] Don't understand removing constant on 1-way ANOVA

From: Berwin A Turlach <berwin_at_maths.uwa.edu.au>
Date: Fri, 8 Feb 2008 14:08:25 +0800

G'day Brian,

On Thu, 07 Feb 2008 17:56:07 -0500
Brian McGill <brian.mcgill_at_mcgill.ca> wrote:

> I am playing with the a 1-way anova with and without the "-1" option.
>
> [...]
>
> From what I can tell:
> 1) the estimated means of the different levels are correctly
> estimated either way (although reported as means with the -1 and as
> contrasts without the -1 as expected)
> 2) the residuals are identical (in this contrived example they differ
> slightly due to numeric instability but in a more real-world example
> they truly are identical)
> 3) BUT the r2/F/p-value are different (in my real-world example they
> are drastically different)
>
> How can a model that gets the same parameter estimates on the same
> data leading to the same residuals get different r2/F/p-value?

In an R session, type "help(summary.lm)" and, under the section "Value", read about the way r.squared is calculated.

Note that if your model specifies a "-1" (or a "+0"), then the model is assumed to have no intercept. Or, rather, that the null model (i.e. the model when all covariates are removed) is a response of 0 + noise.

This should explain the differences that you see.

R does not check whether a vector of 1s is in the column space of the design matrix and, of course, does not base the decision on whether the model has an intercept or not (i.e. whether the null model should be "mu + noise" or "0 + noise) on this not-performed test.

It is a design issue, presumably to be compatible to S and, obviously, it also makes implementation of summary.lm easier, I guess. :)

Occasionally, I use "-1" in the formula to construct a specific design matrix and, hence, get estimates and standard errors for quantities that are of interest. It is a bit annoying to have to refit the model with out the "-1" to get the r2/F that I want, but one has to live with it. I would prefer that the decision on whether the null model has an intercept or not would be based on whether a vector of 1s is in the column space of the design matrix and not on whether the formula has a "-1" on it. But I can also see the argument for the alternative preference.

But I just noticed, if you so wish, you could be subversive:

> fm1<-lm(y~x-1)
> attr(fm1\$terms, "intercept") <- 1
> summary(fm1)

Call:
lm(formula = y ~ x - 1)

Residuals:

1 2 3 4 5 6  1.000e-01 3.608e-16 -1.000e-01 -3.469e-18 1.000e-01 -1.000e-01

Coefficients:

Estimate Std. Error t value Pr(>|t|) x1 1.00000 0.05774 17.32 6.52e-05 *** x2 2.00000 0.05774 34.64 4.14e-06 ***

```---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.1 on 4 degrees of freedom
Multiple R-Squared: 0.974,	Adjusted R-squared: 0.9675
F-statistic:   150 on 1 and 4 DF,  p-value: 0.0002552

Hope this helps.

Cheers,

Berwin

Berwin A Turlach                            Tel.: +65 6516 4416 (secr)
Dept of Statistics and Applied Probability        +65 6516 6650 (self)
Faculty of Science                          FAX : +65 6872 3919
National University of Singapore
6 Science Drive 2, Blk S16, Level 7          e-mail: statba_at_nus.edu.sg
Singapore 117546                    http://www.stat.nus.edu.sg/~statba

______________________________________________
R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help