From: Douglas Bates <bates_at_stat.wisc.edu>

Date: Thu 21 Apr 2005 - 00:06:32 EST

R-help@stat.math.ethz.ch mailing list

https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Received on Thu Apr 21 00:15:39 2005

michael watson (IAH-C) wrote:

> Hi
>
> I am performing an analysis of variance with two factors, each with two
> levels. I have differing numbers of observations in each of the four
> combinations, but all four combinations *are* present (2 of the factor
> combinations have 3 observations, 1 has 4 and 1 has 5)
>
> I have used both anova(aov(...)) and anova(lm(...)) in R and it gave the
> same result - as expected. I then plugged this into minitab, performed
> what minitab called a General Linear Model (I have to use this in
> minitab as I have an unbalanced data set) and got a different result.
> After a little mining this is because minitab, by default, uses the type
> III adjusted SS. Sure enough, if I changed minitab to use the type I
> sequential SS, I get exactly the same results as aov() and lm() in R.
>
> So which should I use? Type I sequential SS or Type III adjusted SS?
> Minitab help tells me that I would "usually" want to use type III
> adjusted SS, as type I sequential "sums of squares can differ when your
> design is unbalanced" - which mine is. The R functions I am using are
> clearly using the type I sequential SS.

Install the fortunes package and try

> library(fortunes)
> fortune("Venables")

I'm really curious to know why the "two types" of sum of squares are called
"Type I" and "Type III"! This is a very common misconception, particularly
among SAS users who have been fed this nonsense quite often for all their
professional lives. Fortunately the reality is much simpler. There is, by
any sensible reckoning, only ONE type of sum of squares, and it always
represents an improvement sum of squares of the outer (or alternative) model
over the inner (or null hypothesis) model. What the SAS highly dubious
classification of sums of squares does is to encourage users to concentrate
on the null hypothesis model and to forget about the alternative. This is
always a very bad idea and not surprisingly it can lead to nonsensical
tests, as in the test it provides for main effects "even in the presence of
interactions", something which beggars definition, let alone belief.

   -- Bill Venables, R-help (November 2000)

In the words of the master, "there is ... only one type of sum of squares", which is the one that R reports. The others are awkward fictions created for times when one could only afford to fit one or two linear models per week and therefore wanted the output to give results for all possible tests one could conceive, even if the models being tested didn't make sense.
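
As a rough illustration of the "improvement sum of squares" idea, the sketch below fits a nested sequence of models to a made-up unbalanced 2 x 2 data set and checks that each line of the anova() table is the drop in residual sum of squares from an inner model to the next outer one. The factor names A and B, the simulated response y, and the random seed are assumptions for the example only; just the cell counts (3, 3, 4 and 5) are taken from the question.

```r
# Hypothetical unbalanced 2 x 2 layout with cell sizes 3, 3, 4 and 5.
# A, B and the simulated response y are illustrative, not the poster's data.
set.seed(1)
d <- data.frame(
  A = factor(rep(c("a1", "a1", "a2", "a2"), times = c(3, 3, 4, 5))),
  B = factor(rep(c("b1", "b2", "b1", "b2"), times = c(3, 3, 4, 5)))
)
d$y <- rnorm(nrow(d))

# A nested sequence: each model is the "outer" model for the one before it.
m0  <- lm(y ~ 1,     data = d)
mA  <- lm(y ~ A,     data = d)
mAB <- lm(y ~ A + B, data = d)
mI  <- lm(y ~ A * B, data = d)

# The sequential table that R reports for the largest model.
seq_tab <- anova(mI)

# The same sums of squares computed directly as the reduction in residual
# sum of squares between successive inner and outer models.
rss <- sapply(list(m0, mA, mAB, mI), deviance)
improvement <- -diff(rss)

print(seq_tab)
print(improvement)

# An explicit comparison of one inner/outer pair: additive vs. interaction.
print(anova(mAB, mI))
```

Writing the formula as y ~ B * A instead compares a different sequence of models, so with unbalanced data the lines for A and B will generally change while the interaction line stays the same.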

This archive was generated by hypermail 2.1.8: Fri 03 Mar 2006 - 03:31:17 EST