From: John Fox <jfox_at_mcmaster.ca>

Date: Tue 29 Aug 2006 - 08:07:37 EST

R-help@stat.math.ethz.ch mailing list

https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Received on Tue Aug 29 09:26:19 2006

Dear Amasco,

Again, I'll answer briefly (since the written source that I previously mentioned has an extensive discussion):

> -----Original Message-----
> From: r-help-bounces@stat.math.ethz.ch
> [mailto:r-help-bounces@stat.math.ethz.ch] On Behalf Of Amasco
> Miralisus
> Sent: Monday, August 28, 2006 2:21 PM
> To: r-help@stat.math.ethz.ch
> Cc: John Fox; Prof Brian Ripley; Mark Lyman
> Subject: Re: [R] Type II and III sum of square in Anova (R,
> car package)
>
> Hello,
>
> First of all, I would like to thank everybody who answered my
> question. Every post has added something to my knowledge of the topic.
> I now know why Type III SS are so questionable.
>
> As I understood from the R FAQ, there is disagreement among
> statisticians about which SS to use
> (http://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-does-the-output-from-anova_0028_0029-depend-on-the-order-of-factors-in-the-model_003f).
> However, most commercial statistical packages use Type III as
> the default (with orthogonal contrasts), just like STATISTICA,
> from which I am currently trying to migrate to R. This was
> probably done for the convenience of end users who are
> not very experienced in theoretical statistics.

Note that the contrasts are only orthogonal in the row basis of the model matrix, not, with unbalanced data, in the model matrix itself.
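
To make the distinction concrete, here is a minimal sketch in base R. It assumes a hypothetical 3 x 2 design with made-up cell counts; the names cells, n, and unbal are illustrative only. In the row basis (one row per cell), columns of the model matrix that belong to different terms have zero cross-products under contr.sum, but that is no longer the case once the rows are replicated unequally:

```r
# One row per cell of a 3 x 2 design: the "row basis" of the model matrix
cells <- expand.grid(A = factor(1:3), B = factor(1:2))
ctr <- list(A = "contr.sum", B = "contr.sum")
X.basis <- model.matrix(~ A * B, data = cells, contrasts.arg = ctr)

# Unbalanced data: unequal numbers of observations per cell (made-up counts)
n <- c(2, 5, 3, 4, 1, 6)
unbal <- cells[rep(seq_len(nrow(cells)), times = n), ]
X.unbal <- model.matrix(~ A * B, data = unbal, contrasts.arg = ctr)

# Which pairs of columns belong to different terms?
term <- attr(X.basis, "assign")   # 0 = constant, 1 = A, 2 = B, 3 = A:B
between <- outer(term, term, "!=")

# Largest absolute cross-product between columns of different terms
max(abs(crossprod(X.basis)[between]))  # row basis
max(abs(crossprod(X.unbal)[between]))  # full model matrix, unbalanced data
```

The first maximum is zero and the second is not, which is all that the remark above claims.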

> I am aware that the same result could be produced using the standard
> anova() function with Type I "sequential" SS, supplemented by the
> drop1() function, but this approach will look quite
> complicated to persons without any substantial background in
> statistics, like non-math students. I would prefer an easier way,
> possibly more universal, though also probably more "for
> dummies" :) If I am not mistaken, the car package by John Fox, with
> its nice Anova() function, is a reasonable alternative for
> anyone who wishes simply to perform a quick statistical analysis
> without being afraid of messing something up in model fitting. Of
> course, orthogonal contrasts have to be specified (for example,
> contr.sum) in the case of Type III SS.
>
> Therefore, I would like to reformulate my questions, to make
> it easier for you to answer:
>
> 1. The first question relates to the answer by Professor Brian
> Ripley: Did I understand correctly from the advised paper
> (Bill Venables' 'exegeses' paper) that there is not much sense
> in testing main effects if the interaction is significant?

Many are of this opinion. I would put it a bit differently: Properly formulated, tests of main effects in the presence of interactions make sense (i.e., have a straightforward interpretation in terms of population marginal means) but probably are not of interest.
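
As an illustration of what "marginal means" refers to here, the following is a minimal sketch in base R using simulated data; the cell counts, cell means, and object names are all made up. The marginal means of A average the cell means over the levels of B with equal weight, which with unbalanced data is not the same as the raw mean of y at each level of A:

```r
# Simulated, unbalanced 2 x 2 data (cell counts and cell means are made up)
set.seed(123)
counts <- c(9, 2, 3, 6)                 # cells: a1.b1, a2.b1, a1.b2, a2.b2
d <- data.frame(
  A = factor(rep(c("a1", "a2", "a1", "a2"), times = counts)),
  B = factor(rep(c("b1", "b1", "b2", "b2"), times = counts))
)
mu <- c(10, 12, 14, 20)                 # population cell means, A and B interact
d$y <- rnorm(nrow(d), mean = rep(mu, times = counts), sd = 1)

# Sample cell means, laid out as an A x B table
cell.means <- with(d, tapply(y, list(A = A, B = B), mean))

# Estimated marginal means of A: each level of B gets equal weight
marginal.A <- rowMeans(cell.means)

# Raw level means of A: implicitly weighted by the unequal cell counts
raw.A <- with(d, tapply(y, A, mean))

cell.means
marginal.A
raw.A
```

The marginal means can be formed whether or not the profiles of cell means are parallel; whether their equality is an interesting hypothesis when they are not parallel is the separate question raised above.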

> 2. If I understood the post by John Fox correctly, I could safely use
> the Anova(.,type="III") function from car for ANOVA analyses in
> R, both for balanced and unbalanced designs? Of course,
> provided the model was fitted with orthogonal contrasts.
> Something like below:
> mod <- aov(response ~ factor1 * factor2, data=mydata,
>            contrasts=list(factor1=contr.sum,
>                           factor2=contr.sum))
> Anova(mod, type="III")

Yes (or you could reset the contrasts option), but why do you appear to prefer the "type-III" tests to the "type-II" tests?
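
A minimal sketch of the two ways of setting the contrasts, assuming that the car package is installed. The data frame mydata is simulated here purely as a stand-in (made-up cell counts and effects), and the model names mod.a, mod.b, and mod.c are hypothetical. The fit with the default treatment contrasts is included only for comparison:

```r
library(car)  # provides Anova(); assumed to be installed

# Simulated unbalanced data standing in for 'mydata' (hypothetical)
cells <- expand.grid(factor1 = factor(c("a", "b", "c")),
                     factor2 = factor(c("x", "y")))
counts <- c(3, 7, 5, 8, 4, 6)  # made-up cell counts
mydata <- cells[rep(seq_len(nrow(cells)), times = counts), ]
set.seed(42)
mydata$response <- rnorm(nrow(mydata)) +
  as.numeric(mydata$factor1) * (mydata$factor2 == "y")

# (a) Contrasts set for this model only
mod.a <- aov(response ~ factor1 * factor2, data = mydata,
             contrasts = list(factor1 = contr.sum, factor2 = contr.sum))

# (b) Contrasts option reset globally, then restored after fitting
old <- options(contrasts = c("contr.sum", "contr.poly"))
mod.b <- aov(response ~ factor1 * factor2, data = mydata)
options(old)

# (c) Default treatment contrasts, for comparison only
mod.c <- aov(response ~ factor1 * factor2, data = mydata)

Anova(mod.a, type = "III")
Anova(mod.b, type = "III")  # same table as (a)
Anova(mod.c, type = "III")  # factor1 and factor2 rows differ from (a)
```

Resetting the option affects every model fitted afterwards, so restoring the old value, as in (b), or setting the contrasts per model, as in (a), avoids surprises later in a session.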

> It was also said in most of your posts that the decision about
> which type of SS to use has to be made on the basis of the
> hypothesis we want to test. Therefore, let's assume that I
> would like to test the significance of both factors, and if
> some of them are significant, I plan to use post-hoc tests to
> explore difference(s) between the levels of the significant factor(s).

Your statement is too vague to imply what kind of tests you should use. I think that people are almost always interested in "main effects" when interactions to which they are marginal are negligible. In this situation, both "type-II" and "type-III" tests are appropriate, and "type-II" tests would usually be more powerful.
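
For concreteness, a minimal sketch of obtaining both sets of tests, assuming that the car package is installed and using simulated data with additive effects only (the cell counts, effect sizes, and the data frame mydata are made up). A single data set cannot show relative power; the sketch only shows how the two tables are produced and what the type-II sum of squares for a main effect corresponds to:

```r
library(car)  # provides Anova(); assumed to be installed

# Simulated unbalanced data with additive effects only (no interaction)
cells <- expand.grid(factor1 = factor(c("a", "b", "c")),
                     factor2 = factor(c("x", "y")))
counts <- c(3, 7, 5, 8, 4, 6)  # made-up cell counts
mydata <- cells[rep(seq_len(nrow(cells)), times = counts), ]
set.seed(2006)
mydata$response <- rnorm(nrow(mydata)) +
  0.8 * as.numeric(mydata$factor1) + 0.5 * (mydata$factor2 == "y")

mod <- lm(response ~ factor1 * factor2, data = mydata,
          contrasts = list(factor1 = contr.sum, factor2 = contr.sum))

type2 <- Anova(mod, type = "II")
type3 <- Anova(mod, type = "III")
type2
type3

# The type-II SS for factor1 is the drop in residual SS from adding factor1
# to a model that already contains factor2 but no interaction
ss2.factor1 <- deviance(lm(response ~ factor2, data = mydata)) -
  deviance(lm(response ~ factor1 + factor2, data = mydata))
all.equal(ss2.factor1, type2["factor1", "Sum Sq"])
```

The rows for the interaction agree in the two tables, since no term in the model is of higher order than the interaction.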

Regards,

John

> Thank you in advance,
> Amasco
>
> On 8/27/06, John Fox <jfox@mcmaster.ca> wrote:
> > Dear Amasco,
> >
> > A complete explanation of the issues that you raise is awkward in an
> > email, so I'll address your questions briefly. Section 8.2 of my text,
> > Applied Regression Analysis, Linear Models, and Related Methods (Sage,
> > 1997) has a detailed discussion.
> >
> > (1) In balanced designs, so-called "Type I," "II," and "III" sums of
> > squares are identical. If the STATISTICA help says that Type II tests
> > are only appropriate in balanced designs, then that doesn't make a
> > whole lot of sense (unless one believes that Type-II tests are
> > nonsense, which is not the case).
> >
> > (2) One should concentrate not directly on different "types" of sums
> > of squares, but on the hypotheses to be tested. Sums of squares and
> > F-tests should follow from the hypotheses. Type-II and Type-III tests
> > (if the latter are properly formulated) test hypotheses that are
> > reasonably construed as tests of main effects and interactions in
> > unbalanced designs. In unbalanced designs, Type-I sums of squares
> > usually test hypotheses of interest only by accident.
> >
> > (3) Type-II sums of squares are constructed obeying the principle of
> > marginality, so the kinds of contrasts employed to represent factors
> > are irrelevant to the sums of squares produced. You get the same
> > answer for any full set of contrasts for each factor. In general, the
> > hypotheses tested assume that terms to which a particular term is
> > marginal are zero. So, for example, in a three-way ANOVA with factors
> > A, B, and C, the Type-II test for the AB interaction assumes that the
> > ABC interaction is absent, and the test for the A main effect assumes
> > that the ABC, AB, and AC interactions are absent (but not necessarily
> > the BC interaction, since the A main effect is not marginal to this
> > term). A general justification is that we're usually not interested,
> > e.g., in a main effect that's marginal to a nonzero interaction.
> >
> > (4) Type-III tests do not assume that terms higher-order to the term
> > in question are zero. For example, in a two-way design with factors A
> > and B, the type-III test for the A main effect tests whether the
> > population marginal means at the levels of A (i.e., averaged across
> > the levels of B) are the same. One can test this hypothesis whether or
> > not A and B interact, since the marginal means can be formed whether
> > or not the profiles of means for A within levels of B are parallel.
> > Whether the hypothesis is of interest in the presence of interaction
> > is another matter, however. To compute Type-III tests using
> > incremental F-tests, one needs contrasts that are orthogonal in the
> > row-basis of the model matrix. In R, this means, e.g., using
> > contr.sum, contr.helmert, or contr.poly (all of which will give you
> > the same SS), but not contr.treatment. Failing to be careful here will
> > result in testing hypotheses that are not reasonably construed, e.g.,
> > as hypotheses concerning main effects.
> >
> > (5) The same considerations apply to linear models that include
> > quantitative predictors -- e.g., ANCOVA. Most software will not
> > automatically produce sensible Type-III tests, however.
> >
> > I hope this helps,
> > John
> >
> > --------------------------------
> > John Fox
> > Department of Sociology
> > McMaster University
> > Hamilton, Ontario
> > Canada L8S 4M4
> > 905-525-9140x23604
> > http://socserv.mcmaster.ca/jfox
> > --------------------------------
> >
> > > -----Original Message-----
> > > From: r-help-bounces@stat.math.ethz.ch
> > > [mailto:r-help-bounces@stat.math.ethz.ch] On Behalf Of Amasco
> > > Miralisus
> > > Sent: Saturday, August 26, 2006 5:07 PM
> > > To: r-help@stat.math.ethz.ch
> > > Subject: [R] Type II and III sum of square in Anova (R, car package)
> > >
> > > Hello everybody,
> > >
> > > I have some questions on ANOVA in general and on ANOVA in R
> > > particularly.
> > > I am not a statistician, so I would appreciate it very much if you
> > > answered in a simple way.
> > >
> > > 1. First of all, a more general question. The standard anova()
> > > function for lm() or aov() models in R implements Type I sums of
> > > squares (sequential), which are not well suited to unbalanced ANOVA.
> > > Therefore it is better to use the Anova() function from the car
> > > package, which was programmed by John Fox to use Type II and Type III
> > > sums of squares. Did I get the point?
> > >
> > > 2. Now a more specific question. Type II sums of squares are not well
> > > suited to unbalanced ANOVA designs either (as stated in the STATISTICA
> > > help), so the general rule of thumb is to use the Anova()
> > > function with Type II SS only for balanced ANOVA and the Anova()
> > > function with Type III SS for unbalanced ANOVA?
> > > Is this a correct interpretation?
> > >
> > > 3. I have found a post from John Fox in which he wrote that Type III
> > > SS could be misleading in case someone uses some contrasts. What is
> > > this about?
> > > Could you please advise when it is appropriate to use Type II and
> > > when Type III SS? I do not use contrasts for comparisons, just
> > > general ANOVA with subsequent Tukey post-hoc comparisons.
> > >
> > > Thank you in advance,
> > > Amasco

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.

Archive generated by hypermail 2.1.8, at Tue 29 Aug 2006 - 10:25:18 EST.
