Re: [R] Unbalanced Anova: What is the best approach?

From: Krishna Kirti Das <>
Date: Sun, 03 Apr 2011 08:35:30 -0600

Thank you, John.

Yes, your answers do help. For me it's mainly about getting familiar with the "R" way of doing things.

Your response also confirms what I suspected: there is no explicit, widely used user interface (in terms of functions or packages) that represents an unbalanced design in the same way that aov represents a balanced one. Analyzing both balanced and unbalanced data is obviously possible, but whereas what has to be done for a balanced design via aov is intuitive within the language, it is unintuitive for unbalanced designs.
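To make the point about unbalanced designs concrete, here is a minimal sketch in base R. The data frame `d` and its factors `A` and `B` are made up purely for illustration, and the call to Anova() assumes the car package is installed; it is guarded so the rest runs without it. It shows that sequential tests from anova() depend on the order of terms when cell counts are unequal, while tests of each term adjusted for the other do not.

```r
# Hypothetical unbalanced 2 x 2 data (made up for illustration only).
d <- data.frame(
  A = factor(c("a1", "a1", "a1", "a1", "a2", "a2", "a2")),
  B = factor(c("b1", "b1", "b1", "b2", "b1", "b2", "b2")),
  y = c(10, 12, 11, 15, 14, 20, 22)
)

# Cell counts are unequal, so the design is unbalanced.
table(d$A, d$B)

# Sequential (type-I) tests: each term is adjusted only for the terms
# entered before it, so the order of A and B in the formula matters.
fit_ab <- lm(y ~ A + B, data = d)
fit_ba <- lm(y ~ B + A, data = d)
anova(fit_ab)
anova(fit_ba)

# Tests of each main effect adjusted for the other, via base R's drop1().
# For this additive model they do not depend on term order.
drop1(fit_ab, test = "F")

# With the car package installed, type-II tests for the full model
# (including the interaction) are available through Anova().
if (requireNamespace("car", quietly = TRUE)) {
  fit_full <- lm(y ~ A * B, data = d)
  print(car::Anova(fit_full, type = 2))
}
```

In a balanced version of the same data, the two anova() tables would agree, which is why the issue never comes up with aov on balanced designs.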

I did notice that this question has been asked several times in slightly different ways, and I think the lack of an interface that represents an unbalanced design in the same way aov represents balanced designs is why it will probably keep being asked.

I had mentioned nlme and lme4 because I saw in some of the discussions that they were recommended for working with unbalanced designs. Specifying random effects with zero variance, for example, would probably serve my purposes.

Thank you for your help.



On Sun, Apr 3, 2011 at 7:28 AM, John Fox <> wrote:

> Dear Krishna,
> Although it's difficult to explain briefly, I'd argue that balanced and
> unbalanced ANOVA are not fundamentally different, in that the focus should
> be on the hypotheses that are tested, and these are naturally expressed as
> functions of cell means and marginal means. For example, in a two-way
> ANOVA, the null hypothesis of no interaction is equivalent to parallel
> profiles of cell means for one factor across levels of the other. What is
> different, though, is that in a balanced ANOVA all common approaches to
> constructing an ANOVA table coincide.
> Without getting into the explanation in detail (which you can find in a
> text like my Applied Regression Analysis and Generalized Linear Models),
> so-called type-I (or sequential) tests, such as those performed by the
> standard anova() function in R, test hypotheses that are rarely of
> substantive interest, and, even when they are, are of interest only by
> accident. So-called type-II tests, such as those performed by default by
> the Anova() function in the car package, test hypotheses that are almost always
> of interest. Type-III tests, which the Anova() function in car can perform
> optionally, require careful formulation of the model for the hypotheses
> tested to be sensible, and even then have less power than corresponding
> type-II tests in the circumstances in which a test would be of interest.
> Since you're addressing fixed-effects models, I'm not sure why you
> introduced nlme and lme4 into the discussion, but I note that Anova() in
> the car package has methods that can produce type-II and -III Wald tests
> for the fixed effects in mixed models fit by lme() and lmer().
> Your question has been asked several times before on the r-help list. For
> example, if you enter terms like "type-II" or "unbalanced ANOVA" in the
> RSeek search engine and look under the "Support Lists" tab, you'll see many
> hits -- e.g.,
> <M>.
> I hope this helps,
> John
> --------------------------------
> John Fox
> Senator William McMaster
> Professor of Social Statistics
> Department of Sociology
> McMaster University
> Hamilton, Ontario, Canada
> > -----Original Message-----
> > From: []
> > On Behalf Of Krishna Kirti Das
> > Sent: April-03-11 3:25 AM
> > To:
> > Subject: [R] Unbalanced Anova: What is the best approach?
> >
> > I have a three-way unbalanced ANOVA that I need to calculate (fixed
> > effects plus interactions, no random effects). But word has it that aov()
> > is good only for balanced designs. I have seen a number of different
> > recommendations for working with unbalanced designs, but they seem to
> > differ widely (car, nlme, lme4, etc.). So I would like to know what is the
> > best or most usual way to go about working with unbalanced designs and
> > extracting a reliable ANOVA table from them in R?
> >
> > [[alternative HTML version deleted]]
> >
> > ______________________________________________
> > mailing list
> >
> > PLEASE do read the posting guide
> > guide.html
> > and provide commented, minimal, self-contained, reproducible code.

Received on Sun 03 Apr 2011 - 14:53:56 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Sun 03 Apr 2011 - 16:00:26 GMT.
