[R] Re: R-help Digest, Vol 18, Issue 12

From: John Maindonald <john.maindonald_at_anu.edu.au>
Date: Thu 12 Aug 2004 - 22:49:31 EST

The message for aov1 was "Estimated effects <may> be unbalanced". The effects are not unbalanced. The design is 'orthogonal'.

The problem is that there are not enough degrees of freedom to estimate all those error terms. If you change the model to:

   aov1 <-
aov(RT~fact1*fact2*fact3+Error(sub/(fact1+fact2+fact3)),data=myData)

or to

   aov2 <-
aov(RT~fact1*fact2*fact3+Error(sub/
((fact1+fact2+fact3)^2)),data=myData)

all is well. This last model (aov2) seems to me to have an excessive number of error terms.

The lme model lme(RT~fact1*fact2*fact3, random=~1|sub, data=myData) is equivalent to aov0 <- aov(RT~fact1*fact2*fact3+Error(sub), data=myData)
It can be verified that the estimated variance components can be used to reproduce the mean squares in the anova table.

A conservative approach is to estimate e.g. contrasts involving fact1 for each subject separately, then comparing these against SE estimates that have 4df (5 subjects -1). This is the ultimate check -- are claimed effects consistent against the 5 subjects? The standard errors should though, probably be based on some variety of averaged variance.

I do not know how to set up the equivalents of aov1 and aov2 (where the errors are indeed crossed) in lme4.

John Maindonald.

On 12 Aug 2004, at 8:03 PM, r-help-request@stat.math.ethz.ch wrote:

> I will follow the suggestion of John Maindonald and present the
> problem by example with some data.
>
> I also follow the advice to use mean scores, somewhat reluctantly
> though. I know it is common practice in psychology, but wouldn’t it be
> more elegant if one could use all the data points in an analysis?
>
> The data for 5 subjects (myData) are provided at the bottom of this
> message. It is a crossed within-subject design with three factors and
> reaction time (RT) as the dependent variable.
>
> An initial repeated-measures model would be:
> aov1<-aov(RT~fact1*fact2*fact3+Error(sub/
> (fact1*fact2*fact3)),data=myData)
>
> Aov complains that the effects involving fact3 are unbalanced:
> > aov1
> …
> Stratum 4: sub:fact3
> Terms:
> fact3 Residuals
> Sum of Squares 4.81639e-07 5.08419e-08
> Deg. of Freedom 2 8
>
> Residual standard error: 7.971972e-05
>
> 6 out of 8 effects not estimable
> Estimated effects may be unbalanced
> …
> Presumably this is because fact3 has three levels and the other ones
> only two, making this a ‘non-orthogonal’ design.
>
> I then fit an equivalent mixed-effects model:
> lme1<-lme(RT~fact1*fact2*fact3,data=meanData2,random=~1|sub)
>
> Subsequently I test if my factors had any effect on RT:
> > anova(lme1)
> numDF denDF F-value p-value
>
> (Intercept) 1 44 105294024 <.0001
> fact1 1 44 22 <.0001
> fact2 1 44 7 0.0090
> fact3 2 44 19 <.0001
> fact1:fact2 1 44 9 0.0047
> fact1:fact3 2 44 1 0.4436
> fact2:fact3 2 44 1 0.2458
> fact1:fact2:fact3 2 44 0 0.6660
>
> Firstly, why are the F-values in the output whole numbers?
>
> The effects are estimated over the whole range of the dataset and this
> is so because all three factors are nested under subjects, on the same
> level. This, however, seems to inflate the F-values compared to the
> anova(aov1) output, e.g.
> > anova(aov1)
> …
> Error: sub:fact2
> Df Sum Sq Mean Sq F value Pr(>F)
> fact2 1 9.2289e-08 9.2289e-08 2.2524 0.2078
> Residuals 4 1.6390e-07 4.0974e-08
> …
>
> I realize that the (probably faulty) aov model may not be directly
> compared to the lme model, but my concern is if the lme estimation of
> the effects is right, and if so, how can a naďve skeptic be convinced
> of this?
>
> The suggestion to use simulate.lme() to find this out seems good, but
> I can’t seem to get it working (from "[R] lme: simulate.lme in R" it
> seems it may never work in R).
>
> I have also followed the suggestion to fit the exact same model with
> lme4. However, format of the anova output does not give me the
> estimation in the way nlme does. More importantly, the degrees of
> freedom in the denominator don’t change, they’re still large:
> > library(lme4)
> > lme4_1<-lme(RT~fact1*fact2*fact3,random=~1|sub,data=myData)
> > anova(lme4_1)
> Analysis of Variance Table
>
> Df Sum Sq Mean Sq Denom F value Pr(>F)
> fact1I 1 2.709e-07 2.709e-07 48 21.9205 2.360e-05
> ***
> fact2I 1 9.229e-08 9.229e-08 48 7.4665 0.008772 **
> fact3L 1 4.906e-08 4.906e-08 48 3.9691 0.052047 .
> fact3M 1 4.326e-07 4.326e-07 48 34.9972 3.370e-07 ***
> fact1I:fact2I 1 1.095e-07 1.095e-07 48 8.8619 0.004552 **
> fact1I:fact3L 1 8.988e-10 8.988e-10 48 0.0727 0.788577
> fact1I:fact3M 1 1.957e-08 1.957e-08 48 1.5834 0.214351
> fact2I:fact3L 1 3.741e-09 3.741e-09 48 0.3027 0.584749
> fact2I:fact3M 1 3.207e-08 3.207e-08 48 2.5949 0.113767
> fact1I:fact2I:fact3L 1 2.785e-09 2.785e-09 48 0.2253 0.637162
> fact1I:fact2I:fact3M 1 7.357e-09 7.357e-09 48 0.5952 0.444206
> ---
> I hope that by providing a sample of the data someone can help me out
> on the questions I asked in my previous mail:
>
John Maindonald email: john.maindonald@anu.edu.au phone : +61 2 (6125)3473 fax : +61 2(6125)5549 Centre for Bioinformation Science, Room 1194, John Dedman Mathematical Sciences Building (Building 27) Australian National University, Canberra ACT 0200.



R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html Received on Thu Aug 12 22:55:54 2004

This archive was generated by hypermail 2.1.8 : Wed 03 Nov 2004 - 22:55:41 EST