# Re: [R] between-within anova: aov and lme

From: Spencer Graves <spencer.graves_at_pdf.com>
Date: Sat 12 Aug 2006 - 20:28:26 EST

To understand why this works, you need to understand the math in a more general formulation. Ordinary least squares can be written in matrix / vector notation as follows:

y = X %*% b + e,

where y and e are N x 1 vectors, X is an N x k matrix, and b is a k x 1 vector. In this formulation, e follows a multivariate normal distribution with mean 0 and covariance = s.e^2 times the N x N identity matrix.
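As a rough illustration of this formulation, the sketch below builds X and y from simulated data and solves the normal equations directly. The sample size, the number of columns, and the coefficient values are arbitrary assumptions, not taken from any data set in this thread.

```r
# Simulated data; N, k and the coefficient values are arbitrary choices.
set.seed(1)
N <- 50
X <- cbind(1, rnorm(N), rnorm(N))      # N x k design matrix (k = 3)
b_true <- c(2, -1, 0.5)
y <- as.vector(X %*% b_true) + rnorm(N, sd = 0.3)

# Least-squares estimate from the normal equations t(X) X b = t(X) y
b_ols <- solve(crossprod(X), crossprod(X, y))

# Same fit via lm(); "- 1" because X already contains the intercept column
b_lm <- coef(lm(y ~ X - 1))

cbind(normal_eq = as.vector(b_ols), lm = unname(b_lm))
```

The two columns should agree up to floating-point error.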

For mixed effects, e is assumed to follow a multivariate normal distribution with a more general variance-covariance structure, specified in various ways as discussed in Pinheiro and Bates (2000) Mixed-Effects Models in S and S-Plus (Springer). If e ~ N(0, W), then the maximum likelihood estimates for "b" in the above model can be written as follows:

b = inv(t(X) %*% solve(W, X)) %*% t(X) %*% solve(W, y).
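Here is a minimal sketch of the generalized least-squares estimate of b when W is known. The AR(1)-like form of W and all numeric values are assumptions made only for the illustration. As a cross-check, the same estimate is computed by "whitening" the data with a Cholesky factor of W and then applying ordinary least squares.

```r
set.seed(2)
N <- 40
X <- cbind(1, rnorm(N))

# Assumed covariance for the illustration: W[i, j] = 0.6^|i - j|
W <- 0.6^abs(outer(1:N, 1:N, "-"))
L <- t(chol(W))                         # lower triangular, W = L %*% t(L)
y <- as.vector(X %*% c(1, 2) + L %*% rnorm(N))

# GLS (ML for known W): b = inv(t(X) inv(W) X) t(X) inv(W) y
b_gls <- solve(t(X) %*% solve(W, X), t(X) %*% solve(W, y))

# Cross-check: transform by inv(L), then ordinary least squares
Xw <- forwardsolve(L, X)
yw <- forwardsolve(L, y)
b_white <- solve(crossprod(Xw), crossprod(Xw, yw))

cbind(gls = as.vector(b_gls), whitened = as.vector(b_white))
```

In 'lme', W is not known and its parameters are estimated from the data, so this only shows the step that is conditional on W.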

As explained by Pinheiro and Bates, we estimate the fixed effects, "b", using maximum likelihood (ML) and parameters in "W" using "restricted maximum likelihood (REML)".

The standard analysis of variance is then obtained from the "likelihood ratio" for nested models. In certain special cases, a monotonic transformation of a likelihood ratio follows an F distribution with degrees of freedom computed from the ranks of various matrices. The approach provides a unified way of analyzing data with mixed effects that does not care whether the design is balanced or not.
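One of those special cases is the ordinary Gaussian linear model, where the link between the likelihood ratio and F can be checked directly. The sketch below uses simulated data with hypothetical variable names (x1, x2) and plain 'lm' fits rather than mixed models. It relies on the fact that, for such models, exp(2 * log_LR / N) equals the ratio of the two residual sums of squares.

```r
set.seed(3)
N <- 60
d <- data.frame(x1 = rnorm(N), x2 = rnorm(N))
d$y <- 1 + 0.5 * d$x1 + rnorm(N)

fit0 <- lm(y ~ x1, data = d)            # smaller model
fit1 <- lm(y ~ x1 + x2, data = d)       # larger model; fit0 is nested in it

# Log likelihood ratio from the two maximized likelihoods
log_LR <- as.numeric(logLik(fit1)) - as.numeric(logLik(fit0))

q   <- 1                                # extra parameters in fit1
df2 <- df.residual(fit1)                # N minus number of coefficients in fit1

# Monotonic transformation of the likelihood ratio into an F statistic
F_from_LR <- (exp(2 * log_LR / N) - 1) * df2 / q
F_anova   <- anova(fit0, fit1)$F[2]

c(from_LR = F_from_LR, anova = F_anova)
```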

Analyses following this method may not always give the same answers as textbooks that discuss standard balanced designs. However, I'm not prepared to discuss that.

Hope this helps.
Spencer Graves

##############################################
William Simpson wrote:
> Hi Spencer
>
>> 'lme' is smart enough to figure out from the data whether a factor is
>> 'between' or 'within' or partially one or the other. This allows you
>> to avoid worrying about that during data analysis -- except as a check on
>> factor coding.
> Just to check Spencer, the following lme() statement:
> lme(y~a*b*c,random=~1|s, data=d)
> will work for any combination of a,b,c as between or within factors.
> At one extreme a,b,c could all be between subjects, at the other extreme
> a,b,c could all be within subjects, and any other combo of between/within.
>
> That is a bit mind-bending. So far as lme is concerned all that matters
> is that s is a random effect. It will probably be difficult to convince
> experimental psychologists who consider themselves to be experts in the
> statistical analysis of experiments.
>
> Cheers
> Bill
>
#################################
following with 'lme':

lme(response~A*B*C,random=~1|subject)

This assumes that A, B, and C are fixed effects, either continuous variables or factors present at only a very few levels whose effects are not reasonably modeled as a random sample from some other distribution. It also assumes that the effect of each level of subject can be reasonably modeled as a random adjustment to the intercept following a common distribution with mean 0 and variance = 'var.subj'.
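A minimal sketch of this model on simulated data follows. The design (10 subjects, A varying between subjects, B and C varying within subjects) and all effect sizes are hypothetical and chosen only to have something to fit. It assumes the 'nlme' package is installed, as it is with a standard R distribution.

```r
library(nlme)

set.seed(4)
# Hypothetical balanced design: 10 subjects, A between; B and C within
d <- expand.grid(subject = factor(1:10),
                 B = factor(c("b1", "b2")),
                 C = factor(c("c1", "c2")))
d$A <- factor(ifelse(as.integer(d$subject) <= 5, "a1", "a2"))

subj_eff <- rnorm(10, sd = 1)           # random intercept for each subject
d$response <- 10 + (d$A == "a2") + 0.5 * (d$B == "b2") +
  subj_eff[as.integer(d$subject)] + rnorm(nrow(d), sd = 0.5)

fit <- lme(response ~ A * B * C, random = ~ 1 | subject, data = d)

tab <- anova(fit)
tab[, c("numDF", "denDF")]   # A is expected to get fewer denominator df than B or C
VarCorr(fit)                 # estimates of var.subj and the residual variance
```

Nothing in the call says which factors are between or within subjects. The denominator degrees of freedom in the table are worked out from how each factor varies within levels of subject, which is also a useful check on the factor coding.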

The function 'aov' is old and mostly obsoleted by 'lme' in the 'nlme' package. There may be things that can be done in 'aov' that cannot be done more or less as easily, and usually better and more generally, with 'lme', but I'm not familiar with such cases.
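For a balanced design the two functions can be compared side by side. The sketch below uses simulated data (12 subjects, hypothetical effect sizes) and compares the F statistic for the between-subjects factor A from 'aov' with Error(subject) against the one from 'lme' with a random intercept per subject. In this balanced case, with a positive estimated subject variance, the two should agree closely; that is not expected to hold in general for unbalanced data.

```r
library(nlme)

set.seed(5)
# Hypothetical balanced design: A between subjects, B and C within
d <- expand.grid(subject = factor(1:12), B = factor(1:2), C = factor(1:3))
d$A <- factor(ifelse(as.integer(d$subject) <= 6, "a1", "a2"))
d$response <- rnorm(12)[as.integer(d$subject)] +    # subject effect
  (d$A == "a2") + rnorm(nrow(d), sd = 0.5)

fit_aov <- aov(response ~ A * B * C + Error(subject), data = d)
fit_lme <- lme(response ~ A * B * C, random = ~ 1 | subject, data = d)

# F for A: subject stratum of the aov fit vs. the sequential lme anova table
F_aov <- summary(fit_aov)[["Error: subject"]][[1]][["F value"]][1]
F_lme <- anova(fit_lme)["A", "F-value"]

c(aov = F_aov, lme = F_lme)
```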

Your question suggests you may not be familiar with Pinheiro and Bates (2000) Mixed-Effects Models in S and S-Plus (Springer). The standard R distribution comes with a directory "~library\nlme\scripts" containing script files 'ch01.R', 'ch02.R', ..., 'ch06.R', and 'ch08.R'. These files contain the R code for each chapter in the book. I've learned a lot from walking through the script files line by line while reviewing the corresponding text in the book. Doing so protects me from problems with silly typographical errors as well as subtle problems where the S-Plus syntax in the book gives a different answer in R because of the few differences in syntax between S-Plus and R.
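One way to find that directory without knowing where R is installed is sketched below. It assumes the 'nlme' package is installed; 'system.file' returns an empty string if the directory is not found.

```r
# Locate the chapter scripts shipped with the installed 'nlme' package
scripts_dir <- system.file("scripts", package = "nlme")
list.files(scripts_dir)

# Peek at the first lines of one chapter script before stepping through it
ch1 <- file.path(scripts_dir, "ch01.R")
if (file.exists(ch1)) head(readLines(ch1), 10)
```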

Hope this helps.
Spencer Graves

William Simpson wrote:

> I have 2 questions on ANOVA with 1 between subjects factor and 2 within factors.
>
> 1. I am confused on how to do the analysis with aov because I have seen two examples
> on the web with different solutions.
>
> a) Jon Baron (http://www.psych.upenn.edu/~baron/rpsych/rpsych.html) does
> 6.8.5 Example 5: Stevens pp. 468 - 474 (one between, two within)
>
> between: gp
> within: drug, dose
> aov(effect ~ gp * drug * dose + Error(subj/(dose*drug)), data=Ela.uni)
>
> b) Bill Venables answered a question on R help as follows.
>
> - factor A between subjects
> - factors B*C within subjects.
>
> aov(response ~ A*B*C + Error(subject), Kirk)
> "An alternative formula would be response ~ A/(B*C) + Error(subject), which
> would only change things by grouping together some of the sums of squares."
>
> -------------------------------------------------------
> SO: which should I do????
> aov(response ~ A*B*C + Error(subject), Kirk)
> aov(response ~ A/(B*C) + Error(subject), Kirk)
> aov(response ~ A*B*C + Error(subject/(B*C)), Kirk)
> --------------------------------------------------------
>
> 2. How would I do the analysis in lme()?
> Something like
> lme(response~A*B*C,random=~1|subject/(B*C))???
>
>
> Thanks very much for any help!
> Bill Simpson