Re: [R] Covariates in LME?

From: Douglas Bates
Date: Thu, 27 Mar 2008 15:44:44 -0500

On Thu, Mar 27, 2008 at 7:01 AM, Aberg Carl wrote:
> Hi,
> I'm using lme to calculate a mixed factors ANOVA according to:

> px_anova = anova(lme(dep~music*time*group, random = ~1|id, data = px_data))

> where

> dep is a threshold,
> time is a repeated measures variable (2 levels)
> group is a between subjects variable (2 levels)
> id is a random factor (subject id)
> music is a between subjects variable (2 levels) indicating if a person has a musical experience, or not

> Musical experience is now decided by categorizing depending on the number of years practicing playing an instrument.

> I would like to use the years of playing an instrument as a covariate instead of creating categories.

Hmm. Your question can be answered on the level of tactics (i.e. an immediate response to the question that was asked) or on the level of strategy (considering why you are asking the question in the first place).

The tactics answer is just to use the numeric variable, say 'years', instead of the factor 'music'. The formula language for linear models in R is very flexible and is described in many of the books that are listed on the "Books" link at
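As a minimal sketch of the tactics answer, the factor 'music' is simply replaced by a numeric column in the model formula. The original data are not shown in this thread, so everything below is an assumption made for illustration: the simulated px_data, the column name 'years', the level names of 'time' and 'group', and the effect sizes used to generate 'dep'.

```r
library(nlme)

set.seed(1)
n_subj <- 40

# Hypothetical data: two rows per subject (one per time point).
# 'years' and 'group' are constant within a subject (between-subjects).
px_data <- data.frame(
  id    = factor(rep(seq_len(n_subj), each = 2)),
  time  = factor(rep(c("pre", "post"), times = n_subj),
                 levels = c("pre", "post")),
  group = factor(rep(rep(c("control", "trained"), each = 2),
                     length.out = 2 * n_subj)),
  years = rep(sample(0:15, n_subj, replace = TRUE), each = 2)
)

# Simulated response with a subject-level random intercept (made-up effects).
subj_eff <- rnorm(n_subj, sd = 1)
px_data$dep <- 10 - 0.2 * px_data$years +
  0.5 * (px_data$time == "post") +
  subj_eff[as.integer(px_data$id)] +
  rnorm(nrow(px_data), sd = 0.5)

# 'years' is numeric, so it enters the model as a covariate (a slope),
# not as a set of factor-level contrasts.
fit <- lme(dep ~ years * time * group, random = ~ 1 | id, data = px_data)
summary(fit)
```

Because 'years' is numeric, each term involving it is a single slope (or a difference in slopes) rather than a comparison between categories.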

The strategy answer would address the question of why you are writing the model as dep ~ music*time*group and why you don't save the fitted model but instead immediately pass it to the anova function. It seems that you are approaching the problem as a special type of ANOVA problem so the only items of interest are the F statistics and p-values in an ANOVA table. The more common approach in R is to model your data, first by plotting the data so you can formulate an initial model, then fitting that model, examining residual plots and other diagnostics, and modifying the model if indicated. Only after that process has converged on a model that seems reasonable does one calculate inferential statistics such as p-values.
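The workflow described above (plot, fit and save the model, check diagnostics, revise, and only then compute inferential statistics) might look roughly like the following sketch. The data are simulated and hypothetical, as are the column name 'years' and the particular simpler model chosen for comparison; it illustrates the order of the steps and is not a recommended analysis for the original data.

```r
library(nlme)
library(lattice)

set.seed(2)
n_subj <- 40

# Hypothetical data with the same layout as in the question, plus 'years'.
px_data <- data.frame(
  id    = factor(rep(seq_len(n_subj), each = 2)),
  time  = factor(rep(c("pre", "post"), times = n_subj),
                 levels = c("pre", "post")),
  group = factor(rep(rep(c("control", "trained"), each = 2),
                     length.out = 2 * n_subj)),
  years = rep(sample(0:15, n_subj, replace = TRUE), each = 2)
)
px_data$dep <- 10 - 0.2 * px_data$years +
  rnorm(n_subj)[as.integer(px_data$id)] +
  rnorm(nrow(px_data), sd = 0.5)

# 1. Plot the data first to suggest an initial model.
print(xyplot(dep ~ years | group, groups = time, data = px_data,
             auto.key = TRUE))

# 2. Fit the model and keep the fitted object.
fit <- lme(dep ~ years * time * group, random = ~ 1 | id, data = px_data)

# 3. Examine diagnostics: residuals vs. fitted values, and a normal QQ plot.
print(plot(fit))
r <- as.numeric(resid(fit, type = "pearson"))
qqnorm(r)
qqline(r)

# 4. Compare against a simpler candidate model. Models with different
#    fixed effects are compared using maximum likelihood fits.
fit_full <- lme(dep ~ years * time * group, random = ~ 1 | id,
                data = px_data, method = "ML")
fit_add  <- lme(dep ~ years + time * group, random = ~ 1 | id,
                data = px_data, method = "ML")
anova(fit_add, fit_full)

# 5. Only once a model looks reasonable, look at the tests.
anova(fit)
```

Saving the fit in an object such as 'fit' is what makes steps 3 to 5 possible; passing the result of lme() straight to anova() discards everything except the table.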

The inferential statistics are always based on mathematical models of the data and will be misleading unless the model is appropriate. The model is never "correct". As George Box famously said, "All models are wrong; however, some models are useful."

Received on Fri 28 Mar 2008 - 05:11:13 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Fri 28 Mar 2008 - 06:30:25 GMT.