Re: [R] nested mixed-effect model: variance components

From: Spencer Graves <spencer.graves_at_pdf.com>
Date: Sun 11 Jun 2006 - 02:16:54 EST

          I have seen no reply to this, so I will offer a couple of comments in spite of the fact that I know very little about "aov" other than it is old and has largely been superceded by "lme" in the "nlme" package. I've replied to many posts on random and mixed effects over the past few years, and I only remember one case where 'aov' returned an answer that could not have been obtained easily from 'lme'. That one application involved estimating a saturated model from perfectly balanced experimental data. 'lme' refused to return an answer, at least as I specified the model, and 'aov' returned numbers from which anyone familiar with the theory for that special case could compute the desired F ratios and p-values. In all other cases that I remember, 'aov' would either return the same answer, or much more commonly, a substantively inferior answer -- if it were feasible to use 'aov' at all.

          I mention this, because the simplest way I can think of to check your answer is to suggest you try to work it using 'lme'. To learn how to do this, I strongly encourage you to consult Pinheiro and Bates (2000) Mixed-Effects Models in S and S-PLUS (Springer). Bates has been for many years one of the leading contributors in this area and is the primary author of the 'nlme', 'lme4' and 'Matrix" packages to support working with these kinds of models. This book discusses both how to fit these kinds of models as well as how to produce plots of various kinds that can be very helpful for explaining the experimental results to others as well as diagnosing potential problem. Moreover, the R scripts discussed in the book are available in files called "ch01.R", "ch02.R", ..., "ch06.R", "ch08.R" in a subfolder '~library\nlme\scripts' of the directory in which R is installed on your hard drive. This makes it much easier to work through the examples in the book one line at a time, experimenting at with modifications. In addition, there is one point I know where the R syntax differs from that in the book: S-Plus will accept x^2 in a formula as meaning the square of a numeric variable; R will not. To specify a square in R, you need something like I(x^2). When I copied the commands out of the book, I had serious trouble getting answers like those in the book until I identified and corrected this difference in syntax. If you use the script files, they provide the R syntax.

          I'm not certain, but I believe the following should estimate the model you wrote below:

	  fit <- lme(fixed=COVER ~ HABITAT,
		    random = ~1|LAGOON/HABITAT,
		    data=cov).

	  For comparison, I refer you to Pinheiro and Bates, p. 47, where I 
find the following:

fm1Oats <- lme( yield ~ ordered(nitro) * Variety, data = Oats,

   random = ~ 1 | Block/Variety )

          There are 3 Varieties and 6 Blocks in this Oats data.frame. The fixed effect of Variety has therefore 2 degrees of freedom. However, the random effect of Variety occurs in 18 levels, being all the 6*3 Block:Variety combinations. You can see this by examining 'str(fm1Oats)'.

          If you want to know "how much variation is due to lagoons? habitats? lagoons*habitat? transects?", this model will NOT tell you that, and I don't know how to answer that question using 'lme'. I was able to answer a question like that using 'lmer' associated with the 'lme4' and 'Matrix' packages. Unfortunately, these packages have some names conflicts with 'nlme', and the simplest way I know to change from one to the other is to "q()" and restart R Before I did, however, I made a local copy of the "Oats" data.frame. After I did that, I seemed to get sensible numbers from the following:

library(lme4)

fm1Oats4 <- lmer(yield~ ordered(nitro) * Variety

         +(1|Block)+(1|Variety)+(1|Block:Variety),
                  data=Oats)

	  For both "lme" and "lmer", the default "method" is "REML" (restricted 
maximum likelihood). This is what you want for estimation. For testing random effects, you still want "REML", but you should adjust the degrees of freedom of the reference "F" distribution as discussed in section 2.4 of Pinheiro and Bates; this also applies to confidence intervals for the random effects. For testing fixed effects, you should use "method = 'ML'".

          "lme4" is newer than "nlme" and does not currently have available the complete set of helper functions for plotting, etc. Thus, you will likely want to use both. For documentation on 'lmer', you should still start with Pinheiro and Bates for the general theory, then refer to the vignettes associated with the "mlmRev" and "lme4" packages; if you don't know about vignettes RSiteSearch("graves vignette") will lead you to "http://finzi.psych.upenn.edu/R/Rhelp02a/archive/67006.html" and other replies to r-help where I've described how to use them. You will also want the relevant article by Doug Bates in R News. To find that, go "www.r-project.org" -> Documentation: Newsletter -> Download: "Table of Contents", then search for Bates until you find "Douglas Bates. Fitting linear mixed models in R. R News, 5(1):27-30, May 2005." That says you want vol. 5(1).

	  Hope this helps,
	  Spencer Graves

Eric Pante wrote:
> Dear listers,
>
> I am trying to assess variance components for a nested, mixed-effects
> model. I think I got an answer that make sense from R, but I have a
> warning message and I wanted to check that what I am looking at is
> actually what I need:
>
> my data are organized as transects within stations, stations within
> habitats, habitats within lagoons.
> lagoons: random, habitats: fixed
> the question is: how much variation is due to lagoons? habitats?
> lagoons*habitat? transects?
>
> Here is my code:
>
> res <- aov(COVER ~ HABITAT + Error(HABITAT+LAGOON+LAGOON/HABITAT),
> data=cov)
> summary(res)
>
> and I get Sum Sq for each to calculate variance components:
>
> Error: STRATE
> Df Sum Sq Mean Sq
> STRATE 5 4493.1 898.6
>
> Error: ATOLL
> Df Sum Sq Mean Sq F value Pr(>F)
> Residuals 5 3340.5 668.1
>
> Error: STRATE:ATOLL
> Df Sum Sq Mean Sq F value Pr(>F)
> Residuals 18 2442.71 135.71
>
> Error: Within
> Df Sum Sq Mean Sq F value Pr(>F)
> Residuals 145 6422.0 44.3
>
> My error message seems to come from the LAGOON/HABITAT, the Error is
> computed.
> Warning message: Error() model is singular in: aov(COVER ~ HABITAT +
> Error(HABITAT+LAGOON+LAGOON/HABITAT), data=cov),
>
> THANKS !!!
> eric
>
>
>
> Eric Pante
> ----------------------------------------------------------------
> College of Charleston, Grice Marine Laboratory
> 205 Fort Johnson Road, Charleston SC 29412
> Phone: 843-953-9190 (lab) -9200 (main office)
> ----------------------------------------------------------------
>
> "On ne force pas la curiosite, on l'eveille ..."
> Daniel Pennac
>
> ______________________________________________
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html



R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html Received on Sun Jun 11 02:24:40 2006

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.1.8, at Sun 11 Jun 2006 - 05:34:59 EST.

Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-help. Please read the posting guide before posting to the list.