[R] Simulate p-value in lme4

From: Manuel Morales <Manuel.A.Morales_at_williams.edu>
Date: Fri 18 Aug 2006 - 06:49:31 EST

Dear list,

This is more of a stats question than an R question per se. First, I realize there has been a lot of discussion about the problems with estimating P-values from F-ratios for mixed-effects models in lme4. Using mcmcsamp() seems like a great alternative for evaluating the significance of individual coefficients, but not for groups of coefficients as might occur in an experimental design with 3 treatment levels. I'm wondering if the simulation approach I use below to estimate the P-value for a 3-level factor is appropriate, or if there are any suggestions on how else to approach this problem. The model and data in the example are from section 10.4 of MASS.


# Load req. package (see functions to generate data at end of script)

# Full and reduced models - pred is a factor with 3 levels
result.full <- lmer(y~pred+(1|subject), data=epil3, family="poisson") result.base <- lmer(y~1+(1|subject), data=epil3, family="poisson")

# Naive P-value from LR for significance of "pred" factor
anova(result.base,result.full)$"Pr(>Chisq)"[[2]] # P-value (test.stat <- anova(result.base,result.full)$Chisq[[2]]) # Chisq-stat

# P-value from simulation. Note that in the simulation, I use the
# estimated random effects for each subject rather than generating a new
# distribution of means. I'm not sure if this is appropriate or not ...
intercept <- fixef(result.base)
rand.effs <- ranef(result.base)[[1]]
mu <- exp(rep(intercept+rand.effs[[1]],2))

p.value <- function(iter, stat) {
  chi.stat <- vector()
  for(i in 1:iter) {
    resp <- rpois(length(mu), mu) # simulate values     sim.data <- data.frame(y=resp,subject=epil3$subject,pred=epil3$pred)     result.f <- lmer(y~pred+(1|subject), data=sim.data,


    result.b <- lmer(y~1+(1|subject), data=sim.data, family="poisson")     chi.stat[i] <- anova(result.b,result.f)$Chisq[[2]]   }
  val <- sum(unlist(lapply(chi.stat, function(x) if(x>stat) 1 else


p.value(10,test.stat) # Increase to >=1000 to get a reasonable P-value!

# Script to generate data, from section 10.4 of MASS
epil2 <- epil[epil$period == 1, ]
epil2["period"] <- rep(0, 59); epil2["y"] <- epil2["base"] epil["time"] <- 1; epil2["time"] <- 4
epil2 <- rbind(epil, epil2)
epil2$pred <- unclass(epil2$trt) * (epil2$period > 0) epil2$subject <- factor(epil2$subject)
epil3 <- aggregate(epil2, list(epil2$subject, epil2$period > 0),

                   function(x) if(is.numeric(x)) sum(x) else x[1])
epil3$pred <- factor(epil3$pred, labels = c("base", "placebo", "drug"))

R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Received on Fri Aug 18 06:55:29 2006

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.1.8, at Mon 21 Aug 2006 - 06:20:01 EST.

Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-help. Please read the posting guide before posting to the list.