[R] Simulate p-value in lme4

From: Manuel Morales <Manuel.A.Morales_at_williams.edu>
Date: Fri 18 Aug 2006 - 06:49:31 EST


Dear list,

This is more of a stats question than an R question per se. First, I realize there has been a lot of discussion about the problems with estimating P-values from F-ratios for mixed-effects models in lme4. Using mcmcsamp() seems like a great alternative for evaluating the significance of individual coefficients, but not for groups of coefficients as might occur in an experimental design with 3 treatment levels. I'm wondering if the simulation approach I use below to estimate the P-value for a 3-level factor is appropriate, or if there are any suggestions on how else to approach this problem. The model and data in the example are from section 10.4 of MASS.

Thanks!
Manuel

# Load req. package (see functions to generate data at end of script)
library(lme4)
library(MASS)

# Full and reduced models - pred is a factor with 3 levels
result.full <- lmer(y~pred+(1|subject), data=epil3, family="poisson") result.base <- lmer(y~1+(1|subject), data=epil3, family="poisson")

# Naive P-value from LR for significance of "pred" factor
anova(result.base,result.full)$"Pr(>Chisq)"[[2]] # P-value (test.stat <- anova(result.base,result.full)$Chisq[[2]]) # Chisq-stat

# P-value from simulation. Note that in the simulation, I use the
# estimated random effects for each subject rather than generating a new
# distribution of means. I'm not sure if this is appropriate or not ...
intercept <- fixef(result.base)
rand.effs <- ranef(result.base)[[1]]
mu <- exp(rep(intercept+rand.effs[[1]],2))

p.value <- function(iter, stat) {
  chi.stat <- vector()
  for(i in 1:iter) {
    resp <- rpois(length(mu), mu) # simulate values     sim.data <- data.frame(y=resp,subject=epil3$subject,pred=epil3$pred)     result.f <- lmer(y~pred+(1|subject), data=sim.data,

                     family="poisson")

    result.b <- lmer(y~1+(1|subject), data=sim.data, family="poisson")     chi.stat[i] <- anova(result.b,result.f)$Chisq[[2]]   }
  val <- sum(unlist(lapply(chi.stat, function(x) if(x>stat) 1 else

             0)))/iter
  hist(chi.stat)
  return(val)
}

p.value(10,test.stat) # Increase to >=1000 to get a reasonable P-value!

# Script to generate data, from section 10.4 of MASS
epil2 <- epil[epil$period == 1, ]
epil2["period"] <- rep(0, 59); epil2["y"] <- epil2["base"] epil["time"] <- 1; epil2["time"] <- 4
epil2 <- rbind(epil, epil2)
epil2$pred <- unclass(epil2$trt) * (epil2$period > 0) epil2$subject <- factor(epil2$subject)
epil3 <- aggregate(epil2, list(epil2$subject, epil2$period > 0),

                   function(x) if(is.numeric(x)) sum(x) else x[1])
epil3$pred <- factor(epil3$pred, labels = c("base", "placebo", "drug"))

R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Received on Fri Aug 18 06:55:29 2006

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.1.8, at Mon 21 Aug 2006 - 06:20:01 EST.

Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-help. Please read the posting guide before posting to the list.