Re: [R] lmer binomial model overestimating data?

From: Spencer Graves <spencer.graves_at_pdf.com>
Date: Sun 18 Jun 2006 - 03:09:04 EST

<see inline>

Martin Henry H. Stevens wrote:
> Hi folks,
> Warning: I don't know if the result I am getting makes sense, so this
> may be a statistics question.
>
> The fitted values from my binomial lmer mixed model seem to
> consistently overestimate the cell means, and I don't know why. I
> assume I am doing something stupid.

          I copied your "Raw.Data" and "Fitted.Estimates" into two columns of a data.frame OE and then did a t test as follows:  > with(OE, t.test(Raw.Data-Fitted.Estimates))

        One Sample t-test

data: Raw.Data - Fitted.Estimates
t = -2.1662, df = 11, p-value = 0.05313
alternative hypothesis: true mean is not equal to 0 95 percent confidence interval:
  -0.0991997752 0.0007897752
sample estimates:
mean of x
-0.049205

          I also tried the same thing with a modification of an example in the "lmer" help file. In that case, lmer seemed to consistently UNDERestimate the cell means. However, the difference again was not statistically significant.

          I suggest you consider this a disguised Rorschach ink blot test and worry about other things.

	  Hope this helps.
	  Spencer Graves	

>
> Below I include code, and a binary image of the data is available at
> this link:
> http://www.cas.muohio.edu/~stevenmh/tf.RdataBin
>
> This was done with `Matrix' version 0.995-10 and `lme4' version
> 0.995-2. and R v. 2.3.1 on a Mac, OS 10.4.6.
>
> The binomial model below ("mod") was reduced from a more complex one
> by first using AIC, BIC and LRT for "random" effects, and then
> relying on Helmert contrasts and AIC, BIC, and LRT to simplify fixed
> effects. Maybe this was wrong?
>
> > load("tf.RdataBin")
> > library(lme4)
>
> > options(contrasts=c("contr.helmert", "contr.poly"))
> > mod <- lmer(tfb ~ reg+nutrient+amd +reg:nutrient+
> + (1|rack) + (1|popu) + (1|gen), data=dat.tb2, family=binomial,
> method="Laplace")
> > summary(mod)
> Generalized linear mixed model fit using Laplace
> Formula: tfb ~ reg + nutrient + amd + reg:nutrient + (1 | rack) + (1
> | popu) + (1 | gen)
> Data: dat.tb2
> Family: binomial(logit link)
> AIC BIC logLik deviance
> 402.53 446.64 -191.26 382.53
> Random effects:
> Groups Name Variance Std.Dev.
> gen (Intercept) 0.385 0.621
> popu (Intercept) 0.548 0.741
> rack (Intercept) 0.401 0.633
> number of obs: 609, groups: gen, 24; popu, 9; rack, 2
>
> Estimated scale (compare to 1) 0.80656
>
> Fixed effects:
> Estimate Std. Error z value Pr(>|z|)
> (Intercept) 2.391 0.574 4.17 3.1e-05
> reg1 0.842 0.452 1.86 0.06252
> reg2 0.800 0.241 3.32 0.00091
> nutrient1 0.788 0.197 4.00 6.3e-05
> amd1 -0.540 0.139 -3.88 0.00010
> reg1:nutrient1 0.500 0.227 2.21 0.02734
> reg2:nutrient1 -0.176 0.146 -1.21 0.22794
>
> Correlation of Fixed Effects:
> (Intr) reg1 reg2 ntrnt1 amd1 rg1:n1
> reg1 0.169
> reg2 -0.066 -0.191
> nutrient1 0.178 0.231 -0.034
> amd1 -0.074 -0.044 -0.052 -0.078
> reg1:ntrnt1 0.157 0.307 -0.180 0.562 -0.002
> reg2:ntrnt1 -0.028 -0.154 0.236 0.141 0.033 -0.378
> > X <- mod @ X
> > fitted <- X %*% fixef(mod)
> >
> > unlogitH <- function(x) {( 1 + exp(-x) )^-1}
> > (result <- data.frame(Raw.Data=with(dat.tb2,
> + tapply(tfb, list(reg:nutrient:amd),
> + mean ) ),
> + Fitted.Estimates=with(dat.tb2,
> + tapply(fitted, list(reg:nutrient:amd),
> + function(x) unlogitH(mean(x)) ) ) ))
> Raw.Data Fitted.Estimates
> SW:1:unclipped 0.50877 0.69520
> SW:1:clipped 0.41304 0.43669
> SW:8:unclipped 0.67273 0.85231
> SW:8:clipped 0.52830 0.66233
> NL:1:unclipped 0.88889 0.81887
> NL:1:clipped 0.53571 0.60578
> NL:8:unclipped 0.96552 0.98830
> NL:8:clipped 0.96154 0.96635
> SP:1:unclipped 0.98649 0.98361
> SP:1:clipped 0.92537 0.95328
> SP:8:unclipped 1.00000 0.99308
> SP:8:clipped 0.95890 0.97992
> > ### Perhaps the cell SP:8:clipped = 1.0 is messing up the fit?
> > pdf("RawAndFitted.pdf")
> > par(mar=c(8,3,2,2), las=2)
> > barplot(t(result), beside=TRUE )
> > box(); title("Fractions of Plants Producing Fruits")
> > legend("topleft", c("Raw Data", "Fitted Values"),
> + fill=gray.colors(2), bty="n" )
> > dev.off()
> quartz
> 2
> >
>
> _
> platform powerpc-apple-darwin8.6.0
> arch powerpc
> os darwin8.6.0
> system powerpc, darwin8.6.0
> status
> major 2
> minor 3.1
> year 2006
> month 06
> day 01
> svn rev 38247
> language R
> version.string Version 2.3.1 (2006-06-01)
> >
>
> Dr. M. Hank H. Stevens, Assistant Professor
> 338 Pearson Hall
> Botany Department
> Miami University
> Oxford, OH 45056
>
> Office: (513) 529-4206
> Lab: (513) 529-4262
> FAX: (513) 529-4243
> http://www.cas.muohio.edu/~stevenmh/
> http://www.muohio.edu/ecology/
> http://www.muohio.edu/botany/
> "E Pluribus Unum"
>
> ______________________________________________
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html



R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html Received on Sun Jun 18 03:14:47 2006

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.1.8, at Sun 18 Jun 2006 - 06:10:57 EST.

Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-help. Please read the posting guide before posting to the list.