From: Douglas Bates <bates_at_stat.wisc.edu>

Date: Thu, 17 Jul 2008 06:29:41 -0500

>> well, for computing the p-value you need to use pchisq() and dchisq() (check

*>> ?dchisq for more info). For model fits with a logLik method you can directly
*

*>> use the following simple function:
*

*>>
*

*>> lrt <- function (obj1, obj2) {
*

*>> L0 <- logLik(obj1)
*

*>> L1 <- logLik(obj2)
*

*>> L01 <- as.vector(- 2 * (L0 - L1))
*

*>> df <- attr(L1, "df") - attr(L0, "df")
*

*>> list(L01 = L01, df = df,
*

*>> "p-value" = pchisq(L01, df, lower.tail = FALSE))
*

*>> }
*

*>>
*

*>> library(lme4)
*

*>> gm0 <- glm(cbind(incidence, size - incidence) ~ period,
*

*>> family = binomial, data = cbpp)
*

*>> gm1 <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
*

*>> family = binomial, data = cbpp)
*

*>>
*

*>> lrt(gm0, gm1)
*

https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Received on Thu 17 Jul 2008 - 11:33:49 GMT

Date: Thu, 17 Jul 2008 06:29:41 -0500

On Thu, Jul 17, 2008 at 2:50 AM, Rune Haubo <rhbc_at_imm.dtu.dk> wrote:

> 2008/7/16 Dimitris Rizopoulos <Dimitris.Rizopoulos_at_med.kuleuven.be>:

>> well, for computing the p-value you need to use pchisq() and dchisq() (check

> > Yes, that seems quite natural, but then try to compare with the deviance: > > logLik(gm0) > logLik(gm1) > > (d0 <- deviance(gm0)) > (d1 <- deviance(gm1)) > (LR <- d0 - d1) > pchisq(LR, 1, lower = FALSE) > > Obviously the deviance in glm is *not* twice the negative > log-likelihood as it is in glmer. The question remains which of these > two quantities is appropriate for comparison. I am not sure exactly > how the deviance and/or log-likelihood are calculated in glmer, but my > feeling is that one should trust the deviance rather than the > log-likelihoods for these purposes. This is supported by the following > comparison: Ad an arbitrary random effect with a close-to-zero > variance and note the deviance: > > tmp <- rep(1:4, each = nrow(cbpp)/4) > gm2 <- glmer(cbind(incidence, size - incidence) ~ period + (1 | tmp), > family = binomial, data = cbpp) > (d2 <- deviance(gm2)) > > This deviance is very close to that obtained from the glm model. > > I have included the mixed-models mailing list in the hope that someone > could explain how the deviance is computed in glmer and why deviances, > but not likelihoods are comparable to glm-fits.

In that example I think the problem may be that I have not yet written the code to adjust the deviance of the glmer fit for the null deviance.

>> However, there are some issues regarding this likelihood ratio test.

*>>
**>> 1) The null hypothesis is on the boundary of the parameter space, i.e., you
**>> test whether the variance for the random effect is zero. For this case the
**>> assumed chi-squared distribution for the LRT may *not* be totally
**>> appropriate and may produce conservative p-values. There is some theory
**>> regarding this issue, which has shown that the reference distribution for
**>> the LRT in this case is a mixture of a chi-squared(df = 0) and
**>> chi-squared(df = 1). Another option is to use simulation-based approach
**>> where you can approximate the reference distribution of the LRT under the
**>> null using simulation. You may check below for an illustration of this
**>> procedure (not-tested):
**>>
**>> X <- model.matrix(gm0)
**>> coefs <- coef(gm0)
**>> pr <- plogis(c(X %*% coefs))
**>> n <- length(pr)
**>> new.dat <- cbpp
**>> Tobs <- lrt(gm0, gm1)$L01
**>> B <- 200
**>> out.T <- numeric(B)
**>> for (b in 1:B) {
**>> y <- rbinom(n, cbpp$size, pr)
**>> new.dat$incidence <- y
**>> fit0 <- glm(formula(gm0), family = binomial, data = new.dat)
**>> fit1 <- glmer(formula(gm1), family = binomial, data = new.dat)
**>> out.T[b] <- lrt(fit0, fit1)$L01
**>> }
**>> # estimate p-value
**>> (sum(out.T >= Tobs) + 1) / (B + 1)
**>>
**>>
**>> 2) For the glmer fit you have to note that you work with an approximation to
**>> the log-likelihood (obtained using numerical integration) and not the actual
**>> log-likelihood.
**>>
**>> I hope it helps.
**>>
**>> Best,
**>> Dimitris
**>>
**>> ----
**>> Dimitris Rizopoulos
**>> Biostatistical Centre
**>> School of Public Health
**>> Catholic University of Leuven
**>>
**>> Address: Kapucijnenvoer 35, Leuven, Belgium
**>> Tel: +32/(0)16/336899
**>> Fax: +32/(0)16/337015
**>> Web: http://med.kuleuven.be/biostat/
**>> http://www.student.kuleuven.be/~m0390867/dimitris.htm
**>>
**>>
**>> Quoting COREY SPARKS <corey.sparks_at_UTSA.EDU>:
**>>
**>>> Dear list,
**>>> I am fitting a logistic multi-level regression model and need to test the
**>>> difference between the ordinary logistic regression from a glm() fit and
**>>> the mixed effects fit from glmer(), basically I want to do a likelihood
**>>> ratio test between the two fits.
**>>>
**>>>
**>>> The data are like this:
**>>> My outcome is a (1,0) for health status, I have several (1,0) dummy
**>>> variables RURAL, SMOKE, DRINK, EMPLOYED, highereduc, INDIG, male,
**>>> divorced, SINGLE, chronic, vigor_d and moderat_d and AGE is continuous (20
**>>> to 100).
**>>> My higher level is called munid and has 581 levels.
**>>> The data have 45243 observations.
**>>>
**>>> Here are my program statements:
**>>>
**>>> #GLM fit
**>>>
**>>> ph.fit.2<-glm(poorhealth~RURAL+SMOKE+DRINK+EMPLOYED+highereduc+INDIG+AGE+male+divorced+SINGLE+chronic+vigor_d+moderat_d,family=binomial(),
**>>> data=mx.merge)
**>>> #GLMER fit
**>>>
**>>> ph.fit.3<-glmer(poorhealth~RURAL+SMOKE+DRINK+EMPLOYED+INSURANCE+highereduc+INDIG+AGE+male+divorced+SINGLE+chronic+vigor_d+moderat_d+(1|munid),family=binomial(),
**>>> data=mx.merge)
**>>>
**>>> I cannot find a method in R that will do the LR test between a glm and a
**>>> glmer fit, so I try to do it using the liklihoods from both models
**>>>
**>>> #form the likelihood ratio test between the glm and glmer fits
**>>> x2<--2*(logLik(ph.fit.2)-logLik(ph.fit.3))
**>>>
**>>>> ML
**>>>
**>>> 79.60454
**>>> attr(,"nobs")
**>>> n
**>>> 45243
**>>> attr(,"nall")
**>>> n
**>>> 45243
**>>> attr(,"df")
**>>> [1] 14
**>>> attr(,"REML")
**>>> [1] FALSE
**>>> attr(,"class")
**>>> [1] "logLik"
**>>>
**>>> #Get the associated p-value
**>>> dchisq(x2,14)
**>>> ML
**>>>>
**>>>> 5.94849e-15
**>>>
**>>> Which looks like an improvement in model fit to me. Am I seeing this
**>>> correctly or are the two models even able to be compared? they are both
**>>> estimated via maximum likelihood, so they should be, I think.
**>>> Any help would be appreciated.
**>>>
**>>> Corey
**>>>
**>>> Corey S. Sparks, Ph.D.
**>>>
**>>> Assistant Professor
**>>> Department of Demography and Organization Studies
**>>> University of Texas San Antonio
**>>> One UTSA Circle
**>>> San Antonio, TX 78249
**>>> email:corey.sparks_at_utsa.edu
**>>> web: https://rowdyspace.utsa.edu/users/ozd504/www/index.htm
**>>>
**>>>
**>>> [[alternative HTML version deleted]]
**>>>
**>>> ______________________________________________
**>>> R-help_at_r-project.org mailing list
**>>> https://stat.ethz.ch/mailman/listinfo/r-help
**>>> PLEASE do read the posting guide
**>>> http://www.R-project.org/posting-guide.html
**>>> and provide commented, minimal, self-contained, reproducible code.
**>>>
**>>>
**>>
**>>
**>>
**>> Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm
**>>
**>> ______________________________________________
**>> R-help_at_r-project.org mailing list
**>> https://stat.ethz.ch/mailman/listinfo/r-help
**>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
**>> and provide commented, minimal, self-contained, reproducible code.
**>>
*

> > > > -- > Rune Haubo Bojesen Christensen > > Master Student, M.Sc. Eng. > Phone: (+45) 30 26 45 54 > Mail: rhbc_at_imm.dtu.dk, rune.haubo_at_gmail.com > > DTU Informatics, Section for Statistics > Technical University of Denmark, Build.321, DK-2800 Kgs. Lyngby, Denmark > > ______________________________________________ > R-help_at_r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > ______________________________________________R-help_at_r-project.org mailing list

https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Received on Thu 17 Jul 2008 - 11:33:49 GMT

Archive maintained by Robert King, hosted by
the discipline of
statistics at the
University of Newcastle,
Australia.

Archive generated by hypermail 2.2.0, at Sat 19 Jul 2008 - 22:32:03 GMT.

*
Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-help.
Please read the posting
guide before posting to the list.
*