From: Rune Haubo <rhbc_at_imm.dtu.dk>

Date: Thu, 17 Jul 2008 09:50:14 +0200

pchisq(LR, 1, lower = FALSE)

Date: Thu, 17 Jul 2008 09:50:14 +0200

2008/7/16 Dimitris Rizopoulos <Dimitris.Rizopoulos_at_med.kuleuven.be>:

> well, for computing the p-value you need to use pchisq() and dchisq() (check

*> ?dchisq for more info). For model fits with a logLik method you can directly
**> use the following simple function:
**>
**> lrt <- function (obj1, obj2) {
**> L0 <- logLik(obj1)
**> L1 <- logLik(obj2)
**> L01 <- as.vector(- 2 * (L0 - L1))
**> df <- attr(L1, "df") - attr(L0, "df")
**> list(L01 = L01, df = df,
**> "p-value" = pchisq(L01, df, lower.tail = FALSE))
**> }
**>
**> library(lme4)
**> gm0 <- glm(cbind(incidence, size - incidence) ~ period,
**> family = binomial, data = cbpp)
**> gm1 <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
**> family = binomial, data = cbpp)
**>
**> lrt(gm0, gm1)
*

Yes, that seems quite natural, but then try to compare with the deviance:

logLik(gm0)

logLik(gm1)

(d0 <- deviance(gm0)) (d1 <- deviance(gm1)) (LR <- d0 - d1)

pchisq(LR, 1, lower = FALSE)

Obviously the deviance in glm is *not* twice the negative log-likelihood as it is in glmer. The question remains which of these two quantities is appropriate for comparison. I am not sure exactly how the deviance and/or log-likelihood are calculated in glmer, but my feeling is that one should trust the deviance rather than the log-likelihoods for these purposes. This is supported by the following comparison: Ad an arbitrary random effect with a close-to-zero variance and note the deviance:

tmp <- rep(1:4, each = nrow(cbpp)/4)

gm2 <- glmer(cbind(incidence, size - incidence) ~ period + (1 | tmp),

family = binomial, data = cbpp) (d2 <- deviance(gm2))

This deviance is very close to that obtained from the glm model.

I have included the mixed-models mailing list in the hope that someone could explain how the deviance is computed in glmer and why deviances, but not likelihoods are comparable to glm-fits.

Best

Rune

*>
**>
*

> However, there are some issues regarding this likelihood ratio test.

*>
**> 1) The null hypothesis is on the boundary of the parameter space, i.e., you
**> test whether the variance for the random effect is zero. For this case the
**> assumed chi-squared distribution for the LRT may *not* be totally
**> appropriate and may produce conservative p-values. There is some theory
**> regarding this issue, which has shown that the reference distribution for
**> the LRT in this case is a mixture of a chi-squared(df = 0) and
**> chi-squared(df = 1). Another option is to use simulation-based approach
**> where you can approximate the reference distribution of the LRT under the
**> null using simulation. You may check below for an illustration of this
**> procedure (not-tested):
**>
**> X <- model.matrix(gm0)
**> coefs <- coef(gm0)
**> pr <- plogis(c(X %*% coefs))
**> n <- length(pr)
**> new.dat <- cbpp
**> Tobs <- lrt(gm0, gm1)$L01
**> B <- 200
**> out.T <- numeric(B)
**> for (b in 1:B) {
**> y <- rbinom(n, cbpp$size, pr)
**> new.dat$incidence <- y
**> fit0 <- glm(formula(gm0), family = binomial, data = new.dat)
**> fit1 <- glmer(formula(gm1), family = binomial, data = new.dat)
**> out.T[b] <- lrt(fit0, fit1)$L01
**> }
**> # estimate p-value
**> (sum(out.T >= Tobs) + 1) / (B + 1)
**>
**>
**> 2) For the glmer fit you have to note that you work with an approximation to
**> the log-likelihood (obtained using numerical integration) and not the actual
**> log-likelihood.
**>
**> I hope it helps.
**>
**> Best,
**> Dimitris
**>
**> ----
**> Dimitris Rizopoulos
**> Biostatistical Centre
**> School of Public Health
**> Catholic University of Leuven
**>
**> Address: Kapucijnenvoer 35, Leuven, Belgium
**> Tel: +32/(0)16/336899
**> Fax: +32/(0)16/337015
**> Web: http://med.kuleuven.be/biostat/
**> http://www.student.kuleuven.be/~m0390867/dimitris.htm
**>
**>
**> Quoting COREY SPARKS <corey.sparks_at_UTSA.EDU>:
**>
**>> Dear list,
**>> I am fitting a logistic multi-level regression model and need to test the
**>> difference between the ordinary logistic regression from a glm() fit and
**>> the mixed effects fit from glmer(), basically I want to do a likelihood
**>> ratio test between the two fits.
**>>
**>>
**>> The data are like this:
**>> My outcome is a (1,0) for health status, I have several (1,0) dummy
**>> variables RURAL, SMOKE, DRINK, EMPLOYED, highereduc, INDIG, male,
**>> divorced, SINGLE, chronic, vigor_d and moderat_d and AGE is continuous (20
**>> to 100).
**>> My higher level is called munid and has 581 levels.
**>> The data have 45243 observations.
**>>
**>> Here are my program statements:
**>>
**>> #GLM fit
**>>
**>> ph.fit.2<-glm(poorhealth~RURAL+SMOKE+DRINK+EMPLOYED+highereduc+INDIG+AGE+male+divorced+SINGLE+chronic+vigor_d+moderat_d,family=binomial(),
**>> data=mx.merge)
**>> #GLMER fit
**>>
**>> ph.fit.3<-glmer(poorhealth~RURAL+SMOKE+DRINK+EMPLOYED+INSURANCE+highereduc+INDIG+AGE+male+divorced+SINGLE+chronic+vigor_d+moderat_d+(1|munid),family=binomial(),
**>> data=mx.merge)
**>>
**>> I cannot find a method in R that will do the LR test between a glm and a
**>> glmer fit, so I try to do it using the liklihoods from both models
**>>
**>> #form the likelihood ratio test between the glm and glmer fits
**>> x2<--2*(logLik(ph.fit.2)-logLik(ph.fit.3))
**>>
**>>> ML
**>>
**>> 79.60454
**>> attr(,"nobs")
**>> n
**>> 45243
**>> attr(,"nall")
**>> n
**>> 45243
**>> attr(,"df")
**>> [1] 14
**>> attr(,"REML")
**>> [1] FALSE
**>> attr(,"class")
**>> [1] "logLik"
**>>
**>> #Get the associated p-value
**>> dchisq(x2,14)
**>> ML
**>>>
**>>> 5.94849e-15
**>>
**>> Which looks like an improvement in model fit to me. Am I seeing this
**>> correctly or are the two models even able to be compared? they are both
**>> estimated via maximum likelihood, so they should be, I think.
**>> Any help would be appreciated.
**>>
**>> Corey
**>>
**>> Corey S. Sparks, Ph.D.
**>>
**>> Assistant Professor
**>> Department of Demography and Organization Studies
**>> University of Texas San Antonio
**>> One UTSA Circle
**>> San Antonio, TX 78249
**>> email:corey.sparks_at_utsa.edu
**>> web: https://rowdyspace.utsa.edu/users/ozd504/www/index.htm
**>>
**>>
**>> [[alternative HTML version deleted]]
**>>
**>> ______________________________________________
**>> R-help_at_r-project.org mailing list
**>> https://stat.ethz.ch/mailman/listinfo/r-help
**>> PLEASE do read the posting guide
**>> http://www.R-project.org/posting-guide.html
**>> and provide commented, minimal, self-contained, reproducible code.
**>>
**>>
**>
**>
**>
**> Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm
**>
**> ______________________________________________
**> R-help_at_r-project.org mailing list
**> https://stat.ethz.ch/mailman/listinfo/r-help
**> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
**> and provide commented, minimal, self-contained, reproducible code.
**>
*

-- Rune Haubo Bojesen Christensen Master Student, M.Sc. Eng. Phone: (+45) 30 26 45 54 Mail: rhbc_at_imm.dtu.dk, rune.haubo_at_gmail.com DTU Informatics, Section for Statistics Technical University of Denmark, Build.321, DK-2800 Kgs. Lyngby, Denmark ______________________________________________ R-help_at_r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.Received on Thu 17 Jul 2008 - 07:57:42 GMT

Archive maintained by Robert King, hosted by
the discipline of
statistics at the
University of Newcastle,
Australia.

Archive generated by hypermail 2.2.0, at Thu 17 Jul 2008 - 12:31:58 GMT.

*
Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-help.
Please read the posting
guide before posting to the list.
*