From: Göran Broström <goran.brostrom_at_gmail.com>

Date: Sat, 19 Jul 2008 23:51:42 +0200

>>> well, for computing the p-value you need to use pchisq() and dchisq() (check

*>>> ?dchisq for more info). For model fits with a logLik method you can directly
*

*>>> use the following simple function:
*

*>>>
*

*>>> lrt <- function (obj1, obj2) {
*

*>>> L0 <- logLik(obj1)
*

*>>> L1 <- logLik(obj2)
*

*>>> L01 <- as.vector(- 2 * (L0 - L1))
*

*>>> df <- attr(L1, "df") - attr(L0, "df")
*

*>>> list(L01 = L01, df = df,
*

*>>> "p-value" = pchisq(L01, df, lower.tail = FALSE))
*

*>>> }
*

*>>>
*

*>>> library(lme4)
*

*>>> gm0 <- glm(cbind(incidence, size - incidence) ~ period,
*

*>>> family = binomial, data = cbpp)
*

*>>> gm1 <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
*

*>>> family = binomial, data = cbpp)
*

*>>>
*

*>>> lrt(gm0, gm1)
*

>>> However, there are some issues regarding this likelihood ratio test.

*>>>
*

*>>> 1) The null hypothesis is on the boundary of the parameter space, i.e., you
*

*>>> test whether the variance for the random effect is zero. For this case the
*

*>>> assumed chi-squared distribution for the LRT may *not* be totally
*

*>>> appropriate and may produce conservative p-values. There is some theory
*

*>>> regarding this issue, which has shown that the reference distribution for
*

*>>> the LRT in this case is a mixture of a chi-squared(df = 0) and
*

*>>> chi-squared(df = 1). Another option is to use simulation-based approach
*

*>>> where you can approximate the reference distribution of the LRT under the
*

*>>> null using simulation. You may check below for an illustration of this
*

*>>> procedure (not-tested):
*

*>>>
*

*>>> X <- model.matrix(gm0)
*

*>>> coefs <- coef(gm0)
*

*>>> pr <- plogis(c(X %*% coefs))
*

*>>> n <- length(pr)
*

*>>> new.dat <- cbpp
*

*>>> Tobs <- lrt(gm0, gm1)$L01
*

*>>> B <- 200
*

*>>> out.T <- numeric(B)
*

*>>> for (b in 1:B) {
*

*>>> y <- rbinom(n, cbpp$size, pr)
*

*>>> new.dat$incidence <- y
*

*>>> fit0 <- glm(formula(gm0), family = binomial, data = new.dat)
*

*>>> fit1 <- glmer(formula(gm1), family = binomial, data = new.dat)
*

*>>> out.T[b] <- lrt(fit0, fit1)$L01
*

*>>> }
*

*>>> # estimate p-value
*

*>>> (sum(out.T >= Tobs) + 1) / (B + 1)
*

*>>>
*

*>>>
*

*>>> 2) For the glmer fit you have to note that you work with an approximation to
*

*>>> the log-likelihood (obtained using numerical integration) and not the actual
*

*>>> log-likelihood.
*

*>>>
*

*>>> I hope it helps.
*

*>>>
*

*>>> Best,
*

*>>> Dimitris
*

*>>>
*

*>>> ----
*

*>>> Dimitris Rizopoulos
*

*>>> Biostatistical Centre
*

*>>> School of Public Health
*

*>>> Catholic University of Leuven
*

*>>>
*

*>>> Address: Kapucijnenvoer 35, Leuven, Belgium
*

*>>> Tel: +32/(0)16/336899
*

*>>> Fax: +32/(0)16/337015
*

*>>> Web: http://med.kuleuven.be/biostat/
*

*>>> http://www.student.kuleuven.be/~m0390867/dimitris.htm
*

*>>>
*

*>>>
*

*>>> Quoting COREY SPARKS <corey.sparks_at_UTSA.EDU>:
*

*>>>
*

*>>>> Dear list,
*

*>>>> I am fitting a logistic multi-level regression model and need to test the
*

*>>>> difference between the ordinary logistic regression from a glm() fit and
*

*>>>> the mixed effects fit from glmer(), basically I want to do a likelihood
*

*>>>> ratio test between the two fits.
*

*>>>>
*

*>>>>
*

*>>>> The data are like this:
*

*>>>> My outcome is a (1,0) for health status, I have several (1,0) dummy
*

*>>>> variables RURAL, SMOKE, DRINK, EMPLOYED, highereduc, INDIG, male,
*

*>>>> divorced, SINGLE, chronic, vigor_d and moderat_d and AGE is continuous (20
*

*>>>> to 100).
*

*>>>> My higher level is called munid and has 581 levels.
*

*>>>> The data have 45243 observations.
*

*>>>>
*

*>>>> Here are my program statements:
*

*>>>>
*

*>>>> #GLM fit
*

*>>>>
*

*>>>> ph.fit.2<-glm(poorhealth~RURAL+SMOKE+DRINK+EMPLOYED+highereduc+INDIG+AGE+male+divorced+SINGLE+chronic+vigor_d+moderat_d,family=binomial(),
*

*>>>> data=mx.merge)
*

*>>>> #GLMER fit
*

*>>>>
*

*>>>> ph.fit.3<-glmer(poorhealth~RURAL+SMOKE+DRINK+EMPLOYED+INSURANCE+highereduc+INDIG+AGE+male+divorced+SINGLE+chronic+vigor_d+moderat_d+(1|munid),family=binomial(),
*

*>>>> data=mx.merge)
*

*>>>>
*

*>>>> I cannot find a method in R that will do the LR test between a glm and a
*

*>>>> glmer fit, so I try to do it using the liklihoods from both models
*

*>>>>
*

*>>>> #form the likelihood ratio test between the glm and glmer fits
*

*>>>> x2<--2*(logLik(ph.fit.2)-logLik(ph.fit.3))
*

*>>>>
*

*>>>>> ML
*

*>>>>
*

*>>>> 79.60454
*

*>>>> attr(,"nobs")
*

*>>>> n
*

*>>>> 45243
*

*>>>> attr(,"nall")
*

*>>>> n
*

*>>>> 45243
*

*>>>> attr(,"df")
*

*>>>> [1] 14
*

*>>>> attr(,"REML")
*

*>>>> [1] FALSE
*

*>>>> attr(,"class")
*

*>>>> [1] "logLik"
*

*>>>>
*

*>>>> #Get the associated p-value
*

*>>>> dchisq(x2,14)
*

*>>>> ML
*

*>>>>>
*

*>>>>> 5.94849e-15
*

*>>>>
*

*>>>> Which looks like an improvement in model fit to me. Am I seeing this
*

*>>>> correctly or are the two models even able to be compared? they are both
*

*>>>> estimated via maximum likelihood, so they should be, I think.
*

*>>>> Any help would be appreciated.
*

*>>>>
*

*>>>> Corey
*

*>>>>
*

*>>>> Corey S. Sparks, Ph.D.
*

*>>>>
*

*>>>> Assistant Professor
*

*>>>> Department of Demography and Organization Studies
*

*>>>> University of Texas San Antonio
*

*>>>> One UTSA Circle
*

*>>>> San Antonio, TX 78249
*

*>>>> email:corey.sparks_at_utsa.edu
*

*>>>> web: https://rowdyspace.utsa.edu/users/ozd504/www/index.htm
*

*>>>>
*

*>>>>
*

*>>>> [[alternative HTML version deleted]]
*

*>>>>
*

*>>>> ______________________________________________
*

*>>>> R-help_at_r-project.org mailing list
*

*>>>> https://stat.ethz.ch/mailman/listinfo/r-help
*

*>>>> PLEASE do read the posting guide
*

*>>>> http://www.R-project.org/posting-guide.html
*

*>>>> and provide commented, minimal, self-contained, reproducible code.
*

*>>>>
*

*>>>>
*

*>>>
*

*>>>
*

*>>>
*

*>>> Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm
*

*>>>
*

*>>> ______________________________________________
*

*>>> R-help_at_r-project.org mailing list
*

*>>> https://stat.ethz.ch/mailman/listinfo/r-help
*

*>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
*

*>>> and provide commented, minimal, self-contained, reproducible code.
*

*>>>
*

Date: Sat, 19 Jul 2008 23:51:42 +0200

This particular case with a random intercept model can be handled by glmmML, by bootstrapping the p-value.

Best, Göran

On Thu, Jul 17, 2008 at 1:29 PM, Douglas Bates <bates_at_stat.wisc.edu> wrote:

> On Thu, Jul 17, 2008 at 2:50 AM, Rune Haubo <rhbc_at_imm.dtu.dk> wrote: >> 2008/7/16 Dimitris Rizopoulos <Dimitris.Rizopoulos_at_med.kuleuven.be>:

>>> well, for computing the p-value you need to use pchisq() and dchisq() (check

>> >> Yes, that seems quite natural, but then try to compare with the deviance: >> >> logLik(gm0) >> logLik(gm1) >> >> (d0 <- deviance(gm0)) >> (d1 <- deviance(gm1)) >> (LR <- d0 - d1) >> pchisq(LR, 1, lower = FALSE) >> >> Obviously the deviance in glm is *not* twice the negative >> log-likelihood as it is in glmer. The question remains which of these >> two quantities is appropriate for comparison. I am not sure exactly >> how the deviance and/or log-likelihood are calculated in glmer, but my >> feeling is that one should trust the deviance rather than the >> log-likelihoods for these purposes. This is supported by the following >> comparison: Ad an arbitrary random effect with a close-to-zero >> variance and note the deviance: >> >> tmp <- rep(1:4, each = nrow(cbpp)/4) >> gm2 <- glmer(cbind(incidence, size - incidence) ~ period + (1 | tmp), >> family = binomial, data = cbpp) >> (d2 <- deviance(gm2)) >> >> This deviance is very close to that obtained from the glm model. >> >> I have included the mixed-models mailing list in the hope that someone >> could explain how the deviance is computed in glmer and why deviances, >> but not likelihoods are comparable to glm-fits. > > In that example I think the problem may be that I have not yet written > the code to adjust the deviance of the glmer fit for the null > deviance. >

>>> However, there are some issues regarding this likelihood ratio test.

>> >> >> >> -- >> Rune Haubo Bojesen Christensen >> >> Master Student, M.Sc. Eng. >> Phone: (+45) 30 26 45 54 >> Mail: rhbc_at_imm.dtu.dk, rune.haubo_at_gmail.com >> >> DTU Informatics, Section for Statistics >> Technical University of Denmark, Build.321, DK-2800 Kgs. Lyngby, Denmark >> >> ______________________________________________ >> R-help_at_r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> > > ______________________________________________ > R-help_at_r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. >

-- Göran Broström ______________________________________________ R-help_at_r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.Received on Sat 19 Jul 2008 - 21:56:37 GMT

Archive maintained by Robert King, hosted by
the discipline of
statistics at the
University of Newcastle,
Australia.

Archive generated by hypermail 2.2.0, at Sat 19 Jul 2008 - 22:32:03 GMT.

*
Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-help.
Please read the posting
guide before posting to the list.
*