Re: [R] F-Tests in generalized linear mixed models (GLMM)

From: Douglas Bates <bates_at_stat.wisc.edu>
Date: Wed, 19 Nov 2008 14:36:39 -0600

On Wed, Nov 19, 2008 at 9:16 AM, Björn Stollenwerk <bjoern.stollenwerk_at_helmholtz-muenchen.de> wrote:
> Thanks! I am talking about any kind of model comparison technique.
>
> However, some p-values based on F-Tests are supplied, based on the GLMM with
> Gamma distribution and log-link function (see output below). I think this
> can be done, because the parameter estimates are approximately normal
> distributed. However, the effort to perform some tests based on more than
> one variable failed.
>
>> anova(glm1.gamma$lme)
> numDF denDF F-value p-value
> X 3 295 2418.298 <.0001

How was glm1.gamma$lme generated and what is its class? If it was fit with lme it is not a generalized linear mixed model because lme doesn't fit such models.

>>
>> anova(glm1.gamma$gam)
>
> Family: Gamma
> Link function: log
>
> Formula:
> y ~ x1 + x2
>
> Parametric Terms:
> df F p-value
> x1 1 89.01 < 2e-16
> x2 1 10.62 0.00125
>
>
>
>>> Hi!
>>>
>>
>>
>>>
>>> I would like to perform an F-Test over more than one variable within a
>>> generalized mixed model with Gamma-distribution
>>> and log-link function.
>>>
>>
>> Are you using the phrase "F-Test" as a general term for model
>> comparison techniques like the analysis of variance or as a specific
>> type of test based on ratios of mean squares and the F distribution?
>> If the latter then you may need to reconsider your question. The F
>> statistic is derived from a normal (i.e. Gaussian) distribution of the
>> response vector. In certain balanced cases it can also be applied to
>> linear mixed models. As far as I know there is not a derivation of
>> the F statistic from a Gamma distribution, either with or without
>> random effects.
>>
>>
>>>
>>> For this purpose, I use the package mgcv.
>>> Similar tests may be done using the function "anova", as for example in
>>> the
>>> case of a normal
>>> distributed response. However, if I do so, the error message
>>> "error in eval(expr, envir, enclos) : object "fixed" not found" occures.
>>> Does anyone konw why, or how to fix the problem? To illustrate the
>>> problem,
>>> I send the output of a simulated example.
>>> Thank you very much in advance.
>>>
>>> Best regards, Björn
>>>
>>> Example:
>>>
>>>
>>>>
>>>> # simulation of data
>>>> n <- 300
>>>> x1 <- sample(c(T,F), n, replace=TRUE)
>>>> x2 <- rnorm(n)
>>>> random1 <- sample(c("level1","level2","level3"), n, replace=TRUE)
>>>> true.lp <- 5 + 1.1*x1 + 0.16 * x2
>>>> mu <- exp(true.lp)
>>>> sigma <- mu * 1
>>>> a <- mu^2/sigma^2
>>>> s <- sigma^2/mu
>>>> y <- rgamma(n, shape=a, scale=s)
>>>>
>>>> library(mgcv)
>>>>
>>>> # a mixed model without Gamma-distribution and without log-link works as
>>>> follows:
>>>> glmm1 <- gamm(y ~ x1 + x2, random=list(random1 = ~1))
>>>> glmm2 <- gamm(y ~ 1, random=list(random1 = ~1))
>>>>
>>>> anova(glmm1$lme)
>>>>
>>>
>>> numDF denDF F-value p-value
>>> X 3 295 103.4730 <.0001
>>>
>>>>
>>>> anova(glmm2$lme, glmm1$lme)
>>>>
>>>
>>> Model df AIC BIC logLik Test L.Ratio p-value
>>> glmm2$lme 1 3 4340.060 4351.172 -2167.030
>>> glmm1$lme 2 5 4292.517 4311.036 -2141.258 1 vs 2 51.54367 <.0001
>>>
>>>>
>>>> # a linear model also works, though no p-value is reported
>>>> glm1 <- gam(y ~ x1 + x2)
>>>> glm2 <- gam(y ~ 1)
>>>> anova(glm1)
>>>>
>>>
>>> Family: gaussian
>>> Link function: identity
>>>
>>> Formula:
>>> y ~ x1 + x2
>>>
>>> Parametric Terms:
>>> df F p-value
>>> x1 1 45.58 7.69e-11
>>> x2 1 13.96 0.000224
>>>
>>>
>>>>
>>>> anova(glm2, glm1)
>>>>
>>>
>>> Analysis of Deviance Table
>>>
>>> Model 1: y ~ 1
>>> Model 2: y ~ x1 + x2
>>> Resid. Df Resid. Dev Df Deviance
>>> 1 299 33024943
>>> 2 297 27811536 2 5213407
>>>
>>>>
>>>> # general linear models (GLM) with Gamma and log-link don't work
>>>> glm1.gamma <- gam(y ~ x1 + x2, family=Gamma(link="log"))
>>>> glm2.gamma <- gam(y ~ 1, family=Gamma(link="log"))
>>>> anova(glm1.gamma)
>>>>
>>>
>>> Family: Gamma
>>> Link function: log
>>>
>>> Formula:
>>> y ~ x1 + x2
>>>
>>> Parametric Terms:
>>> df F p-value
>>> x1 1 59.98 1.52e-13
>>> x2 1 16.06 7.78e-05
>>>
>>>
>>>>
>>>> anova(glm2.gamma, glm1.gamma)
>>>>
>>>
>>> Analysis of Deviance Table
>>>
>>> Model 1: y ~ 1
>>> Model 2: y ~ x1 + x2
>>> Resid. Df Resid. Dev Df Deviance
>>> 1 299 413.59
>>> 2 297 343.90 2 69.69
>>>
>>>>
>>>> # neither do general linear mixed models (GLMM)
>>>>
>>>> glm1.gamma <- gamm(y ~ x1 + x2, random=list(random1 = ~1),
>>>> family=Gamma(link="log"))
>>>>
>>>
>>> Maximum number of PQL iterations: 20
>>> iteration 1
>>>
>>>>
>>>> glm2.gamma <- gamm(y ~ 1, random=list(random1 = ~1),
>>>> family=Gamma(link="log"))
>>>>
>>>
>>> Maximum number of PQL iterations: 20
>>> iteration 1
>>>
>>>>
>>>> summary(glm1.gamma$lme)
>>>>
>>>
>>> Linear mixed-effects model fit by maximum likelihood
>>> Data: data
>>> AIC BIC logLik
>>> 847.722 866.241 -418.861
>>>
>>> Random effects:
>>> Formula: ~1 | random1
>>> (Intercept) Residual
>>> StdDev: 2.954058e-05 0.9775214
>>>
>>> Variance function:
>>> Structure: fixed weights
>>> Formula: ~invwt
>>> Fixed effects: list(fixed)
>>> Value Std.Error DF t-value p-value
>>> X(Intercept) 5.066376 0.08363392 295 60.57801 0e+00
>>> Xx1TRUE 0.884486 0.11421762 295 7.74387 0e+00
>>> Xx2 0.234537 0.05851689 295 4.00802 1e-04
>>> Correlation:
>>> X(Int) X1TRUE
>>> Xx1TRUE -0.733
>>> Xx2 -0.008 0.085
>>>
>>> Standardized Within-Group Residuals:
>>> Min Q1 Med Q3 Max
>>> -1.0207671 -0.6911364 -0.2899184 0.3665161 4.9603830
>>>
>>> Number of Observations: 300
>>> Number of Groups: 3
>>>
>>>>
>>>> summary(glm1.gamma$gam)
>>>>
>>>
>>> Family: Gamma
>>> Link function: log
>>>
>>> Formula:
>>> y ~ x1 + x2
>>>
>>> Parametric coefficients:
>>> Estimate Std. Error t value Pr(>|t|)
>>> (Intercept) 5.06638 0.08363 60.578 < 2e-16 ***
>>> x1TRUE 0.88449 0.11422 7.744 1.53e-13 ***
>>> x2 0.23454 0.05852 4.008 7.75e-05 ***
>>> ---
>>> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>>>
>>>
>>> R-sq.(adj) = 0.171 Scale est. = 0.95555 n = 300
>>>
>>>>
>>>> anova(glm1.gamma$lme)
>>>>
>>>
>>> numDF denDF F-value p-value
>>> X 3 295 3187.192 <.0001
>>>
>>>>
>>>> anova(glm2.gamma$lme, glm1.gamma$lme)
>>>>
>>>
>>> Fehler in eval(expr, envir, enclos) : objekt "fixed" nicht gefunden
>>> ______________________________________________
>>> R-help_at_r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>>
>>
>>
>
>
> --
> Dr. rer. nat. Björn Stollenwerk
>
> Helmholtz Zentrum München (GmbH)
> Institut für Gesundheitsökonomie und
> Management im Gesundheitswesen
> Ingolstädter Landstraße 1
> D-85764 Neuherberg
>
> AG2: Ökonomische Evaluation
>
> Tel. +49 (0)89 3187 4161
> Fax +49 (0)89 3187 3375
>
> bjoern.stollenwerk_at_helmholtz-muenchen.de
> www.helmholtz-muenchen.de/igm
>
>



R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Received on Wed 19 Nov 2008 - 20:38:52 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Thu 20 Nov 2008 - 08:30:27 GMT.

Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-help. Please read the posting guide before posting to the list.

list of date sections of archive