Re: [R] F-Tests in generalized linear mixed models (GLMM)

From: Björn Stollenwerk <bjoern.stollenwerk_at_helmholtz-muenchen.de>
Date: Thu, 20 Nov 2008 08:55:42 +0100

It was fitted with the function gamm (generalized additive mixed model) of the package mgcv:

glm1.gamma <- gamm(y ~ x1 + x2, random=list(random1 = ~1), family=Gamma(link="log"))

However, it's just a generalized linear mixed model (GLMM), because "smooth" terms are missing.

Douglas Bates schrieb:

> On Wed, Nov 19, 2008 at 9:16 AM, Björn Stollenwerk
> <bjoern.stollenwerk_at_helmholtz-muenchen.de> wrote:
>   

>> Thanks! I am talking about any kind of model comparison technique.
>>
>> However, some p-values based on F-Tests are supplied, based on the GLMM with
>> Gamma distribution and log-link function (see output below). I think this
>> can be done, because the parameter estimates are approximately normal
>> distributed. However, the effort to perform some tests based on more than
>> one variable failed.
>>
>>
>>> anova(glm1.gamma$lme)
>>>
>> numDF denDF F-value p-value
>> X 3 295 2418.298 <.0001
>>
>
> How was glm1.gamma$lme generated and what is its class?  If it was fit
> with lme it is not a generalized linear mixed model because lme
> doesn't fit such models.
>
>   

>>> anova(glm1.gamma$gam)
>>>
>> Family: Gamma
>> Link function: log
>>
>> Formula:
>> y ~ x1 + x2
>>
>> Parametric Terms:
>> df F p-value
>> x1 1 89.01 < 2e-16
>> x2 1 10.62 0.00125
>>
>>
>>
>>
>>>> Hi!
>>>>
>>>>
>>>
>>>> I would like to perform an F-Test over more than one variable within a
>>>> generalized mixed model with Gamma-distribution
>>>> and log-link function.
>>>>
>>>>
>>> Are you using the phrase "F-Test" as a general term for model
>>> comparison techniques like the analysis of variance or as a specific
>>> type of test based on ratios of mean squares and the F distribution?
>>> If the latter then you may need to reconsider your question. The F
>>> statistic is derived from a normal (i.e. Gaussian) distribution of the
>>> response vector. In certain balanced cases it can also be applied to
>>> linear mixed models. As far as I know there is not a derivation of
>>> the F statistic from a Gamma distribution, either with or without
>>> random effects.
>>>
>>>
>>>
>>>> For this purpose, I use the package mgcv.
>>>> Similar tests may be done using the function "anova", as for example in
>>>> the
>>>> case of a normal
>>>> distributed response. However, if I do so, the error message
>>>> "error in eval(expr, envir, enclos) : object "fixed" not found" occures.
>>>> Does anyone konw why, or how to fix the problem? To illustrate the
>>>> problem,
>>>> I send the output of a simulated example.
>>>> Thank you very much in advance.
>>>>
>>>> Best regards, Björn
>>>>
>>>> Example:
>>>>
>>>>
>>>>
>>>>> # simulation of data
>>>>> n <- 300
>>>>> x1 <- sample(c(T,F), n, replace=TRUE)
>>>>> x2 <- rnorm(n)
>>>>> random1 <- sample(c("level1","level2","level3"), n, replace=TRUE)
>>>>> true.lp <- 5 + 1.1*x1 + 0.16 * x2
>>>>> mu <- exp(true.lp)
>>>>> sigma <- mu * 1
>>>>> a <- mu^2/sigma^2
>>>>> s <- sigma^2/mu
>>>>> y <- rgamma(n, shape=a, scale=s)
>>>>>
>>>>> library(mgcv)
>>>>>
>>>>> # a mixed model without Gamma-distribution and without log-link works as
>>>>> follows:
>>>>> glmm1 <- gamm(y ~ x1 + x2, random=list(random1 = ~1))
>>>>> glmm2 <- gamm(y ~ 1, random=list(random1 = ~1))
>>>>>
>>>>> anova(glmm1$lme)
>>>>>
>>>>>
>>>> numDF denDF F-value p-value
>>>> X 3 295 103.4730 <.0001
>>>>
>>>>
>>>>> anova(glmm2$lme, glmm1$lme)
>>>>>
>>>>>
>>>> Model df AIC BIC logLik Test L.Ratio p-value
>>>> glmm2$lme 1 3 4340.060 4351.172 -2167.030
>>>> glmm1$lme 2 5 4292.517 4311.036 -2141.258 1 vs 2 51.54367 <.0001
>>>>
>>>>
>>>>> # a linear model also works, though no p-value is reported
>>>>> glm1 <- gam(y ~ x1 + x2)
>>>>> glm2 <- gam(y ~ 1)
>>>>> anova(glm1)
>>>>>
>>>>>
>>>> Family: gaussian
>>>> Link function: identity
>>>>
>>>> Formula:
>>>> y ~ x1 + x2
>>>>
>>>> Parametric Terms:
>>>> df F p-value
>>>> x1 1 45.58 7.69e-11
>>>> x2 1 13.96 0.000224
>>>>
>>>>
>>>>
>>>>> anova(glm2, glm1)
>>>>>
>>>>>
>>>> Analysis of Deviance Table
>>>>
>>>> Model 1: y ~ 1
>>>> Model 2: y ~ x1 + x2
>>>> Resid. Df Resid. Dev Df Deviance
>>>> 1 299 33024943
>>>> 2 297 27811536 2 5213407
>>>>
>>>>
>>>>> # general linear models (GLM) with Gamma and log-link don't work
>>>>> glm1.gamma <- gam(y ~ x1 + x2, family=Gamma(link="log"))
>>>>> glm2.gamma <- gam(y ~ 1, family=Gamma(link="log"))
>>>>> anova(glm1.gamma)
>>>>>
>>>>>
>>>> Family: Gamma
>>>> Link function: log
>>>>
>>>> Formula:
>>>> y ~ x1 + x2
>>>>
>>>> Parametric Terms:
>>>> df F p-value
>>>> x1 1 59.98 1.52e-13
>>>> x2 1 16.06 7.78e-05
>>>>
>>>>
>>>>
>>>>> anova(glm2.gamma, glm1.gamma)
>>>>>
>>>>>
>>>> Analysis of Deviance Table
>>>>
>>>> Model 1: y ~ 1
>>>> Model 2: y ~ x1 + x2
>>>> Resid. Df Resid. Dev Df Deviance
>>>> 1 299 413.59
>>>> 2 297 343.90 2 69.69
>>>>
>>>>
>>>>> # neither do general linear mixed models (GLMM)
>>>>>
>>>>> glm1.gamma <- gamm(y ~ x1 + x2, random=list(random1 = ~1),
>>>>> family=Gamma(link="log"))
>>>>>
>>>>>
>>>> Maximum number of PQL iterations: 20
>>>> iteration 1
>>>>
>>>>
>>>>> glm2.gamma <- gamm(y ~ 1, random=list(random1 = ~1),
>>>>> family=Gamma(link="log"))
>>>>>
>>>>>
>>>> Maximum number of PQL iterations: 20
>>>> iteration 1
>>>>
>>>>
>>>>> summary(glm1.gamma$lme)
>>>>>
>>>>>
>>>> Linear mixed-effects model fit by maximum likelihood
>>>> Data: data
>>>> AIC BIC logLik
>>>> 847.722 866.241 -418.861
>>>>
>>>> Random effects:
>>>> Formula: ~1 | random1
>>>> (Intercept) Residual
>>>> StdDev: 2.954058e-05 0.9775214
>>>>
>>>> Variance function:
>>>> Structure: fixed weights
>>>> Formula: ~invwt
>>>> Fixed effects: list(fixed)
>>>> Value Std.Error DF t-value p-value
>>>> X(Intercept) 5.066376 0.08363392 295 60.57801 0e+00
>>>> Xx1TRUE 0.884486 0.11421762 295 7.74387 0e+00
>>>> Xx2 0.234537 0.05851689 295 4.00802 1e-04
>>>> Correlation:
>>>> X(Int) X1TRUE
>>>> Xx1TRUE -0.733
>>>> Xx2 -0.008 0.085
>>>>
>>>> Standardized Within-Group Residuals:
>>>> Min Q1 Med Q3 Max
>>>> -1.0207671 -0.6911364 -0.2899184 0.3665161 4.9603830
>>>>
>>>> Number of Observations: 300
>>>> Number of Groups: 3
>>>>
>>>>
>>>>> summary(glm1.gamma$gam)
>>>>>
>>>>>
>>>> Family: Gamma
>>>> Link function: log
>>>>
>>>> Formula:
>>>> y ~ x1 + x2
>>>>
>>>> Parametric coefficients:
>>>> Estimate Std. Error t value Pr(>|t|)
>>>> (Intercept) 5.06638 0.08363 60.578 < 2e-16 ***
>>>> x1TRUE 0.88449 0.11422 7.744 1.53e-13 ***
>>>> x2 0.23454 0.05852 4.008 7.75e-05 ***
>>>> ---
>>>> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>>>>
>>>>
>>>> R-sq.(adj) = 0.171 Scale est. = 0.95555 n = 300
>>>>
>>>>
>>>>> anova(glm1.gamma$lme)
>>>>>
>>>>>
>>>> numDF denDF F-value p-value
>>>> X 3 295 3187.192 <.0001
>>>>
>>>>
>>>>> anova(glm2.gamma$lme, glm1.gamma$lme)
>>>>>
>>>>>
>>>> Fehler in eval(expr, envir, enclos) : objekt "fixed" nicht gefunden
>>>> ______________________________________________
>>>> R-help_at_r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide
>>>> http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>
>>>>
>>>>
>>>
>> --
>> Dr. rer. nat. Björn Stollenwerk
>>
>> Helmholtz Zentrum München (GmbH)
>> Institut für Gesundheitsökonomie und
>> Management im Gesundheitswesen
>> Ingolstädter Landstraße 1
>> D-85764 Neuherberg
>>
>> AG2: Ökonomische Evaluation
>>
>> Tel. +49 (0)89 3187 4161
>> Fax +49 (0)89 3187 3375
>>
>> bjoern.stollenwerk_at_helmholtz-muenchen.de
>> www.helmholtz-muenchen.de/igm
>>
>>
>>
>
>
-- 
Dr. rer. nat. Björn Stollenwerk

Helmholtz Zentrum München (GmbH)
Institut für Gesundheitsökonomie und
Management im Gesundheitswesen
Ingolstädter Landstraße 1
D-85764 Neuherberg

AG2: Ökonomische Evaluation

Tel. +49 (0)89 3187 4161
Fax  +49 (0)89 3187 3375

bjoern.stollenwerk_at_helmholtz-muenchen.de
www.helmholtz-muenchen.de/igm

______________________________________________
R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Received on Thu 20 Nov 2008 - 08:07:47 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Thu 20 Nov 2008 - 08:30:27 GMT.

Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-help. Please read the posting guide before posting to the list.

list of date sections of archive