[R] glm with variance = mu+theta*mu^2?

From: Mwalili, S. M. <samuel_mwalili_at_yahoo.com>
Date: Sun 12 Jun 2005 - 21:10:15 EST


You can fit negative binomial using the 'zicounts' package library(zicounts)  

 data(teeth)
 names(teeth)

   ## c) fit negative binomial regression model     nb.zc <- zicounts(resp = dmft~.,x =~gender + age,data=teeth, distr = "NB")      nb.zc

Even,  

library(zicounts)
 library(Fahrmeir) # use cells data
data(cells)
nb.cells <- zicounts(parm=c(2,0,0,0,1),resp = y~.,x =~TNF+IFN+TNF:IFN,data=cells, distr = "NB")  nb.cells    

Samuel.

Kjetil Brinchmann Halvorsen <kjetil@acelerate.com> wrote: Spencer Graves wrote:

> How might you fit a generalized linear model (glm) with variance
> = mu+theta*mu^2 (where mu = mean of the exponential family random
> variable and theta is a parameter to be estimated)?
>
> This appears in Table 2.7 of Fahrmeir and Tutz (2001)
> Multivariate Statisticial Modeling Based on Generalized Linear Models,
> 2nd ed. (Springer, p. 60), where they compare "log-linear model fits
> to cellular differentiation data based on quasi-likelihoods" between
> variance = phi*mu (quasi-Poisson), variance = phi*mu^2
> (quasi-exponential), and variance = mu+theta*mu^2. The "quasi"
> function accepted for the family argument in "glm" generates functions
> "variance", "validmu", and "dev.resids". I can probably write
> functions to mimic the "quasi" function. However, I have two
> questions in regard to this:
>
> (1) I don't know what to use for "dev.resids". This may not
> matter for fitting. I can try a couple of different things to see if
> it matters.
>
> (2) Might someone else suggest something different, e.g., using
> something like optim to solve an appropriate quasi-score function?
>
> Thanks,
> spencer graves
>
Since nobody has answerd this I will try. The variance function mu+theta*mu^2 is the variance function
of the negative binomial family. If this variance function is used to construct a quasi-likelihood, the resulting quasilikelihood  is identical to the negative binomial likelihood, so for fitting we can simly use glm.nb from MASS, which will give the correct estimated values. However, in a quasi-likelihood setting the (co)varince estimation from
glm.nb is not appropriate, and from the book (fahrmeir ..) it seems that the estimation method used is a
sandwich estimator, so we can try the sandwich package. This works but the numerical results are somewhat different from the book. Any comments on this?

my code follows:

> library(Fahrmeir)
> library(help=Fahrmeir)
> library(MASS)
> cells.negbin <- glm(y~TNF+IFN+TNF:IFN, data=cells,
family=negative.binomial(1/0.215))
> summary(cells.negbin)

Call:
glm(formula = y ~ TNF + IFN + TNF:IFN, family = negative.binomial(1/0.215), data = cells)

Deviance Residuals:
Min 1Q Median 3Q Max
-1.6714 -0.8301 -0.2153 0.4802 1.4282

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 3.39874495 0.18791125 18.087 4.5e-10 ***

TNF 0.01616136 0.00360569 4.482 0.00075 ***
IFN 0.00935690 0.00359010 2.606 0.02296 * 
TNF:IFN -0.00005910 0.00007002 -0.844 0.41515 
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for Negative Binomial(4.6512) family taken to be 
1.012271)

Null deviance: 46.156 on 15 degrees of freedom
Residual deviance: 12.661 on 12 degrees of freedom
AIC: 155.49

Number of Fisher Scoring iterations: 5


> confint(cells.negbin)
Waiting for profiling to be done... 2.5 % 97.5 % (Intercept) 3.0383197319 3.7890206510 TNF 0.0091335087 0.0238915483 IFN 0.0023292566 0.0170195707 TNF:IFN -0.0001996824 0.0000960427
> library(sandwich)
Loading required package: zoo
> vcovHC( cells.negbin )
(Intercept) TNF IFN TNF:IFN (Intercept) 0.01176249372 -0.0001279740135 -0.0001488223001 0.00000212541999 TNF -0.00012797401 0.0000039017282 0.0000021242875 -0.00000019793137 IFN -0.00014882230 0.0000021242875 0.0000054314079 -0.00000013277626 TNF:IFN 0.00000212542 -0.0000001979314 -0.0000001327763 0.00000002370104
> cov2cor(vcovHC( cells.negbin ))
(Intercept) TNF IFN TNF:IFN (Intercept) 1.0000000 -0.5973702 -0.5887923 0.1272950 TNF -0.5973702 1.0000000 0.4614542 -0.6508822 IFN -0.5887923 0.4614542 1.0000000 -0.3700671 TNF:IFN 0.1272950 -0.6508822 -0.3700671 1.0000000
> cells.negbin2 <- glm.nb( y~TNF+IFN+TNF:IFN, data=cells)
> summary(cells.negbin)
Call: glm(formula = y ~ TNF + IFN + TNF:IFN, family = negative.binomial(1/0.215), data = cells) Deviance Residuals: Min 1Q Median 3Q Max -1.6714 -0.8301 -0.2153 0.4802 1.4282 Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) 3.39874495 0.18791125 18.087 4.5e-10 *** TNF 0.01616136 0.00360569 4.482 0.00075 *** IFN 0.00935690 0.00359010 2.606 0.02296 * TNF:IFN -0.00005910 0.00007002 -0.844 0.41515 --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 (Dispersion parameter for Negative Binomial(4.6512) family taken to be 1.012271) Null deviance: 46.156 on 15 degrees of freedom Residual deviance: 12.661 on 12 degrees of freedom AIC: 155.49 Number of Fisher Scoring iterations: 5
> confint( cells.negbin2 )
Waiting for profiling to be done... 2.5 % 97.5 % (Intercept) 3.0864669072 3.73444363850 TNF 0.0100954189 0.02265622337 IFN 0.0032778815 0.01582969419 TNF:IFN -0.0001788579 0.00007142582
> library(lmtest)
> coeftest( cells.negbin2, vcov=vcovHC(cells.negbin2, type="HC1"), df=Inf)
z test of coefficients of "negbin" object 'cells.negbin2': Estimate Std. Error z value Pr(>|z|) (Intercept) 3.400428706 0.094904107 35.8302 < 2.2e-16 *** TNF 0.016130321 0.001213671 13.2905 < 2.2e-16 *** IFN 0.009333249 0.001632518 5.7171 0.00000001084 *** TNF:IFN -0.000058798 0.000019269 -3.0514 0.002278 ** --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> coeftest( cells.negbin2, vcov=vcov(cells.negbin2), df=Inf)
z test of coefficients of "negbin" object 'cells.negbin2': Estimate Std. Error z value Pr(>|z|) (Intercept) 3.400428706 0.188480395 18.0413 < 2.2e-16 *** TNF 0.016130321 0.003573031 4.5145 0.000006348 *** IFN 0.009333249 0.003571163 2.6135 0.008962 ** TNF:IFN -0.000058798 0.000069162 -0.8501 0.395244 --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 Note different conclusions from the two last commands with respect to necessity of interaction term in model. Comments are welcome! Kjetil -- Kjetil Halvorsen. Peace is the most effective weapon of mass construction. -- Mahdi Elmandjra -- No virus found in this outgoing message. Checked by AVG Anti-Virus. ______________________________________________ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html --------------------------------- [[alternative HTML version deleted]] ______________________________________________ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Received on Sun Jun 12 21:17:40 2005

This archive was generated by hypermail 2.1.8 : Fri 03 Mar 2006 - 03:32:31 EST