From: David Winsemius <dwinsemius_at_comcast.net>

Date: Mon, 17 Mar 2008 01:11:43 +0000 (UTC)

Date: Mon, 17 Mar 2008 01:11:43 +0000 (UTC)

I am doing a reanalysis of results that have previously been published. My hope was to demonstrate the value of adoption of more modern regression methods in preference to the traditional approach of univariate stratification. I have encountered a puzzle regarding differences between I thought would be two equivalent analyses. Using a single factor, I compare poisson models with and without the intercept term. As expected, the estimated coefficient and std error of the estimate are the same for the intercept and the base level of the factor in the two models. The sum of the intercept with each coefficient is equal to the individual factor coefficients in the nointercept model. The overall model fit statistics are the same. However, the std errors for the other factors are much smaller in the model without the intercept.

> with(bmi, table(BMI,Actual_Deaths))

Actual_Deaths BMI 0 1 2 3 4 5 6 7 11 13 18.5-24.9 311 21 1 0 0 0 0 0 0 0 15.0-18.4 353 33 8 2 0 1 0 0 0 0 25.0-29.9 367 19 0 0 0 0 0 0 0 0 30.0-34.9 349 95 39 17 8 9 3 4 0 135.0-39.9 351 90 50 21 20 3 3 2 1 0 40.0-55.0 386 60 15 7 4 0 0 1 0 0

> bmi.base <- with(bmi,glm(Actual_Deaths ~

BMI + offset(log( MMI_VBT_Expected)), family="poisson"))

> summary(bmi.base)

Call:

glm(formula = Actual_Deaths ~ BMI + offset(log(MMI_VBT_Expected)),

family = "poisson")

Deviance Residuals:

Min 1Q Median 3Q Max -2.6385 -0.5245 -0.2722 -0.1041 3.4426

Coefficients:

Estimate Std. Error z value Pr(>|z|)

(Intercept) 0.42920 0.20851 2.058 0.0395 * BMI15.0-18.4 0.31608 0.24524 1.289 0.1974 BMI25.0-29.9 -0.22795 0.30999 -0.735 0.4621 BMI30.0-34.9 -0.09669 0.21506 -0.450 0.6530 BMI35.0-39.9 -0.04290 0.21455 -0.200 0.8415 BMI40.0-55.0 0.19348 0.22569 0.857 0.3913

--- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 (Dispersion parameter for poisson family taken to be 1) Null deviance: 1485.0 on 2654 degrees of freedom Residual deviance: 1470.0 on 2649 degrees of freedom AIC: 2760.9 Number of Fisher Scoring iterations: 6 -----------------------------------------------------Received on Mon 17 Mar 2008 - 01:14:31 GMT

> bmi.no.int <- with(bmi,glm(Actual_Deaths ~

BMI + offset(log(MMI_VBT_Expected)) -1 , family="poisson"))

> summary(bmi.no.int)

Call: glm(formula = Actual_Deaths ~ BMI + offset(log(MMI_VBT_Expected)) - 1, family = "poisson") Deviance Residuals: Min 1Q Median 3Q Max -2.6385 -0.5245 -0.2722 -0.1041 3.4426 Coefficients: Estimate Std. Error z value Pr(>|z|) BMI18.5-24.9 0.42920 0.20851 2.058 0.0395 * BMI15.0-18.4 0.74529 0.12910 5.773 7.79e-09 *** BMI25.0-29.9 0.20125 0.22939 0.877 0.3803 BMI30.0-34.9 0.33251 0.05270 6.309 2.81e-10 *** BMI35.0-39.9 0.38631 0.05057 7.639 2.19e-14 *** BMI40.0-55.0 0.62268 0.08639 7.208 5.67e-13 *** --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 (Dispersion parameter for poisson family taken to be 1) Null deviance: 1630.7 on 2655 degrees of freedom Residual deviance: 1470.0 on 2649 degrees of freedom AIC: 2760.9 It does look statistically sensible that an estimate for BMI="40.0- 55.0" with over 100 events should have a much narrower CI than BMI="18.5-24.9" which only has 23 events. Is the model with an intercept term somehow "spreading around uncertainty" that really "belongs" to the reference category with its relatively low number of events? -- David Winsemius ______________________________________________ R-help_at_r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.

Archive maintained by Robert King, hosted by
the discipline of
statistics at the
University of Newcastle,
Australia.

Archive generated by hypermail 2.2.0, at Mon 17 Mar 2008 - 07:30:21 GMT.

*
Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-help.
Please read the posting
guide before posting to the list.
*