Re: [R] Precision in R

From: <adelmaas_at_musc.edu>
Date: Fri 23 Jul 2004 - 00:02:28 EST

On 22 Jul 2004, at 06:09, r-help-request@stat.math.ethz.ch wrote:

> Message: 5
> Date: Wed, 21 Jul 2004 13:48:53 +0200
> From: bhx2@mevik.net (Bjørn-Helge Mevik)
> Subject: Re: [R] Precision in R
> To: r-help@stat.math.ethz.ch
> Message-ID: <m0llhdbmxm.fsf@bar.nemo-project.org>
> Content-Type: text/plain; charset=iso-8859-1
>
> Since you didn't say anything about _what_ you did, either in SAS or
> R, my first thought was: Have you checked that you use the same
> parametrization of the models in R and SAS?

Well, I'm running Poisson regressions for the incidence of childhood acute lymphoblastic leukemia in a set of US counties (in this data set, for some reason, Hawaii counts as a single county). Separate models are fitted for males and females. The independent variables of interest are race ("white", "black", "other") and, in the model for males only, -log(proportion of people in the county who moved between 1985 and 1990), AKA "minus log proportion moved" or "MLPM".

SAS code:
> title "Males";
> proc genmod data=males order=formatted;
> class race sex;
> model observed = race mlpm*mlpm*mlpm mlpm*mlpm mlpm /
> dist=poisson link=log offset=lPYAR covb;
>
> run;
>
> title "Females";
> proc genmod data=females order=formatted;
> class race sex;
> model observed = race / dist=poisson link=log offset=lPYAR;
> run;

R code:
> Female.model <- glm(Observed ~ Black + Other, family =
> poisson(link=log), offset=log(PYAR), data=Females)
>
> Male.model <- glm(Observed ~ Black + Other +
> I(Minus.log.proportion.moved^3) + I(Minus.log.proportion.moved^2) +
> Minus.log.proportion.moved, family = poisson(link=log),
> offset=log(PYAR), data=Males)

Race is included differently in the two models because I want both programs to use "whites" as the referent group (I have more data from them than from "blacks" and "others").
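
For anyone wanting to check that the two race codings really are the same parametrization, here is a minimal sketch on simulated data. All variable names and values in it are made up for illustration and are not from my data set. It compares explicit indicator variables with a factor whose reference level is set to "W" via relevel(). R's default treatment contrasts take the first level as the reference, whereas the GENMOD output below shows the last level (W) with a zero estimate.

```r
# Simulated data; every name and value here is hypothetical.
set.seed(1)
n <- 300
race <- factor(sample(c("B", "O", "W"), n, replace = TRUE,
                      prob = c(0.2, 0.2, 0.6)))
pyar <- runif(n, 1e4, 1e5)
observed <- rpois(n, lambda = pyar * exp(-9.5 - 0.7 * (race == "B")))
d <- data.frame(observed, race, pyar,
                Black = race == "B", Other = race == "O")

# Parametrization 1: explicit logical indicators for "B" and "O".
fit_dummy <- glm(observed ~ Black + Other, family = poisson(link = log),
                 offset = log(pyar), data = d)

# Parametrization 2: a factor with "W" moved to the reference position.
d$race_w <- relevel(d$race, ref = "W")
fit_factor <- glm(observed ~ race_w, family = poisson(link = log),
                  offset = log(pyar), data = d)

# Both fits estimate intercept, B-vs-W and O-vs-W, in that order.
coef_dummy <- unname(coef(fit_dummy))
coef_factor <- unname(coef(fit_factor))
print(cbind(dummy = coef_dummy, factor = coef_factor))
```

The two columns should agree up to numerical tolerance, so either coding gives a fit with whites as the referent group.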

SAS results:
> Males                                  12:08 Wednesday, April 21, 2004 173
>
> The GENMOD Procedure
>
> Model Information
>
> Data Set WORK.MALES
> Distribution Poisson
> Link Function Log
> Dependent Variable Observed
> Offset Variable lPYAR
> Observations Used 526
>
>
> Class Level Information
>
> Class Levels Values
>
> Race 3 B O W
> Sex 1 M
>
>
> Parameter Information
>
> Parameter Effect Race
>
> Prm1 Intercept
> Prm2 Race B
> Prm3 Race O
> Prm4 Race W
> Prm5 mlPM*mlPM*mlPM
> Prm6 mlPM*mlPM
> Prm7 mlPM
>
>
> Criteria For Assessing Goodness Of Fit
>
> Criterion                 DF           Value        Value/DF
>
> Deviance                 520        239.5025          0.4606
> Scaled Deviance          520        239.5025          0.4606
> Pearson Chi-Square       520        360.5677          0.6934
> Scaled Pearson X2        520        360.5677          0.6934
> Log Likelihood                      320.5910
>
>
> Algorithm converged.
>
>
> Estimated Covariance Matrix
>
>              Prm1        Prm2        Prm3        Prm5        Prm6        Prm7
>
> Prm1      9.25071    -0.01841     0.04877   -13.71192    37.88798   -33.20414
> Prm2     -0.01841     0.03392    0.002521     0.03045    -0.07720     0.06191
> Prm3      0.04877    0.002521     0.02027    -0.07622     0.21457    -0.18748
> Prm5    -13.71192     0.03045    -0.07622    22.11044   -59.26190    50.49281
> Prm6     37.88798    -0.07720     0.21457   -59.26190      160.70     -138.32
> Prm7    -33.20414     0.06191    -0.18748    50.49281     -138.32      120.18
>
>
> Analysis Of Parameter Estimates
>
>                                 Standard Wald 95% Confidence         Chi-
> Parameter       DF   Estimate      Error        Limits             Square Pr > ChiSq
>
> Intercept        1   -15.8294     3.0415   -21.7907    -9.8682      27.09     <.0001
> Race B           1    -0.6646     0.1842    -1.0256    -0.3036      13.02     0.0003
> Race O           1    -0.1058     0.1424    -0.3848     0.1733       0.55     0.4575
> Race W           0     0.0000     0.0000     0.0000     0.0000          .          .
> mlPM*mlPM*mlPM   1    15.4205     4.7022     6.2044    24.6366      10.75     0.0010
> mlPM*mlPM        1   -36.8423    12.6768   -61.6884   -11.9961       8.45     0.0037
> mlPM             1    27.2989    10.9627     5.8124    48.7855       6.20     0.0128
> Scale            0     1.0000     0.0000     1.0000     1.0000
>
> NOTE: The scale parameter was held fixed.
>
>
> Females                                12:08 Wednesday, April 21, 2004 175
>
> The GENMOD Procedure
>
> Model Information
>
> Data Set WORK.FEMALES
> Distribution Poisson
> Link Function Log
> Dependent Variable Observed
> Offset Variable lPYAR
> Observations Used 534
>
>
> Class Level Information
>
> Class Levels Values
>
> Race 3 B O W
> Sex 1 F
>
>
> Criteria For Assessing Goodness Of Fit
>
> Criterion                 DF           Value        Value/DF
>
> Deviance                 531        245.2305          0.4618
> Scaled Deviance          531        245.2305          0.4618
> Pearson Chi-Square       531        484.8219          0.9130
> Scaled Pearson X2        531        484.8219          0.9130
> Log Likelihood                      183.8640
>
>
> Algorithm converged.
>
>
> Analysis Of Parameter Estimates
>
>                                 Standard Wald 95% Confidence         Chi-
> Parameter       DF   Estimate      Error        Limits             Square Pr > ChiSq
>
> Intercept        1    -9.7630     0.0577    -9.8762    -9.6499    28595.0     <.0001
> Race B           1    -1.0917     0.2493    -1.5803    -0.6030      19.17     <.0001
> Race O           1     0.0014     0.1569    -0.3061     0.3088       0.00     0.9931
> Race W           0     0.0000     0.0000     0.0000     0.0000          .          .
> Scale            0     1.0000     0.0000     1.0000     1.0000
>
> NOTE: The scale parameter was held fixed.

R results:
> > summary(Female.model)
>
> Call:
> glm(formula = Observed ~ Black + Other, family = poisson(link = log),
> data = Females, offset = log(PYAR))
>
> Deviance Residuals:
> Min 1Q Median 3Q Max
> -2.4060 -0.5315 -0.1109 -0.0284 2.6520
>
> Coefficients:
> Estimate Std. Error z value Pr(>|z|)
> (Intercept) -9.763025 0.057735 -169.101 < 2e-16 ***
> BlackTRUE -1.091679 0.249309 -4.379 1.19e-05 ***
> OtherTRUE 0.001363 0.156876 0.009 0.993
> ---
> Signif. codes: 0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1
>
> (Dispersion parameter for poisson family taken to be 1)
>
> Null deviance: 272.49 on 533 degrees of freedom
> Residual deviance: 245.23 on 531 degrees of freedom
> AIC: 520.71
>
> Number of Fisher Scoring iterations: 7
>
> > summary(Male.model)
>
> Call:
> glm(formula = Observed ~ Black + Other +
> I(Minus.log.proportion.moved^3) +
> I(Minus.log.proportion.moved^2) + Minus.log.proportion.moved,
> family = poisson(link = log), data = Males, offset = log(PYAR))
>
> Deviance Residuals:
> Min 1Q Median 3Q Max
> -2.24568 -0.49137 -0.10197 -0.03262 3.88346
>
> Coefficients:
>                                  Estimate Std. Error z value Pr(>|z|)
> (Intercept)                     -16.39065    3.31644  -4.942 7.72e-07 ***
> BlackTRUE                        -0.66461    0.18418  -3.608 0.000308 ***
> OtherTRUE                        -0.09513    0.14278  -0.666 0.505245
> I(Minus.log.proportion.moved^3)  24.39920    7.51188   3.248 0.001162 **
> I(Minus.log.proportion.moved^2) -51.17011   17.75857  -2.881 0.003959 **
> Minus.log.proportion.moved       33.48773   13.52491   2.476 0.013286 *
> ---
> Signif. codes: 0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1
>
> (Dispersion parameter for poisson family taken to be 1)
>
> Null deviance: 278.68 on 525 degrees of freedom
> Residual deviance: 240.54 on 520 degrees of freedom
> AIC: 582.68
>
> Number of Fisher Scoring iterations: 6
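
To save some of the scrolling, the male estimates and standard errors from the two printouts can be put side by side. The sketch below just retypes the numbers shown above. The data-frame and column names are my own invention, and the "agrees" threshold is an arbitrary choice tied to the four decimals SAS prints.

```r
# Male-model estimates and standard errors copied from the printouts above.
sas <- data.frame(
  term     = c("Intercept", "Race B", "Race O", "mlPM^3", "mlPM^2", "mlPM"),
  estimate = c(-15.8294, -0.6646, -0.1058, 15.4205, -36.8423, 27.2989),
  se       = c(3.0415, 0.1842, 0.1424, 4.7022, 12.6768, 10.9627),
  stringsAsFactors = FALSE
)
r_fit <- data.frame(
  estimate = c(-16.39065, -0.66461, -0.09513, 24.39920, -51.17011, 33.48773),
  se       = c(3.31644, 0.18418, 0.14278, 7.51188, 17.75857, 13.52491)
)

comparison <- data.frame(
  term       = sas$term,
  diff_est   = r_fit$estimate - sas$estimate,              # R minus SAS
  diff_in_se = (r_fit$estimate - sas$estimate) / sas$se,   # in SAS SE units
  stringsAsFactors = FALSE
)
# Flag terms whose estimates agree to the four decimals SAS prints.
comparison$agrees <- abs(comparison$diff_est) < 5e-4
print(comparison)
```

With these numbers only the Race B row is flagged as agreeing. The intercept, Race O and all three MLPM terms differ between the programs.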

Now, you'll notice (after scrolling up and down a lot) that the models for females have identical results, but the models for males have different results. Anybody have any ideas why I'm getting a difference and which program (if either) is giving me the right answer? Thanks in advance again.
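
One thing that may help narrow this down: the residual deviances also differ (239.5025 in SAS, 240.54 in R, both on 520 degrees of freedom). A cubic in the same variable spans the same model space however it is written, so a difference in parametrization alone should not change the deviance. The sketch below illustrates that on simulated data, comparing raw powers with poly(), and then shows that a small change to the covariate itself (here rounding, chosen purely as an example) does change it. All names and values are hypothetical, and this is only one possible explanation, not a diagnosis of my data.

```r
# Simulated data; x stands in for "minus log proportion moved".
set.seed(2)
n <- 400
x <- runif(n, 0.3, 1.2)
pyar <- runif(n, 1e4, 1e5)
observed <- rpois(n, pyar * exp(-10 + 2 * x - 1.5 * x^2 + 0.5 * x^3))
d <- data.frame(observed, x, pyar)

# Same cubic model, two parametrizations: raw powers vs orthogonal polynomials.
fit_raw <- glm(observed ~ I(x^3) + I(x^2) + x, family = poisson(link = log),
               offset = log(pyar), data = d)
fit_orth <- glm(observed ~ poly(x, 3), family = poisson(link = log),
                offset = log(pyar), data = d)
dev_raw <- deviance(fit_raw)
dev_orth <- deviance(fit_orth)

# Same model formula, but the covariate is rounded before fitting.
d_rounded <- transform(d, x = round(x, 1))
fit_rounded <- glm(observed ~ I(x^3) + I(x^2) + x,
                   family = poisson(link = log),
                   offset = log(pyar), data = d_rounded)
dev_rounded <- deviance(fit_rounded)

print(c(raw = dev_raw, orthogonal = dev_orth, rounded_x = dev_rounded))
```

If the first two deviances match and the third does not, that points toward comparing the input values themselves (for example with summary() on the MLPM column in each program) rather than the model statements.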

Aaron



Aaron Solomon (ben Saul Joseph) Adelman
E-mail: adelmaas@musc.edu
Web site: http://people.musc.edu/~adelmaas/
AOL Instant Messenger & Yahoo! Messenger: Hiergargo
AIM chat-room (preferred): Adelmania

R-help@stat.math.ethz.ch mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Received on Fri Jul 23 00:10:26 2004
