Re: [R] Degrees of freedom in binomial glm

From: Prof Brian Ripley <ripley_at_stats.ox.ac.uk>
Date: Thu, 10 Apr 2008 22:20:34 +0100 (BST)

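You don't have 168 observations - 2 of them have no data (Freq = 0).
glm() does not count zero-weight cases, so the degrees of freedom are
based on 166 observations: 166 - 1 = 165 for the null model and
166 - 11 = 155 for the residual.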
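
A minimal sketch to check this on a small made-up data set (the data frame
`toy` and its columns `y`, `x`, `w` are hypothetical, not from the post).
One row has zero weight, and the degrees of freedom reported by glm()
should count only the rows with positive weight:

```r
# Toy data (hypothetical): six weighted binary rows, one of which has
# zero weight and therefore carries no data.
toy <- data.frame(
  y = factor(c("No", "Yes", "No", "Yes", "No", "Yes"),
             levels = c("No", "Yes")),
  x = c(0, 0, 1, 1, 2, 2),
  w = c(5, 3, 4, 0, 2, 6)
)

fit <- glm(y ~ x, family = binomial, weights = w, data = toy)

# Number of rows that actually contribute to the fit
n_ok <- sum(toy$w > 0)

# Null model has one parameter (intercept); fitted model has two
print(c(n_ok = n_ok,
        df.null = fit$df.null,
        df.residual = fit$df.residual))

# Same arithmetic for the posted data: 168 rows, 2 with Freq = 0,
# and 11 estimated coefficients in the summary output
n_post <- 168 - 2
print(c(null = n_post - 1, residual = n_post - 11))
```

Dropping the zero-frequency rows first, e.g. with `subset(satdata, Freq > 0)`,
should leave the fit and the reported degrees of freedom unchanged.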

On Thu, 10 Apr 2008, Giovanni Petris wrote:

>
> Hello,
>
> I am looking at the job satisfaction data below, from a problem in
> Agresti's book, and I am not sure where the degrees of freedom come
> from. The way I am fitting a binomial model, I have 168 observations,
> so in my understanding that should also be the number of fitted
> parameters in the saturated model. Since I have one intercept
> parameter, I was thinking to get 167 df for the Null model, but
> R tells me it's 165. Where does this number come from?
>
> Thanks in advance,
> Giovanni
>
>
>> ### Agresti, Problem 5.23
>> race <- c("White", "Other")
>> gender <- c("M", "F")
>> age <- c("<35", "35-44", ">44")
>> loc <- c("NE", "MidAtl", "S", "MidW", "NW", "SW", "Pac")
>> sat <- factor(c("Yes", "No"), levels = c("No", "Yes"))
>> Freq <- c(288, 60, 224, 35, 337, 70, 38, 19, 32, 22, 21, 15,
> + 177, 57, 166, 19, 172, 30, 33, 35, 11, 20, 8, 10,
> + 90, 19, 96, 12, 124, 17, 18, 13, 7, 0, 9, 1,
> + 45, 12, 42, 5, 39, 2, 6, 7, 2, 3, 2, 1,
> + 226, 88, 189, 44, 156, 70, 45, 47, 18, 13, 11, 9,
> + 128, 57, 117, 34, 73, 25, 31, 35, 3, 7, 2, 2,
> + 285, 110, 225, 53, 324, 60, 40, 66, 19, 25, 22, 11,
> + 179, 93, 141, 24, 140, 47, 25, 56, 11, 19, 2, 12,
> + 270, 176, 215, 80, 269, 110, 36, 25, 9, 11, 16, 4,
> + 180, 151, 108, 40, 136, 40, 20, 16, 7, 5, 3, 5,
> + 252, 97, 162, 47, 199, 62, 69, 45, 14, 8, 14, 2,
> + 126, 61, 72, 27, 93, 24, 27, 36, 7, 4, 5, 0,
> + 119, 62, 66, 20, 67, 25, 45, 22, 15, 10, 8, 6,
> + 58, 33, 20, 10, 21, 10, 16, 15, 10, 8, 6, 2)
>> satdata <- data.frame(Freq, expand.grid(gender=gender, age=age,
> + race=race, sat=sat, loc=loc))
>> sat.glm0 <- glm(sat ~ gender + age + race + loc, weights = Freq,
> + family = binomial, data = satdata)
>> summary(sat.glm0)
>
> Call:
> glm(formula = sat ~ gender + age + race + loc, family = binomial,
> data = satdata, weights = Freq)
>
> Deviance Residuals:
>     Min       1Q   Median       3Q      Max
> -19.456   -6.839    0.000    6.309   17.635
>
> Coefficients:
>              Estimate Std. Error z value Pr(>|z|)
> (Intercept)  0.334265   0.056491   5.917 3.28e-09 ***
> genderF     -0.180480   0.047575  -3.794 0.000149 ***
> age35-44     0.122422   0.051836   2.362 0.018191 *
> age>44       0.361610   0.051576   7.011 2.36e-12 ***
> raceOther   -0.005883   0.061605  -0.095 0.923919
> locMidAtl    0.437342   0.103821   4.212 2.53e-05 ***
> locS         0.178574   0.073033   2.445 0.014481 *
> locMidW      0.083189   0.066427   1.252 0.210449
> locNW        0.134337   0.067498   1.990 0.046563 *
> locSW        0.295874   0.073488   4.026 5.67e-05 ***
> locPac       0.425480   0.096561   4.406 1.05e-05 ***
> ---
> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>
> (Dispersion parameter for binomial family taken to be 1)
>
> Null deviance: 12987 on 165 degrees of freedom
> Residual deviance: 12880 on 155 degrees of freedom
> AIC: 12902
>
> Number of Fisher Scoring iterations: 4
>
>> str(satdata)
> 'data.frame': 168 obs. of 6 variables:
> $ Freq : num 288 60 224 35 337 70 38 19 32 22 ...
> $ gender: Factor w/ 2 levels "M","F": 1 2 1 2 1 2 1 2 1 2 ...
> $ age : Factor w/ 3 levels "<35","35-44",..: 1 1 2 2 3 3 1 1 2 2 ...
> $ race : Factor w/ 2 levels "White","Other": 1 1 1 1 1 1 2 2 2 2 ...
> $ sat : Factor w/ 2 levels "No","Yes": 2 2 2 2 2 2 2 2 2 2 ...
> $ loc : Factor w/ 7 levels "NE","MidAtl",..: 1 1 1 1 1 1 1 1 1 1 ...
>> sessionInfo()
> R version 2.6.2 (2008-02-08)
> i686-pc-linux-gnu
>
> locale:
> LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=en_US.UTF-8;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> loaded via a namespace (and not attached):
> [1] tools_2.6.2
>>
>
> --
>
> Giovanni Petris <GPetris_at_uark.edu>
> Associate Professor
> Department of Mathematical Sciences
> University of Arkansas - Fayetteville, AR 72701
> Ph: (479) 575-6324, 575-8630 (fax)
> http://definetti.uark.edu/~gpetris/
>
> ______________________________________________
> R-help_at_r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

-- 
Brian D. Ripley,                  ripley_at_stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



Received on Thu 10 Apr 2008 - 21:32:41 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Thu 10 Apr 2008 - 22:30:28 GMT.
