From: Peter Dalgaard <P.Dalgaard_at_biostat.ku.dk>

Date: Fri, 07 Dec 2007 15:14:52 +0100

Bin Yue wrote:

> Dear all:
>
> "predict.glm" provides an example of logistic regression when the
> response variable is a two-column matrix. I find something paradoxical
> about the degrees of freedom.
> > summary(budworm.lg)
>
> Call:
> glm(formula = SF ~ sex * ldose, family = binomial)
>
> Deviance Residuals:
>      Min        1Q    Median        3Q       Max
> -1.39849  -0.32094  -0.07592   0.38220   1.10375
>
> Coefficients:
>             Estimate Std. Error z value Pr(>|z|)
> (Intercept)  -2.9935     0.5527  -5.416 6.09e-08 ***
> sexM          0.1750     0.7783   0.225    0.822
> ldose         0.9060     0.1671   5.422 5.89e-08 ***
> sexM:ldose    0.3529     0.2700   1.307    0.191
> ---
> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
> (Dispersion parameter for binomial family taken to be 1)
>
>     Null deviance: 124.8756  on 11  degrees of freedom
> Residual deviance:   4.9937  on  8  degrees of freedom
> AIC: 43.104
>
> Number of Fisher Scoring iterations: 4
>
> This is the data set used in the regression:
>    numdead numalive sex ldose
> 1        1       19   M     0
> 2        4       16   M     1
> 3        9       11   M     2
> 4       13        7   M     3
> 5       18        2   M     4
> 6       20        0   M     5
> 7        0       20   F     0
> 8        2       18   F     1
> 9        6       14   F     2
> 10      10       10   F     3
> 11      12        8   F     4
> 12      16        4   F     5
>
> The residual degrees of freedom are 8. Each row in the example is treated
> as one observation. If I expand it into a three-column data.frame, the
> first column denoting whether the individual is alive, the second denoting
> the sex, and the third "ldose", there will be 12*20=240 observations.
> Since my data set is of the second type, I wish to know whether
> the form of the data set affects the result of the regression, such as the
> degrees of freedom.
> Does anybody have any idea about this? Thanks to all who read this message.
> Regards,
> Bin Yue

Yes. Never use the residual deviance on its own in binary logistic
regression. Only use differences in deviance between models, each of which
satisfies the requirements for asymptotic theory (in your case, you could
compare your model with that described by sex*factor(ldose)). Another
striking example is this:

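# 1000 binary (0/1) responses with success probability 0.5
y <- rbinom(1000, prob = 0.5, size = 1)
# a model with no terms at all, not even an intercept
summary(glm(y ~ -1, family = binomial))

now try it with different data

# same setup, but with success probability 0.01
y <- rbinom(1000, prob = 0.01, size = 1)
summary(glm(y ~ -1, family = binomial))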

and think about it. Then consider the same thing with y~1.
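
To spell out what the two simulations show, here is a minimal sketch. It is not part of the original message; the seed, the object names and the helper xlogx() are assumptions made for illustration. With y ~ -1 there is nothing to estimate, so every fitted probability is 0.5 and the residual deviance is 2*n*log(2) whatever the data look like. With y ~ 1 the residual deviance is a function of mean(y) alone, so it cannot measure how well the model fits.

```r
set.seed(1)  # assumption: fixed seed, only so the run is reproducible
n <- 1000
y_half <- rbinom(n, size = 1, prob = 0.5)
y_rare <- rbinom(n, size = 1, prob = 0.01)

# y ~ -1: no parameters, so the linear predictor is 0 and mu = 0.5 everywhere
dev_empty_half <- deviance(glm(y_half ~ -1, family = binomial))
dev_empty_rare <- deviance(glm(y_rare ~ -1, family = binomial))
c(half = dev_empty_half, rare = dev_empty_rare, formula = 2 * n * log(2))

# y ~ 1: the fitted probability is mean(y), and the deviance depends on it alone
xlogx <- function(x) ifelse(x > 0, x * log(x), 0)  # hypothetical helper, 0*log(0) := 0
dev_from_mean <- function(y) {
  p <- mean(y)
  -2 * length(y) * (xlogx(p) + xlogx(1 - p))
}
dev_int_half <- deviance(glm(y_half ~ 1, family = binomial))
dev_int_rare <- deviance(glm(y_rare ~ 1, family = binomial))
rbind(glm = c(dev_int_half, dev_int_rare),
      formula = c(dev_from_mean(y_half), dev_from_mean(y_rare)))
```

The residual deviance of the second data set under y ~ -1 is exactly as large as that of the first, although a fitted probability of 0.5 is badly wrong for it, and under y ~ 1 the value is fully determined by the estimated proportion.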

As Brian keeps telling me, there IS a sense in which the residual deviances make sense in such cases, but it is not as a means of testing the model adequacy.
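
For the budworm data quoted in the question, the same point can be checked directly. The sketch below is an illustration only: the object names and the row-expansion step are assumptions, not code from the original post. It fits the model to the 12 grouped rows and to 240 individual 0/1 rows. The coefficients should agree, the residual deviance and degrees of freedom should not, and the difference in deviance against sex*factor(ldose) should be the same in both layouts.

```r
# Budworm counts as quoted in the question (20 insects per row)
grouped <- data.frame(
  numdead = c(1, 4, 9, 13, 18, 20, 0, 2, 6, 10, 12, 16),
  sex     = factor(rep(c("M", "F"), each = 6)),
  ldose   = rep(0:5, times = 2)
)
grouped$numalive <- 20 - grouped$numdead

# Assumed expansion: one row per insect, dead = 1 or 0
binary <- grouped[rep(seq_len(nrow(grouped)), each = 20), c("sex", "ldose")]
binary$dead <- unlist(lapply(grouped$numdead,
                             function(d) rep(c(1, 0), times = c(d, 20 - d))))

fit_g <- glm(cbind(numdead, numalive) ~ sex * ldose,
             family = binomial, data = grouped)
fit_b <- glm(dead ~ sex * ldose, family = binomial, data = binary)

# Same estimates, different residual deviance and degrees of freedom
cbind(grouped = coef(fit_g), binary = coef(fit_b))
c(deviance(fit_g), df.residual(fit_g))  # 4.9937 on 8 df, as in the quoted summary
c(deviance(fit_b), df.residual(fit_b))  # 240 - 4 = 236 df

# Comparison model suggested in the reply. The 20/20 and 0/20 cells are
# fitted perfectly, so glm() may warn about fitted probabilities of 0 or 1.
big_g <- glm(cbind(numdead, numalive) ~ sex * factor(ldose),
             family = binomial, data = grouped)
big_b <- glm(dead ~ sex * factor(ldose), family = binomial, data = binary)

diff_g <- deviance(fit_g) - deviance(big_g)
diff_b <- deviance(fit_b) - deviance(big_b)
c(grouped = diff_g, binary = diff_b)
```

Both differences are on 8 degrees of freedom (8 - 0 for the grouped fit, 236 - 228 for the binary fit), which is why the comparison between models does not depend on the layout of the data while the raw residual deviance does.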

> -----
> Best regards,
> Bin Yue
>
> *************
> student for a Master program in South Botanical Garden , CAS

-- 
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark          Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard_at_biostat.ku.dk)                  FAX: (+45) 35327907

______________________________________________
R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Received on Fri 07 Dec 2007 - 14:18:45 GMT
