Re: [R] overdispersion

From: John Maindonald <john.maindonald_at_anu.edu.au>
Date: Fri 12 Jan 2007 - 21:48:06 GMT

I would say rather that for binary data (binomial data with n=1) it is not possible to detect overdispersion from examination of the Pearson chi-square or the deviance. Overdispersion may be, and often is, nevertheless present. I am arguing that overdispersion is properly regarded as a function of the variance-covariance structure, not as a function of the sample data.

The variance of a two-point distribution is a known function of the mean, providing that independence and identity of distribution can be assumed, or providing that the correlation structure is otherwise known and the mean is constant. That proviso is crucial!
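
The role of the proviso can be written out as a short worked formula. This is a sketch under assumptions that are not in the original message: m exchangeable binary observations Y_1, ..., Y_m with common mean p and a common pairwise correlation rho (both m and rho are illustrative symbols).

```latex
% Single binary observation: the variance is fixed by the mean.
\operatorname{Var}(Y_i) = p(1-p)

% Sum of m binary observations with common pairwise correlation \rho:
\operatorname{Var}\Bigl(\sum_{i=1}^{m} Y_i\Bigr)
  = m\,p(1-p) + m(m-1)\,\rho\,p(1-p)
  = m\,p(1-p)\,\bigl[1 + (m-1)\rho\bigr]
```

With rho = 0 this is the ordinary binomial variance; any rho > 0 inflates it by the factor 1 + (m-1) rho, even though each individual observation still has variance p(1-p).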

If there is some sort of grouping, it may be appropriate to aggregate data over the groups, yielding data that have a binomial form with n>1. Over-dispersion can now be detected from the Pearson chi-square or from the deviance. Note that the quasi models assume that the multiplier for the binomial or other variance is constant with p; that may or may not be realistic. Generalized linear mixed models make their own different assumptions about how the variance changes as a function of p; again these may or may not be realistic.
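
A small simulation may make the contrast concrete. It is only a sketch under assumed settings: the group count, the group size and the Beta(2, 2) distribution of group-level probabilities are all illustrative choices, not taken from the thread. The binary fit reports a Pearson dispersion of essentially 1 whatever the correlation structure, while the aggregated fit exposes the extra-binomial variation.

```r
# Simulate clustered binary data (all settings are illustrative assumptions).
set.seed(1)
n_groups <- 200   # number of groups
m <- 10           # binary observations per group

# Each group has its own success probability, so observations within a
# group are positively correlated.
p_group <- rbeta(n_groups, 2, 2)
group <- rep(seq_len(n_groups), each = m)
y <- rbinom(n_groups * m, size = 1, prob = p_group[group])

# Binary (n = 1) fit: the Pearson statistic cannot reveal the clustering.
fit_bin <- glm(y ~ 1, family = binomial)
disp_bin <- sum(residuals(fit_bin, type = "pearson")^2) / df.residual(fit_bin)

# Aggregate within groups to obtain binomial counts with n = m.
succ <- as.vector(tapply(y, group, sum))
fit_grp <- glm(cbind(succ, m - succ) ~ 1, family = binomial)
disp_grp <- sum(residuals(fit_grp, type = "pearson")^2) / df.residual(fit_grp)

# Compare the two Pearson-based dispersion estimates.
c(binary = disp_bin, grouped = disp_grp)
```

For the intercept-only binary fit the Pearson statistic equals the number of observations by construction, so the first estimate is N/(N-1) regardless of the data. The grouped estimate should come out well above 1 here, since the assumed Beta(2, 2) mixing implies a within-group correlation of 0.2.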

It is then the "error" structure that is crucial. To the extent that it distracts from careful thinking about that structure, the term "overdispersion" is unsatisfactory.

There's no obvious way that I can see to supply glm() with an estimate of the dispersion that has been derived independently of the current analysis. Especially in the binary case, this would sometimes be useful.
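
One partial route, which the message does not mention and which may not cover every use intended here: glm() itself takes no fixed dispersion, but summary.glm() has a dispersion argument that rescales the reported standard errors after fitting. The sketch below uses simulated data, and phi_ext is a hypothetical value standing in for an estimate obtained from some other analysis.

```r
# Simulated binary response (illustrative data only).
set.seed(2)
x <- rnorm(100)
y <- rbinom(100, size = 1, prob = plogis(0.5 * x))
fit <- glm(y ~ x, family = binomial)

# Hypothetical dispersion estimate derived independently of this fit.
phi_ext <- 2.5

se_default <- coef(summary(fit))[, "Std. Error"]
se_scaled  <- coef(summary(fit, dispersion = phi_ext))[, "Std. Error"]

# The coefficients are unchanged; the standard errors are multiplied
# by sqrt(phi_ext).
se_scaled / se_default
```

This only affects the summary (standard errors and the tests based on them); the fitted coefficients are those of the ordinary binomial fit.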

John Maindonald             email: john.maindonald@anu.edu.au
phone : +61 2 (6125)3473    fax  : +61 2(6125)5549
Centre for Mathematics & Its Applications, Room 1194,
John Dedman Mathematical Sciences Building (Building 27)
Australian National University, Canberra ACT 0200.

On 12 Jan 2007, at 10:00 PM, r-help-request@stat.math.ethz.ch wrote:

> From: Peter Dalgaard <p.dalgaard@biostat.ku.dk>
> Date: 12 January 2007 5:04:26 AM
> To: evaiannario <evaiannario@libero.it>
> Cc: "r-help@stat.math.ethz.ch" <r-help@stat.math.ethz.ch>
> Subject: Re: [R] overdispersion
>
>
> evaiannario wrote:
>> How can I eliminate the overdispersion for binary data apart from
>> the use of the quasibinomial?
> There is no such thing as overdispersion for binary data. (The
> variance of a two-point distribution is a known function of the
> mean.) If what you want to do is include random effects of some
> sort of grouping then you might look into generalized linear mixed
> models via lmer() from the lme4 package or glmmPQL from MASS.



R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Received on Sat Jan 13 09:05:54 2007

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.1.8, at Sat 13 Jan 2007 - 00:30:26 GMT.