[R] Pearson dispersion statistic

From: Smit, R. (Robin) <robin.smit_at_tno.nl>
Date: Thu 14 Jul 2005 - 17:26:41 EST

Thank you for your reply.  

I am aware of the good reasons not to use the deviance estimate in binomial, Poisson, and gamma families.

However, for the inverse Gaussian, the choice seems to me less clear-cut, so I just wanted to compare the two options.

I have used the dispersion parameter to compute the standardized deviance residuals:

summary(model.gamma)$deviance.resid /(summary(model.gamma)$dispersion * (1 - hatvalues(model.gamma)))^0.5  

I noticed differences from Genstat, which outputs these standardized deviance residuals directly; they are explained by Genstat's automatic use of the deviance estimate of the dispersion instead of the Pearson estimate.
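
A minimal sketch of that comparison, for illustration only. The simulated data and the names x, y, phi.pearson, phi.deviance, std.pearson and std.deviance are assumptions, and a gamma model with log link stands in for the real model. The Pearson-based version should agree with what rstandard() returns, while the deviance-based version differs from it by a constant factor.

```r
## Simulated gamma response with mean exp(1 + x); illustrative data only.
set.seed(1)
x <- runif(50)
y <- rgamma(50, shape = 2, rate = 2 / exp(1 + x))
model.gamma <- glm(y ~ x, family = Gamma(link = "log"))

h <- hatvalues(model.gamma)                      # leverages
d <- residuals(model.gamma, type = "deviance")   # deviance residuals

## Two candidate dispersion estimates.
phi.pearson  <- summary(model.gamma)$dispersion  # Pearson-based (R default)
phi.deviance <- deviance(model.gamma) / df.residual(model.gamma)

## Standardized deviance residuals under each estimate.
std.pearson  <- d / sqrt(phi.pearson  * (1 - h))
std.deviance <- d / sqrt(phi.deviance * (1 - h))

## The Pearson-based version should match rstandard().
all.equal(std.pearson, rstandard(model.gamma))

## The two versions differ only by the factor sqrt(phi.deviance / phi.pearson).
range(std.pearson / std.deviance)
```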

Kind regards,

Robin Smit    

-----Original Message-----

From: Prof Brian Ripley [mailto:ripley@stats.ox.ac.uk]

Sent: Tuesday 12 July 2005 15:12

To: Smit, R. (Robin)

Cc: r-help@stat.math.ethz.ch

Subject: Re: [R] Dispersion in glm (was (no subject))

Actually, glm() does not estimate the dispersion at all, so you will need to be more specific.

For example, summary.glm() and predict.glm() use the Pearson statistic if dispersion=NULL (the default) for most families. You can supply any other value you choose, and the MASS package makes use of this for ML estimation of the dispersion parameter (related to the shape) of the gamma family.
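
As a rough illustration of the dispersion argument (a sketch on simulated data; the object names fit, phi.pearson, phi.deviance and s.dev are assumptions), the Pearson estimate can be reproduced by hand, and the deviance-based estimate can be passed to summary() without touching glm() itself:

```r
## Simulated gamma response with mean exp(1 + x); illustrative data only.
set.seed(2)
x <- runif(50)
y <- rgamma(50, shape = 2, rate = 2 / exp(1 + x))
fit <- glm(y ~ x, family = Gamma(link = "log"))

## Pearson estimate: sum of squared Pearson residuals over residual df.
phi.pearson <- sum(residuals(fit, type = "pearson")^2) / df.residual(fit)
all.equal(phi.pearson, summary(fit)$dispersion)

## Deviance estimate: residual deviance over residual df.
phi.deviance <- deviance(fit) / df.residual(fit)

## Supply the chosen value explicitly; the fitted model is unchanged.
s.dev <- summary(fit, dispersion = phi.deviance)
s.dev$dispersion

## Standard errors scale with the square root of the dispersion.
coef(s.dev)[, "Std. Error"] / coef(summary(fit))[, "Std. Error"]
```

Only the reported dispersion and the quantities derived from it (standard errors, test statistics) change; the coefficient estimates are the same in both summaries.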

There are rather good reasons (serious bias) not to use the deviance estimate in the binomial and Poisson families (see the example plots in MASS4), and good reasons not to use either in the gamma family. As the Pearson and deviance estimates agree for the gaussian, that does leave begging the question of why you want to do this. Further, McCullagh & Nelder have general arguments why the Pearson estimate might always be preferred to the deviance one. So that `another statistical package' appears to need justification for its choice.
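
The agreement of the two estimates for the gaussian family can be checked numerically. This is a small sketch on simulated data, with fit.gauss, phi.pearson and phi.deviance as assumed names: with unit variance function, the Pearson residuals are the raw residuals and the deviance is the residual sum of squares.

```r
## Simulated linear-model data; illustrative only.
set.seed(3)
x <- runif(50)
y <- 1 + 2 * x + rnorm(50)
fit.gauss <- glm(y ~ x, family = gaussian)

## Pearson-based and deviance-based dispersion estimates.
phi.pearson  <- sum(residuals(fit.gauss, type = "pearson")^2) /
  df.residual(fit.gauss)
phi.deviance <- deviance(fit.gauss) / df.residual(fit.gauss)

## Both reduce to the residual sum of squares over the residual df.
all.equal(phi.pearson, phi.deviance)
```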

On Mon, 11 Jul 2005, Smit, R. (Robin) wrote:

> The estimate of glm dispersion can be based on the deviance or on the
> Pearson statistic.
> I have compared output from R glm() to another statistical package and
> it appears that R uses the Pearson statistic.
> I was wondering if it is possible to make R use the deviance instead
> by modifying the glm(...) function?
> Thanks for your attention.


Brian D. Ripley,                  ripley@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


