Re: [R] glm gives t test sometimes, z test others. Why?

From: Prof Brian Ripley <ripley_at_stats.ox.ac.uk>
Date: Sun 05 Mar 2006 - 18:52:25 EST

First off, glm() does not report these at all. The summary() method reports 't' and 'z' ratios, not tests (although they can be interpreted as test statistics). That distinction is important, for two reasons:

  1. Use of summary() is optional. You could use drop1() or car's Anova() instead of summary to do a test, and use profile() rather than summary() to construct confidence intervals. (And these days I normally do, although a decade ago they could be too slow.)
  2. summary.glm() has a 'dispersion' parameter. If the dispersion is estimated this is labelled as a 't' ratio, otherwise as a 'z' ratio. The quoted p-value is from a reference Student t in the first case and a Normal in the second. So for a single glm() fit you may see either 't' or 'z' depending on how you use summary.glm().
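A small numerical sketch of this point (in Python rather than R, purely for illustration; `normal_sf` and `t_sf` are hypothetical helper names): the same ratio bhat/e.s.e. gives a different p-value depending on whether it is referred to a Student t or a standard Normal, which is exactly the 't' versus 'z' labelling above.

```python
import math

def normal_sf(x):
    # Upper-tail probability of the standard Normal, via the error function.
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def t_sf(x, df, n=20000):
    # Upper-tail probability of Student's t with df degrees of freedom,
    # by midpoint-rule integration of the density from 0 to x (stdlib only).
    c = math.gamma((df + 1) / 2.0) / (math.sqrt(df * math.pi) * math.gamma(df / 2.0))
    h = x / n
    area = 0.0
    for i in range(n):
        u = (i + 0.5) * h
        area += c * (1.0 + u * u / df) ** (-(df + 1) / 2.0) * h
    return 0.5 - area

ratio = 2.0  # the same 't' or 'z' ratio, e.g. bhat / e.s.e.(bhat)
print(2 * normal_sf(ratio))  # two-sided p from a Normal reference: ~0.0455
print(2 * t_sf(ratio, 10))   # two-sided p from a t_10 reference: ~0.073
```

The t reference always gives the larger p-value, with the difference shrinking as the degrees of freedom grow.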

BTW, summary.glm in S always labels these as 't values' (which they are not always) but does not report p values, something that seems to me to be wise. But I lost that argument for R in the 1990s.

On Sun, 5 Mar 2006, Paul Johnson wrote:

> I just ran example(glm) and happened to notice that models based on
> the Gamma distribution give a t test, while the Poisson models give a
> z test. Why?
>
> Both are b/s.e., aren't they?

In your terminology below, bhat/e.s.e.(bhat), the first 'e' being for 'estimated' (which may or may not be part of your definition of 'standard error').

> I can't find documentation supporting the claim that the distribution
> is more like t in one case than another, except in the Gaussian case
> (where it really is t).

Hmm. Even in the Gaussian case it depends on whether the residual variance is estimated or assumed known. summary.glm allows the estimation of a Gaussian model with known sigma^2, whereas summary.lm does not.

There is some support for the claim that, where the dispersion is estimated, the reference t is more accurate than a Normal would be. I am not in my office, but believe you will find the arguments in McCullagh & Nelder (1989). Note though that for families other than the Gaussian the dispersion estimate is not the MLE, and other estimates may be preferable.

> Aren't all of the others approximations based on the Wald idea that
>
> bhat^2
> ------------
> Var(bhat)
>
> is asymptotically Chi-square?

Not really, more that bhat - b is asymptotically or exactly normal with computable variance which in general depends on the unknown true parameters. So you have to replace your denominator by an estimate of it, and in general you increase the variability if you do not know the dispersion.

> And that sqrt(Chi-square) is Normal.

Hmm, Normal^2 is chisq_1, but the 1 is crucial.
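To make that concrete (again in Python for illustration; `normal_sf` and `chisq1_sf` are hypothetical helper names): the square of a standard Normal is chi-square with 1 df, so referring z to +/-1.96 is exactly the same test as referring z^2 to 3.8415. The identity P(Z^2 > q) = P(|Z| > sqrt(q)) holds only for 1 df; the square root of a chi-square on more degrees of freedom is not Normal.

```python
import math

def normal_sf(x):
    # Upper-tail probability of the standard Normal.
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def chisq1_sf(q):
    # Upper tail of chi-square_1, using P(Z^2 > q) = P(|Z| > sqrt(q)) = 2*P(Z > sqrt(q)).
    return 2.0 * normal_sf(math.sqrt(q))

z = 1.959964  # two-sided 5% point of the standard Normal
print(2 * normal_sf(z))   # ~0.05: the two-sided z test
print(chisq1_sf(z * z))   # ~0.05 at z^2 ~ 3.8415: the identical Wald chi-square test
```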

>
> While I'm asking, I wonder if glm should report them at all. I've
> followed up on Prof Ripley's advice to read the Hauck & Donner article
> and the successors, and I'm persuaded that we ought to just use the
> likelihood ratio test to decide about individual parameters.
>
> --
> Paul E. Johnson
> Professor, Political Science
> 1541 Lilac Lane, Room 504
> University of Kansas

-- 
Brian D. Ripley,                  ripley@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Received on Sun Mar 05 19:06:29 2006

This archive was generated by hypermail 2.1.8 : Sun 05 Mar 2006 - 21:09:04 EST