RE: GLMs: show me the model!

From: <Bill.Venables_at_csiro.au>
Date: Fri, 20 Feb 2009 11:31:45 +1100

There has been a lot of work done on model checking, diagnostics and things like "goodness of link" tests, of course (some would say too much). Good old Google should deliver a swathe of stuff, as usual (again, probably too much!).

Residuals, when properly defined, usually provide a good tool for this, too. However, no single definition of residuals is universally optimal, and none really guarantees that you can think of them as "additive" or "multiplicative". They are simply quantities derived from the observations and the fitted model which, under the assumption that the model is correct, should behave as approximately iid Gaussian, and this in turn provides a check on the original assumptions - of sorts.

For example, there are "deviance residuals", which have the property that their sum of squares is the deviance. These should be approximately iid Gaussian (irrespective of the original distribution) if the modelling assumptions are correct. In turn this can be checked by normal scores plots and other devices. For some purposes "Pearson" residuals are to be preferred in diagnostic work; these have the property that their sum of squares is the Pearson chi-squared goodness-of-fit statistic for the model. And so on. There are four commonly used definitions, in fact, all of which reduce to the standard definition in the case of normal linear models. Take your pick, really.
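
In R, for instance, both of those sum-of-squares identities are easy to verify on a fitted glm object. A minimal sketch using simulated Poisson data:

    ## Deviance and Pearson residuals for a fitted glm (simulated data).
    set.seed(1)
    x   <- runif(100)
    y   <- rpois(100, lambda = exp(1 + 2 * x))
    fit <- glm(y ~ x, family = poisson)
    rd  <- residuals(fit, type = "deviance")
    all.equal(sum(rd^2), deviance(fit))    # sum of squares = deviance
    rp  <- residuals(fit, type = "pearson")
    sum(rp^2)                              # = Pearson chi-squared statistic
    qqnorm(rd); qqline(rd)                 # normal scores plot as a check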

One thing to point out pretty strongly is that in the case of binary data, a pretty common kind of response these days, no definition of residuals has much use. Checking model assumptions there can be a tricky affair, and lots of pretty unusual suggestions have been made at times for how to do it...
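
To see why, consider a small simulated sketch: with a binary response the residuals fall into two bands, one for y = 0 and one for y = 1, so they cannot look iid Gaussian even when the model is exactly right.

    ## Binary response: residuals split into two bands (illustrative sketch).
    set.seed(2)
    x   <- runif(200)
    y   <- rbinom(200, size = 1, prob = plogis(-1 + 2 * x))
    fit <- glm(y ~ x, family = binomial)
    plot(fitted(fit), residuals(fit, type = "deviance"))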

Bill Venables
http://www.cmis.csiro.au/bill.venables/

-----Original Message-----
From: Patrick Cordue [mailto:patrick.cordue_at_isl-solutions.co.nz]
Sent: Friday, 20 February 2009 10:14 AM
To: Venables, Bill (CMIS, Cleveland); anzstat_at_lists.uq.edu.au
Subject: RE: GLMs: show me the model!

Thanks Bill for a wonderful explanation and a bit of history (and you
weren't late - I was a bit quick off the mark).

In my defense (and that of other modellers seeking comfort): the reason I
want to know whether a particular GLM is consistent with additive or
multiplicative errors is because it can be a diagnostic as to whether the
model is plausible or not. I don't know how often it happens, but I have in
mind a situation where someone goes ahead, relatively blindly, and fits a
data set using glm() and comes up with the distribution and link function
which gives the "best fit". But is the model plausible given what is known
about the process from which the data were derived? If the process implies
multiplicative errors, yet the glm modelling delivers a distribution which
implies additive errors, then there is a problem.

Regards
Patrick

--
-----
Patrick Cordue
Director
Innovative Solutions Ltd
www.isl-solutions.co.nz
-----Original Message-----
From: owner-anzstat_at_lists.uq.edu.au [mailto:owner-anzstat_at_lists.uq.edu.au] On
Behalf Of Bill.Venables_at_csiro.au
Sent: Friday, February 20, 2009 12:36 PM
To: patrick.cordue_at_isl-solutions.co.nz; anzstat_at_lists.uq.edu.au
Subject: RE: GLMs: show me the model!
I'm coming late to this, but it is a topic that has arisen elsewhere many
times and not a few explanations have been faulty in the past (in particular
in the first edition of the GLIM manual, no less!).
Modellers often like to think of the model as consisting of a fixed part,
essentially deterministic, perturbed by errors.  In this they are comforted
and encouraged by the traditional way linear models are presented, i.e.
Y = X beta + E, where E ~ N(0, sigma^2 I)
This additive expression of how 'errors' come into the picture is not always
possible, however.  An alternative way of expressing this model is
Y ~ N(X beta, sigma^2 I)
where the stochasticity is embedded in the "N" part, without explicit
reference to additivity.  In this way the extension to, e.g. Poisson
loglinear models, is straightforward:
Y ~ Po(exp(X beta))
where the stochasticity is now expressed by the "Po", but this time there is
no going backwards to an additive expression.  The "errors" are not
explicitly available.  This is still a fully specified model, however, and
you can define errors in any way you see fit, if that is your wish, e.g. e =
y - exp(X beta), e = log(y) - X beta (bit embarrassing if y = 0, though!)
&c.  However don't ask for the distribution of these errors, because it is
just not useful to do so.  Rather deal with the model as expressed in the
distributional form above.  This gives you all you need.
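
In R, for example, the Poisson model is fitted directly in this distributional form, and the ad hoc "errors" above can be computed afterwards; a sketch with simulated data:

    ## Y ~ Po(exp(X beta)) fitted directly; ad hoc "errors" computed after.
    set.seed(3)
    x   <- runif(100)
    y   <- rpois(100, lambda = exp(0.5 + 1.5 * x))
    fit <- glm(y ~ x, family = poisson(link = "log"))
    e1  <- y - fitted(fit)              # e = y - exp(X beta)
    e2  <- log(y) - log(fitted(fit))    # e = log(y) - X beta; -Inf where y = 0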
The generalization to generalized linear models is then straightforward
Y ~ f(invlink(X beta); theta)
where f describes the distributional family, X beta is the linear predictor,
invlink is the inverse of the link function and theta is any additional
parameter needed, like sigma^2.
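
In R's glm() these pieces are visible in the family object itself, which carries both the link and its inverse; a small sketch:

    ## A family object carries both the link and the inverse link.
    fam <- poisson(link = "log")
    fam$linkfun(10)    # the link:    log(10)
    fam$linkinv(2.3)   # its inverse: exp(2.3)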
Why the "inverse link" is used here rather than just the link function is an
accident of history, but an instructive one.  Before GLMs the tradition was
to transform the response to a scale in which something like an additive
linear model did apply.  The achievement of GLMs is really to switch the
transformation to the other side and apply it to the mean of y instead,
rather than to y itself.  This is then the inverse link function, with the
link function itself still essentially the transformation that would have
been historically applied to the response.  Thus in the above Poisson
example the link is the log function, but the inverse link is the
exponential.  This way of doing things has many advantages, but one obvious
one is that it completely avoids any problem with zero observations, which
were a great embarrassment in the old transformation days and there was a
thriving cottage industry in how to deal with them.  In fact the idea of a
glm grew out of a throw-away remark of Fisher when in 1935 Bliss pushed him
for a solution to "the case of zero survivors" in probit analysis.
See
http://digital.library.adelaide.edu.au/dspace/bitstream/2440/15223/1/126.pdf
This tiny appendix is the birthplace of generalized linear modelling.  It
all grew from that.
Bill Venables
http://www.cmis.csiro.au/bill.venables/
-----Original Message-----
From: owner-anzstat_at_lists.uq.edu.au [mailto:owner-anzstat_at_lists.uq.edu.au]
On Behalf Of Patrick Cordue
Sent: Friday, 20 February 2009 9:00 AM
To: anzstat_at_lists.uq.edu.au
Subject: GLMs: show me the model!
I asked a question on GLMs a couple of days ago. In essence I was asking
"what is the model - please write it down - you know, like for a linear
model: Y = a + bx + e, where e ~ N(0, s^2) - can't we do that for a GLM?"
I come from a modelling background where the first step is to "write down
the model"; the second step is to look for tools which will provide
estimates of the unknown parameters; (I am assuming we already have a data
set). If my model is a GLM, then I can just use glm() in R. So, I wanted to
know the form of the GLM models for different families and link functions.
In particular, which implied simple additive errors (Y = mu + e) and which
implied simple multiplicative errors (Y = mu * e)?
(where mu = E(Y))
The answer provided by Murray Jorgensen is correct:
"In glms there is no simple characterisation of how the
systematic and random parts of the model combine to give you the data
(other than the definition of the glm, of course)."
Clearly for discrete distributions, it makes no sense to look for a
"building block" error e which can be added/multiplied to/by the expectation
to provide the response variable. My question was aimed at continuous
distributions.
Murray Smith (from NIWA) provided some useful comments (see below), which, I
think, get to the heart of my question.
However, I deduced the following results from first principles:
For the Gaussian family, Y = mu + e where e ~ N(0, s^2) (and E(Y) = mu =
m(eta), where eta is the linear combination of the explanatory/stimulus
variables and m^-1 is the link function) is a GLM. I take this to imply
that when one fits a model using glm() with a Gaussian family and any link,
the implied error structure is additive.
For the Gamma family, Y = mu * e where e ~ Gamma(k, 1/k) is a GLM. I take
this to imply that when one fits a model using glm() with a Gamma family
and any link, the implied error structure is multiplicative (see the
sketch below).
For the inverse Gaussian family the implied model does not have a simple
additive or multiplicative error structure (someone might know how to write
down the model in this case - but not me).
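
To illustrate the Gamma case, here is a simulation sketch using the mean-one parameterisation e ~ Gamma(k, 1/k), so that E(e) = 1 and Y = mu * e has mean mu:

    ## Multiplicative errors: Y = mu * e, e ~ Gamma(shape = k, scale = 1/k).
    set.seed(4)
    k   <- 5
    x   <- runif(200)
    mu  <- exp(1 + 2 * x)                       # log link for the mean
    e   <- rgamma(200, shape = k, scale = 1/k)  # E(e) = 1
    y   <- mu * e
    fit <- glm(y ~ x, family = Gamma(link = "log"))
    coef(fit)   # close to the true values (1, 2)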
Thanks to everyone who provided comments and references.
--------------------------------------
Murray H. Smith wrote:
"In most GLMs the error is neither multiplicative nor additive. Parameterize
the 1-parameter error family by the mean (fixing any dispersion or shape
parameters, which is what pure GLM is with the added constraint that the
error distribution belongs to a 1-parameter exponential family).
We can only write
       y ~ mu + e   or   y ~ mu * e
for e not depending on mu if mu is a location or scale parameter for the
error family, i.e.
       y ~ f(y; mu) where f(y; mu) = f(y - mu; mu = 0)
or
       y ~ f(y; mu) where f(y; mu) = (1/mu) * f(y/mu; mu = 1)
The variance function V(mu), the variance expressed as a function of the
mean, must be constant for an additive error and proportional to mu^2 for
multiplicative."
--
-----
Patrick Cordue
Director
Innovative Solutions Ltd
www.isl-solutions.co.nz
----
FOR INFORMATION ABOUT "ANZSTAT", INCLUDING UNSUBSCRIBING, PLEASE VISIT http://www.maths.uq.edu.au/anzstat/
Received on Fri Feb 20 2009 - 10:31:48 EST

This archive was generated by hypermail 2.2.0 : Thu Feb 26 2009 - 11:40:40 EST