Re: [R] glm: offset

From: Ted Harding <Ted.Harding_at_manchester.ac.uk>
Date: Mon, 03 Mar 2008 07:51:35 +0000 (GMT)


On 03-Mar-08 03:19:01, Wensui Liu wrote:
> Hi, John,
> my understanding is that you should use log(...) instead of the
> original scale. Below is the logic in the case of Poisson regression.
> log(y / offset) = x'b
> => log(y) - log(offset) = x'b
> => log(y) = x'b + log(offset)

Well, this is where it gets interesting! The above statement of the "logic" begs the question (i.e. assumes the answer).

I would go according to the general interpretation of "offset" in LM and GLM modelling -- an "offset" is

  "a quantitative variable whose regression coefficient is known to be 1"
  [McCullagh and Nelder (1983) "Generalized Linear Models", page 138]

Since the GLM for a Poisson regression with log link is to model

  L = log(mu) = a + b1*X1 + b2*X2 + ...

where mu is the Poisson mean and X1, X2, ... are the raw explanatory variables (untransformed, unless you have other reasons for transforming them before bringing them into the regression), if X1 is the variable you wish to use as "offset" in the above sense (that is, b1 is known to be 1) then it should be used untransformed. On this basis, the answer to John Sorkin's question should be: don't use log(NumUniqPt), use NumUniqPt.
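
To make the "coefficient known to be 1" reading concrete, here is a minimal R sketch on simulated data. Everything in it (the names x1, off, y and the coefficient values) is made up for illustration and is not taken from John's data. It only shows what glm() does with whatever is passed as the offset: that quantity is added to the linear predictor as it stands, with its coefficient fixed at 1. It does not by itself say which scale is right for a particular data set.

```r
# Simulated data: log(mu) = 0.5 + 0.3*x1 + 1*off  (all values hypothetical)
set.seed(42)
n   <- 200
x1  <- rnorm(n)
off <- runif(n, 0, 2)   # a generic quantitative variable used as the offset
y   <- rpois(n, lambda = exp(0.5 + 0.3 * x1 + 1 * off))

# Fit with 'off' as an offset: its coefficient is not estimated, it is fixed at 1
fit_offset <- glm(y ~ x1, family = poisson(link = "log"), offset = off)

# Rebuild the linear predictor by hand: a + b1*x1 + 1*off
eta_by_hand <- coef(fit_offset)[1] + coef(fit_offset)[2] * x1 + off
same_eta <- isTRUE(all.equal(unname(fit_offset$linear.predictors),
                             unname(eta_by_hand)))
print(same_eta)

# For comparison: treat 'off' as an ordinary covariate and let glm estimate
# its coefficient; with data simulated as above it should come out near 1
fit_free <- glm(y ~ x1 + off, family = poisson(link = "log"))
print(coef(fit_free)["off"])
```

So whatever expression appears in offset= (or in an offset() term in the formula) is exactly the quantity that sits on the scale of L with coefficient 1.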

There's a potential confusion here in that presumably "NumUniqPt" is a positive variable whose distribution in the data may be skewed, i.e. the sort of variable that you may feel urged to take the log of before using it.

But that would be an "other reason" in the sense of my comment above.

After all, suppose "NumUniqPt" denoted a variable in the data that could take negative values. Would you be happy to use log(NumUniqPt) in that case?

Best wishes to all,
Ted.

> On Sun, Mar 2, 2008 at 10:01 PM, John Sorkin
> <jsorkin_at_grecc.umaryland.edu> wrote:

>> R 2.6.0
>>  Windows XP
>>
>>  A question about running a generalized linear model.
>>
>>  I am running a glm with
>>  (1) a poisson distribution and a log link:
>>    family=poisson(link = "log")
>>  and an offset.
>>  I would like to know if I should express the offset as the log of the
>>  offset value, i.e.
>>  offset=log(NumUniqPt)
>>  or as:
>>  offset=NumUniqPt
>>
>>  I suspect I need to use the log, but I can't find any discussion of
>>  this in MASS 1994 or on the man page for glm.
>>  Thanks
>>  John
>>
>>
>>  John Sorkin M.D., Ph.D.
>>  Chief, Biostatistics and Informatics
>>  University of Maryland School of Medicine Division of Gerontology
>>  Baltimore VA Medical Center
>>  10 North Greene Street
>>  GRECC (BT/18/GR)
>>  Baltimore, MD 21201-1524
>>  (Phone) 410-605-7119
>>  (Fax) 410-605-7913 (Please call phone number above prior to faxing)
>>
>>
>>  ______________________________________________
>>  R-help_at_r-project.org mailing list
>>  https://stat.ethz.ch/mailman/listinfo/r-help
>>  PLEASE do read the posting guide
>>  http://www.R-project.org/posting-guide.html
>>  and provide commented, minimal, self-contained, reproducible code.
>>

> --
> ===============================
> WenSui Liu
> ChoicePoint Precision Marketing
> Phone: 678-893-9457
> Email : wensui.liu_at_choicepoint.com
> Blog : statcompute.spaces.live.com


E-Mail: (Ted Harding) <Ted.Harding_at_manchester.ac.uk> Fax-to-email: +44 (0)870 094 0861
Date: 03-Mar-08                                       Time: 07:51:32
------------------------------ XFMail ------------------------------

Received on Mon 03 Mar 2008 - 07:54:40 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Tue 04 Mar 2008 - 14:30:20 GMT.
