Re: [R] glm or transformation of the response?

From: Peter Dalgaard <P.Dalgaard_at_biostat.ku.dk>
Date: Tue, 25 Nov 2008 16:55:04 +0100

Christoph Scherber wrote:
> Dear all,
>
> For an introductory course on glm's I would like to create an example to
> show the difference between glm and transformation of the response. For
> this, I tried to create a dataset where the variance increases with the
> mean (as is the case in many ecological datasets):
>
> poissondata=data.frame(
> response=rpois(40,1:40),
> explanatory=1:40)
>
> attach(poissondata)
>
> However, I have run into a problem because it looks like the lm model
> (with sqrt-transformation) fits the data best:
>
> ##
>
> model1=lm(response~explanatory,poissondata)
> model2=lm(sqrt(response+0.5)~explanatory,poissondata)
> model3=lm(log(response+1)~explanatory,poissondata)
> model4=glm(response~explanatory,poissondata,family=poisson)
> model5=glm(response~explanatory,poissondata,family=quasipoisson)
> model6=glm.nb(response~explanatory,poissondata)
> model7=glm(response~explanatory,quasi(variance="mu",link="identity"))
>
>
> plot(explanatory,response,pch=16)
> lines(explanatory,predict(model1,explanatory=explanatory))
> lines(explanatory,(predict(model2,explanatory=explanatory))^2-0.5,lty=2)
> lines(explanatory,exp(predict(model3,explanatory=explanatory))-1,lty=3)
> lines(explanatory,exp(predict(model5,explanatory=explanatory)),lty=1,col="red")
>
> lines(explanatory,predict(model6,explanatory=explanatory,type="response"),lty=1,col="blue")
>
> lines(explanatory,predict(model7,explanatory=explanatory,type="response"),lty=1,col="green")
>
>
> ##
>
> The only model that performs equally well is model7.
>
> How would you deal with this kind of analysis? What would be your
> recommendation to the students, given the fact that most of the standard
> glm models obviously don't seem to produce good fits here?
>
> Many thanks and best wishes
> Christoph
>
> (using R 2.8.0 on Windows XP)
>

Any good reason that you're not transforming both sides when transforming, and that you're not looking at

model8 <- glm(response~log(explanatory),poissondata,family=poisson) (etc.)

??
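As a rough sketch of what these two suggestions might look like on the simulated data: the fixed seed and the names model2b and model3b are assumptions made for illustration, and "transforming both sides" is read here as applying the same transformation to the explanatory variable. Since the data are generated with mean equal to explanatory, a log link only matches the truth when the predictor is log(explanatory).

```r
set.seed(1)  # assumption: fixed seed so the sketch is reproducible
poissondata <- data.frame(response = rpois(40, 1:40), explanatory = 1:40)

# Transform both sides: the mean of response equals explanatory, so the
# transformed response is roughly linear in the transformed explanatory.
model2b <- lm(sqrt(response + 0.5) ~ sqrt(explanatory), data = poissondata)
model3b <- lm(log(response + 1) ~ log(explanatory), data = poissondata)

# Poisson GLM with log link: log(mu) = a + b * log(x).
# The simulated relationship mu = x corresponds to a = 0 and b = 1.
model8 <- glm(response ~ log(explanatory), family = poisson,
              data = poissondata)
coef(model8)
```

If the reasoning holds, the slope on log(explanatory) in model8 should come out close to 1 and the intercept close to 0.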

BTW, your predict call seems to be missing data.frame() around "explanatory=explanatory". The predict() methods do not have an argument called "explanatory", so this is just ignored (a buglet if you ask me).
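A minimal sketch of the corrected call, assuming the same simulated data: the seed, the prediction grid newdat and the names fit5 and ignored are hypothetical and only serve the illustration. The new values go into the newdata argument as a data frame whose column name matches the variable in the formula.

```r
set.seed(1)  # assumption: fixed seed so the sketch is reproducible
poissondata <- data.frame(response = rpois(40, 1:40), explanatory = 1:40)
model5 <- glm(response ~ explanatory, family = quasipoisson,
              data = poissondata)

# Hypothetical prediction grid; the column must be called "explanatory".
newdat <- data.frame(explanatory = seq(1, 40, by = 0.5))
fit5 <- predict(model5, newdata = newdat, type = "response")
length(fit5)     # one prediction per row of newdat

# An argument named "explanatory" is not recognised and is silently dropped,
# so predict() returns values for the original rows instead.
ignored <- predict(model5, explanatory = newdat$explanatory)
length(ignored)  # one value per row of poissondata
```

With type = "response" the back-transformation by exp() is no longer needed, and the result can be passed straight to lines(newdat$explanatory, fit5).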

-- 
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard_at_biostat.ku.dk)              FAX: (+45) 35327907

______________________________________________
R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Received on Tue 25 Nov 2008 - 15:57:40 GMT
