From: Murray Jorgensen <maj_at_waikato.ac.nz>

Date: Sat 22 Jul 2006 - 08:57:56 EST

>> G'day all,

*>>
*

*>>>>>>> "BDR" == Prof Brian Ripley <ripley@stats.ox.ac.uk> writes:
*

*>>
*

*>> BDR> R does not know that poly(age,2) and poly(age,1) are linearly
*

*>> BDR> dependent.
*

*>> Indeed, I also thought that this is the reason of the problem.
*

*>>
*

*>> BDR> (And indeed they only are for some functions 'poly'.)
*

*>> I am surprised about this. Should probably read the help page of
*

*>> 'poly' once more and more carefully.
*

>> BDR> I cannot reproduce your example ('l' is missing), [...]

*>> My guess is that 'l' is 'pyears'. At least, I worked under that
*

*>> assumption.
*

>> Interestingly, on my machine (using R 2.3.1, 2.2.1 and 2.1.1) I cannot

*>> fit any of the Poisson GLM that Murray tried. I always get the error
*

*>> message:
*

*>>
*

*>> Error: no valid set of coefficients has been found: please supply
*

*>> starting values
*

*>>
*

>> But I have to investigate this further. I can fit binomial models

*>> that give me similar answers.
*

*>>
*

*>> BDR> [...] but perhaps
*

*>> BDR> glm(deaths ~ poly(age,2) + poly(age,1)*Smoke + offset(l),
*

*>> BDR> poisson)
*

*>> BDR> was your intention?
*

*>> In this parameterisation a 'poly(age,1)' term will appear among the
*

*>> coefficients with an estimated value of NA since it is aliased with
*

*>> 'poly(age, 2)1'. So I don't believe that this was Murray's intention.
*

*>>
*

*>> The only suggestion I can come up with is:
*

*>>
*

*>>> summary(glm(cbind(deaths, l-deaths) ~ age*Smoke+I(age^2),
*

*>>> family=binomial))
*

*>>
*

*>> [...]
*

*>>
*

*>> Coefficients:
*

*>> Estimate Std. Error z value Pr(>|z|)
*

*>> (Intercept) -10.79895 0.45149 -23.918 < 2e-16 ***
*

*>> age 2.37892 0.20877 11.395 < 2e-16 ***
*

*>> SmokeYes 1.44573 0.37347 3.871 0.000108 ***
*

*>> I(age^2) -0.19706 0.02749 -7.168 7.6e-13 ***
*

*>> age:SmokeYes -0.30850 0.09756 -3.162 0.001566 **
*

*>>
*

*>> [...]
*

*>>
*

*>> Which doesn't use orthogonal polynomials anymore. But I don't see how
*

*>> you can fit the model that Murray want to fit using orthogonal
*

*>> polynomials given the way R's model language operates.
*

*>>
*

*>> So I guess the Poisson GLM that Murray wants to fit is:
*

*>>
*

*>> glm(deaths~ age*Smoke+I(age^2)+offset(l), family=poisson)
*

*>>
*

*>> Cheers,
*

*>>
*

*>> Berwin
*

*>>
*

*>> ========================== Full address ============================
*

*>> Berwin A Turlach Tel.: +61 (8) 6488 3338 (secr)
*

*>> School of Mathematics and Statistics +61 (8) 6488 3383 (self)
*

*>> The University of Western Australia FAX : +61 (8) 6488 1028
*

*>> 35 Stirling Highway
*

*>> Crawley WA 6009 e-mail: berwin@maths.uwa.edu.au
*

*>> Australia http://www.maths.uwa.edu.au/~berwin
*

*>>
*

*>>
*

>

Date: Sat 22 Jul 2006 - 08:57:56 EST

Thanks to Brian and Berwin with there help. I faced a double problem in that I not only wanted to fit the model but I also wanted to do so in such a way that it would seem natural for a classroom example.

The moral seems to be that I should do the orthogonal polynomial stuff outside the model formula. Here then is my solution:

pyears <- scan()

18793 52407 10673 43248 5710 28612 2585 12663 1462 5317

deaths <- scan()

2 32 12 104 28 206 28 186 31 102

l <- log(pyears)

Smoke <- gl(2,1,10,labels=c("No","Yes"))
Age <- gl(5,2,10,labels=c("35-44","45-54","55-64","65-74","75-84"),

ordered=TRUE)

mod1.glm <- glm(deaths ~ Age * Smoke + offset(l),family=poisson)
summary(mod1.glm)

age <- as.numeric(Age)

age1 <- poly(age,2)[,1]

age2 <- poly(age,2)[,2]

mod2.glm <- aso1.glm <- glm(deaths ~ age1 + age2 + Smoke +

age1:Smoke + offset(l),family=poisson)summary(mod2.glm)

The final summary then comes out looking like this:

Estimate Std. Error z value Pr(>|z|) (Intercept) -5.8368 0.1213 -48.103 < 2e-16 *** age1 5.3238 0.4129 12.893 < 2e-16 *** age2 -1.0460 0.1448 -7.223 5.08e-13 *** SmokeYes 0.5183 0.1262 4.106 4.02e-05 *** age1:SmokeYes -1.3755 0.4340 -3.169 0.00153 **

which is just what I wanted.

Cheers, Murray Jorgensen

Prof Brian D Ripley wrote:

> On Fri, 21 Jul 2006, Berwin A Turlach wrote: >

>> G'day all,

> > My point was perhaps simpler: if you or I or Murray had a function > poly() in our workspace, that one would be found, I think. (I have not > checked the ramifications of namespaces here, but I believe that would > be the intention, that formulae are evaluated in their defining > environment.) So omly when the model matrix is set up could the linear > dependence be known (and there is nothing in the system poly() to tell > model.matrix). > > What is sometimes called extrinsic aliasing is left to the fitting > function, which seems to be to do a sensible job provided the main > effect is in the model. Indeed, including interactions without main > effects (as Murray did) often makes the results hard to interpret. >

>> BDR> I cannot reproduce your example ('l' is missing), [...]

> > Apparently l = log(pyears), which was my uncertain guess. >

>> Interestingly, on my machine (using R 2.3.1, 2.2.1 and 2.1.1) I cannot

> > Related to the offset, I believe. >

>> But I have to investigate this further. I can fit binomial models

>

-- Dr Murray Jorgensen http://www.stats.waikato.ac.nz/Staff/maj.html Department of Statistics, University of Waikato, Hamilton, New Zealand Email: maj@waikato.ac.nz Fax 7 838 4155 Phone +64 7 838 4773 wk Home +64 7 825 0441 Mobile 021 1395 862 ______________________________________________ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.Received on Sat Jul 22 09:06:26 2006

Archive maintained by Robert King, hosted by
the discipline of
statistics at the
University of Newcastle,
Australia.

Archive generated by hypermail 2.1.8, at Sat 22 Jul 2006 - 10:15:15 EST.

*
Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-help.
Please read the posting
guide before posting to the list.
*