From: Rune Haubo <rune.haubo_at_gmail.com>

Date: Mon, 30 Jun 2008 16:13:00 +0200

Date: Mon, 30 Jun 2008 16:13:00 +0200

Hi Vito

(question to the authors of MASS below)

2008/6/30 vito muggeo <vmuggeo_at_dssm.unipa.it>:

*> Dear Haubo,
**> many thanks for your reply.
**> Yes you are right, by scaling the "score", I get the same results.
**>
**> However it sounds strange to me. I understand that the SE and/or t-ratio of
**> the intercepts depend on the scale of the explanatory variable, but I do not
**> understand why the t-ratio of the slope also depends on the scale.. In
**> classical regression models (LM, GLM, ...) the scale of the covariate does
**> not affect the t ratio! As far as I know, classical references (e.g.
**> Agresti, 2002) do not discuss a such point. Could yu suggest me additional
**> reference? Namely why should I center the explanatory variables?
*

The wrong standard errors is a computational issue. Centering is a statistical and (sometimes) a computational issue.

The t-value is just the ratio of the estimate to its standard error,
and when the program does not give you the correct standard error,
naturally the t-value is also wrong. It is often a good idea to center
covariates for inferential purposes and a very good reference is
Venables' "Exegeses on Linear Models":

http://www.stats.ox.ac.uk/pub/MASS3/Exegeses.pdf.

I find it natural to expect polr to give the correct standard error without centering the covariate. I do not know why it does not, but perhaps is has something to do with the scaling of the parameters parsed to the BFGS-optimizer in optim. Optimization of the intercepts seems to be performed on something like

summary(fm1 <- polr(factor(grade)~score)) cumsum(c(fm1$zeta[1], exp(fm1$zeta[-1])))

which seems to distort the whole variance-covariance matrix of the parameters and therefore provides wrong standard errors for the score-parameter as well. Optimization is much better scaled if you do

summary(fm2 <- polr(factor(grade)~1 + I(score - median(score)))) cumsum(c(fm2$zeta[1], exp(fm2$zeta[-1])))

This is merely a guess, and perhaps the authors of MASS, could provide some clarification?

One other thing, that puzzles me, is the fact that the parameter estimate/standard error ratio is reported as a t-value rather than a z-value. Likelihood theory (as I know it) seems to suggest a z-value rather than the t-value. A t-value is also provided by the special case; the ordinary logistic regression model:

summary(glm(grade > 3 ~ score, family = binomial))

as well as by lrm in package Design.

*>
**> Of course profile-Lik based CI may be very usefuls at this aim..but this is
**> another story..
*

I agree, but the topic is closely related to standard errors ;-)

Best

Rune

*>
**> many thanks again,
**> vito
**>
**> Rune Haubo ha scritto:
**>>
*

>> Dear Vito

*>>
**>> No, you are not wrong, but you should center score prior to model
**>> estimation:
**>>
**>> summary(fm1 <- polr(factor(grade)~I(score - mean(score))))
**>>
**>> which gives the same standard errors as do lrm. Now the intercepts
**>> refer the median score rather than some potential unrealistic score of
**>> 0.
**>>
**>> You can always check if standard errors are appropriate by plotting
**>> the profile likelihood and the normal approximation (based on the
**>> standard errors):
**>>
**>> x <- seq(-.043 - .04, -.043 + .04, len = 20)
**>> dev <- double(20)
**>> for(i in 1:20)
**>> dev[i] <- polr(factor(grade) ~ 1 + offset(x[i] * (score -
**>> median(score))))$deviance
**>>
**>> yy <- spline(x, dev, n = 100, method = "natural")
**>> plot(yy[[1]], exp(-(yy[[2]] - min(yy[[2]]))/2), type = "l",
**>> xlab = "Score", ylab = "Normalized profile likelihood")
**>> ## Add Normal approximation:
**>> lines(yy[[1]], exp(-(yy[[1]] - coef(fm1))^2 /
**>> (2 * vcov(fm1)[1, 1])), lty = 2,
**>> col = "red")
**>>
**>> ## Add confidence limits:
**>> level <- c(0.95, 0.99)
**>> lim <- sapply(level, function(x)
**>> exp(-qchisq(x, df=1)/2) )
**>> abline(h = lim, col = "grey")
**>>
**>> ## Add confidence interval points:
**>> points(confint(fm1), rep(lim[1], 2), pch = 3)
**>>
**>> So these standard errors are the most appropriate you can get, but
**>> because the profile likelihood is not symmetric, you could consider
**>> reporting the asymmetric profile likelihood confidence interval
**>> instead.
**>>
**>> Hope this helps
**>> Rune
**>>
**>> 2008/6/30 vito muggeo <vmuggeo_at_dssm.unipa.it>:
**>>>
**>>> Dear all,
**>>> It appears that MASS::polr() and Design::lrm() return the same point
**>>> estimates but different st.errs when fitting proportional odds models,
**>>>
**>>> grade<-c(4,4,2,4,3,2,3,1,3,3,2,2,3,3,2,4,2,4,5,2,1,4,1,2,5,3,4,2,2,1)
**>>>
**>>> score<-c(525,533,545,582,581,576,572,609,559,543,576,525,574,582,574,471,595,
**>>> 557,557,584,599,517,649,584,463,591,488,563,553,549)
**>>>
**>>> library(MASS)
**>>> library(Design)
**>>>
**>>> summary(polr(factor(grade)~score))
**>>> lrm(factor(grade)~score)
**>>>
**>>> It seems to me that results from lrm() are right..
**>>>
**>>> Am I wrong?
**>>> Thanks,
**>>> vito
**>>>
**>>> --
**>>> ====================================
**>>> Vito M.R. Muggeo
**>>> Dip.to Sc Statist e Matem `Vianelli'
**>>> Università di Palermo
**>>> viale delle Scienze, edificio 13
**>>> 90128 Palermo - ITALY
**>>> tel: 091 6626240
**>>> fax: 091 485726/485612
**>>> http://dssm.unipa.it/vmuggeo
**>>>
**>>> ______________________________________________
**>>> R-help_at_r-project.org mailing list
**>>> https://stat.ethz.ch/mailman/listinfo/r-help
**>>> PLEASE do read the posting guide
**>>> http://www.R-project.org/posting-guide.html
**>>> and provide commented, minimal, self-contained, reproducible code.
**>>>
**>>
**>>
**>
**> --
**> ====================================
**> Vito M.R. Muggeo
**> Dip.to Sc Statist e Matem `Vianelli'
**> Università di Palermo
**> viale delle Scienze, edificio 13
**> 90128 Palermo - ITALY
**> tel: 091 6626240
**> fax: 091 485726/485612
**> http://dssm.unipa.it/vmuggeo
**> ====================================
**>
*

-- Rune Haubo Bojesen Christensen Master Student, M.Sc. Phone: (+45) 30 26 45 54 Informatics and Mathematical Modelling Section for Statistics Technical University of Denmark, Build.321, DK-2800 Kgs. Lyngby, Denmark ______________________________________________ R-help_at_r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.Received on Mon 30 Jun 2008 - 14:15:52 GMT

Archive maintained by Robert King, hosted by
the discipline of
statistics at the
University of Newcastle,
Australia.

Archive generated by hypermail 2.2.0, at Mon 30 Jun 2008 - 14:30:50 GMT.

*
Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-help.
Please read the posting
guide before posting to the list.
*