From: <ripley_at_stats.ox.ac.uk>

Date: Tue 06 Dec 2005 - 16:10:33 GMT

Curiously, I was just looking at that, since I believe the answer should
be NaN, and some optimizing compilers/fast BLASes are not giving that.
(There's an example in reg-test-3.R.) So I think we need to return NaN
when hat is within rounding error of 1.
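
A minimal sketch of that rule, for illustration only. The function name `rstandard_sketch`, its arguments and the tolerance `tol` are assumptions of this sketch, not the code in the stats package:

```r
# Hypothetical helper: standardized residuals that are NaN whenever the
# hat value is within rounding error of 1 (tolerance is an assumption).
rstandard_sketch <- function(res, hat, dispersion = 1,
                             tol = 10 * .Machine$double.eps) {
  # pmax() guards against hat values that land slightly above 1 through
  # rounding, which would otherwise make sqrt() warn.
  denom <- sqrt(dispersion * pmax(1 - hat, 0))
  out <- res / denom
  # Overwrite whatever the division produced (Inf, NaN or a huge value)
  # so the result does not depend on the compiler or BLAS in use.
  out[hat > 1 - tol] <- NaN
  out
}

# Second and third cases have hat equal to, or within rounding error of, 1.
rstandard_sketch(res = c(0.5, 0, 1e-9), hat = c(0.5, 1, 1 - 1e-16))
```

The point of the explicit overwrite is that 0/0 and tiny/tiny are otherwise at the mercy of rounding, so the same model could give NaN, Inf or a large finite number on different platforms.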

My take is that plot.lm should handle this: most, but not all, of its ylim calculations use na.rm=TRUE, but Inf has not been considered because it is theoretically impossible.
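
One way a ylim calculation could be made robust to Inf as well as NA/NaN is the `finite` argument of `range()`. This is only a sketch of the idea; `finite_ylim` is a hypothetical helper, and plot.lm computes its limits inline rather than through such a function:

```r
# Hypothetical helper: axis limits from the finite values only.
finite_ylim <- function(x) {
  # finite = TRUE drops NA, NaN, Inf and -Inf before taking the range;
  # na.rm = TRUE alone would drop NA and NaN but keep Inf.
  range(x, finite = TRUE)
}

rs <- c(-1.2, NaN, 0.4, Inf, NA, 2.5)
range(rs, na.rm = TRUE)  # upper limit is still Inf
finite_ylim(rs)          # limits taken from -1.2, 0.4 and 2.5 only
```

If every value is non-finite, `range()` has nothing left to work with and warns, so a caller would still need to decide what to plot in that case.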

Note that plot.lm does not use rstandard and so needs a separate fix.

Thanks for the report.

On Tue, 6 Dec 2005 Heather.Turner@warwick.ac.uk wrote:

> Full_Name: Heather Turner
> Version: 2.2.0
> OS: Windows XP
> Submission from: (NULL) (137.205.240.44)
>
>
> Standardized residuals as calculated by rstandard.lm, rstandard.glm and plot.lm
> are Inf/NaN rather than zero when the un-standardized residuals are zero. This
> causes plot.lm to break when calculating 'ylim' for any of the plots of
> standardized residuals. Example:
>
> "occupationalStatus" <-
> structure(as.integer(c(50, 16, 12, 11, 2, 12, 0, 0, 19, 40, 35,
> 20, 8, 28, 6, 3, 26, 34, 65, 58, 12, 102, 19, 14, 8,
> 18, 66, 110, 23, 162, 40, 32, 7, 11, 35, 40, 25, 90,
> 21, 15, 11, 20, 88, 183, 46, 554, 158, 126, 6, 8,
> 23,
> 64, 28, 230, 143, 91, 2, 3, 21, 32, 12, 177, 71,
> 106)
> ), .Dim = as.integer(c(8, 8)), .Dimnames =
> structure(list(origin = c("1", "2", "3", "4", "5", "6", "7",
> "8"),
> destination = c("1", "2", "3", "4", "5", "6", "7",
> "8")), .Names = c("origin", "destination")),
> class = "table")
> Diag <- as.factor(diag(1:8))
> Rscore <- scale(as.numeric(row(occupationalStatus)), scale = FALSE)
> Cscore <- scale(as.numeric(col(occupationalStatus)), scale = FALSE)
> Uniform <- glm(Freq ~ origin + destination + Diag +
> Rscore:Cscore, family = poisson, data = occupationalStatus)
> residuals(Uniform)[as.logical(diag(8))] #zero/near-zero
> rstandard(Uniform)[as.logical(diag(8))] #mostly Inf/NaN
> plot(Uniform) #breaks on qqnorm plot (or any 'which' > 1)
>
> This could be fixed by replacing standardized residuals with zero where the hat
> value is one, e.g.
> rstandard.glm <- function (model,
> infl = lm.influence(model, do.coef = FALSE),
> ...) {
> res <- infl$wt.res
> hat <- infl$hat
> ifelse(hat == 1, 0, res/sqrt(summary(model)$dispersion * (1 -
> infl$hat)))
> }
> etc.
>
> ______________________________________________
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

-- 
Brian D. Ripley,                  ripley@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

Received on Wed Dec 07 04:15:58 2005

This archive was generated by hypermail 2.1.8: Wed 07 Dec 2005 - 14:21:33 GMT