From: Martin Maechler <maechler_at_stat.math.ethz.ch>

Date: Mon 31 Jul 2006 - 07:08:58 GMT

R-devel@r-project.org mailing list

https://stat.ethz.ch/mailman/listinfo/r-devel Received on Mon Jul 31 17:12:46 2006

Date: Mon 31 Jul 2006 - 07:08:58 GMT

>>>>> "RobCar" == Carnell, Rob C <CarnellR@BATTELLE.ORG> >>>>> on Sun, 30 Jul 2006 19:42:29 -0400 writes:

RobCar> NIST maintains a repository of Statistical Reference RobCar> Datasets at http://www.itl.nist.gov/div898/strd/. I RobCar> have been working through the datasets to compare RobCar> R's results to their references with the hope that RobCar> if all works well, this could become a validation RobCar> package. RobCar> All the linear regression datasets give results with RobCar> some degree of accuracy except one. The NIST model RobCar> includes 11 parameters, but R will not compute theRobCar> estimates for all 11 parameters because it finds the RobCar> data matrix to be singular.

RobCar> The code I used is below. Any help in getting R to RobCar> estimate all 11 regression parameters would be RobCar> greatly appreciated. RobCar> I am posting this to the R-devel list since I thinkRobCar> that the discussion might involve the limitations of RobCar> platform precision.

RobCar> I am using R 2.3.1 for Windows XP.

RobCar> rm(list=ls())

RobCar> require(gsubfn)

RobCar> defaultPath <- "my path"

RobCar> data.base <- "http://www.itl.nist.gov/div898/strd/lls/data/LINKS/DATA"

Here is a slight improvement {note the function file.path(); and
model <- ..; also poly(V2, 10) !}

which shows you how to tell lm() to "believe" in 10 digit
precision of input data.

reg.data <- paste(data.base, "/Filip.dat", sep="") filePath <- file.path(defaultPath, "NISTtest.dat") download.file(reg.data, filePath, quiet=TRUE)

A <- read.table(filePath, skip=60, strip.white=TRUE)

## If you really need high-order polynomial regression in S and R,
## *DO* as you are told in all good books, and use orthogonal polynomials:

(lm.ok <- lm(V1 ~ poly(V2,10), data = A))

## and there is no problem

summary(lm.ok)

## But if you insist on doing nonsense ....

model <- "V1 ~ V2+ I(V2^2)+I(V2^3)+I(V2^4)+I(V2^5)+I(V2^6)+I(V2^7)+I(V2^8)+I(V2^9)+I(V2^10)"

## MM: "better":

(model <- paste("V1 ~ V2", paste("+ I(V2^", 2:10, ")", sep='', collapse='')))

(form <- formula(model))

mod.mat <- model.matrix(form, data = A)

dim(mod.mat) ## 82 11

(m.qr <- qr(mod.mat ))$rank # -> 10 (only, instead of 11)

(m.qr <- qr(mod.mat, tol = 1e-10))$rank # -> 11

(lm.def <- lm(form, data = A)) ## last coef. is NA

(lm.plus <- lm(form, data = A, tol = 1e-10))## no NA coefficients

RobCar> reg.data <- paste(data.base, "/Filip.dat", sep="")

RobCar> model <- RobCar> "V1~V2+I(V2^2)+I(V2^3)+I(V2^4)+I(V2^5)+I(V2^6)+I(V2^7)+I(V2^8)+I(V2^9)+I RobCar> (V2^10)"

RobCar> filePath <- paste(defaultPath, "//NISTtest.dat", sep="") RobCar> download.file(reg.data, filePath, quiet=TRUE)

RobCar> A <- read.table(filePath, skip=60, strip.white=TRUE) RobCar> lm.data <- lm(formula(model), A)

RobCar> lm.data

RobCar> Rob Carnell

A propos NIST StRD:

If you go further to NONlinear regression,
and use nls(), you will see that high quality statistics
packages such as R do *NOT* always conform to NIST -- at least
not to what NIST did about 5 years ago when I last looked.
There are many nonlinear least squares problems where the
correct result is *NO CONVERGENCE* (because of
over-parametrization, ill-posednes, ...),
owever many (cr.p) pieces of software do "converge"---falsely.
I think you find more on this topic in the monograph of
Bates and Watts (1988), but in any case,
just install and use the CRAN R package 'NISTnls' by Doug Bates
which contains the data sets with documentation and example
calls.

Martin Maechler, ETH Zurich

R-devel@r-project.org mailing list

https://stat.ethz.ch/mailman/listinfo/r-devel Received on Mon Jul 31 17:12:46 2006

Archive maintained by Robert King, hosted by
the discipline of
statistics at the
University of Newcastle,
Australia.

Archive generated by hypermail 2.1.8, at Mon 31 Jul 2006 - 12:27:17 GMT.

*
Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-help.
Please read the posting
guide before posting to the list.
*