Re: [R] Moving window regressions - how can I improve this code?

About this list Date view Thread view Subject view Author view Attachment view

From: Douglas Bates (bates@stat.wisc.edu)
Date: Sun 16 May 2004 - 20:51:39 EST


Message-id: <6r8yfsve4k.fsf@bates4.stat.wisc.edu>

Ajay Shah <ajayshah@mayin.org> writes:

> I had written a posting, some weeks ago, where I had written
> fortrannish code for "moving window regressions" (also called 'rolling
> window regression'), and asked how to do the code better. Both of you
> had put much pain on this, and emerged with this great code:
>
> data(women)
> movingWindow2 <- function(formula, data, width, ...) {
> mCall = match.call()
> mCall$width = NULL
> mCall[[1]] = as.name("lm")
> mCall$x = mCall$y = TRUE
> bigfit = eval(mCall, parent.frame())
> ncoef = length(coef(bigfit))
> nr = nrow(data)
> width = as.integer(width)[1]
> stopifnot(width >= ncoef, width <= nr)
> y = bigfit$y
> x = bigfit$x
> terms = bigfit$terms
> inds = embed(seq(nr), width)[, rev(seq(width))]
> sumrys <- lapply(seq(nrow(inds)),
> function(st) {
> ind = inds[st,]
> fit = lm.fit(x[ind,], y[ind])
> fit$terms = terms
> class(fit) = "lm"
> summary(fit)
> })
> list(coefficients = sapply(sumrys, function(sm) coef(sm)[,"Estimate"]),
> Std.Error = sapply(sumrys, function(sm) coef(sm)[,"Std. Error"]),
> sigma = sapply(sumrys, "[[", "sigma"),
> r.squared = sapply(sumrys, "[[", "r.squared"))
> }
>
> I have one piece of information, and one bugreport:
>
> * I timed this, and compared it against my fortrannish code, and
> this is roughly 2.5x faster :-)
>
> > junk = data.frame(x=runif(1000), y=runif(1000))
> > system.time(movingWindowRegression(women, 1000, 9, weight ~ height, 2))
> [1] 20.07 0.01 20.80 0.00 0.00
> > system.time(movingWindow2(y ~ x, junk, 10))
> [1] 8.27 0.03 8.43 0.00 0.00

You seem to be comparing timings on different data sets and models.
Did you mean to use junk and y ~ x in your first timing call?

> (My notebook is a Celeron @ 500 Mhz).
>
> * I find it breaks when I ask him to do a regression on 1:
>
> > r = movingWindowRegression(junk, 1000, 10, y ~ 1, 1)
> > movingWindow2(y ~ 1, junk, 10)
> Error in lm.fit(x[ind, ], y[ind]) : `x' must be a matrix

It looks like I made the common mistake of forgetting to add
drop=FALSE when extracting a subset of a matrix in a context where the
result must be a matrix. (With the default of drop = TRUE, dimensions
for which the range of the index is of length 1 are dropped.)

Try changing the call of lm.fit to
   lm.fit(x[ind, , drop = FALSE), y[ind])

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


About this list Date view Thread view Subject view Author view Attachment view

This archive was generated by hypermail 2.1.3 : Mon 31 May 2004 - 23:05:11 EST