**From:** Ajay Shah (*ajayshah@mayin.org*)

**Date:** Sun 16 May 2004 - 16:15:12 EST

**Next message:**Prof Brian Ripley: "Re: [R] importing text file with duplicate rows / indexing rows and columns"**Previous message:**Gabor Grothendieck: "Re: [R] How to restore and edit saved graphics?"**Next in thread:**Prof Brian Ripley: "Re: [R] Moving window regressions - how can I improve this code?"**Reply:**Prof Brian Ripley: "Re: [R] Moving window regressions - how can I improve this code?"**Reply:**Douglas Bates: "Re: [R] Moving window regressions - how can I improve this code?"**Maybe reply:**Gabor Grothendieck: "Re: [R] Moving window regressions - how can I improve this code?"

Message-id: <20040516061512.GR791@igidr.ac.in>

Prof. Bates, Gabor,

I had written a posting, some weeks ago, where I had written

fortrannish code for "moving window regressions" (also called 'rolling

window regression'), and asked how to do the code better. Both of you

had put much pain on this, and emerged with this great code:

data(women)

movingWindow2 <- function(formula, data, width, ...) {

mCall = match.call()

mCall$width = NULL

mCall[[1]] = as.name("lm")

mCall$x = mCall$y = TRUE

bigfit = eval(mCall, parent.frame())

ncoef = length(coef(bigfit))

nr = nrow(data)

width = as.integer(width)[1]

stopifnot(width >= ncoef, width <= nr)

y = bigfit$y

x = bigfit$x

terms = bigfit$terms

inds = embed(seq(nr), width)[, rev(seq(width))]

sumrys <- lapply(seq(nrow(inds)),

function(st) {

ind = inds[st,]

fit = lm.fit(x[ind,], y[ind])

fit$terms = terms

class(fit) = "lm"

summary(fit)

})

list(coefficients = sapply(sumrys, function(sm) coef(sm)[,"Estimate"]),

Std.Error = sapply(sumrys, function(sm) coef(sm)[,"Std. Error"]),

sigma = sapply(sumrys, "[[", "sigma"),

r.squared = sapply(sumrys, "[[", "r.squared"))

}

I have one piece of information, and one bugreport:

* I timed this, and compared it against my fortrannish code, and

this is roughly 2.5x faster :-)

* > junk = data.frame(x=runif(1000), y=runif(1000))
*

* > system.time(movingWindowRegression(women, 1000, 9, weight ~ height, 2))
*

[1] 20.07 0.01 20.80 0.00 0.00

* > system.time(movingWindow2(y ~ x, junk, 10))
*

[1] 8.27 0.03 8.43 0.00 0.00

(My notebook is a Celeron @ 500 Mhz).

* I find it breaks when I ask him to do a regression on 1:

* > r = movingWindowRegression(junk, 1000, 10, y ~ 1, 1)
*

* > movingWindow2(y ~ 1, junk, 10)
*

Error in lm.fit(x[ind, ], y[ind]) : `x' must be a matrix

I am in awe of the movingWindow2 code but don't quite know what to

tamper with! :-) Could you please help?

movingWindow2 should be renamed to movingWindow, and it should really

be in some standard CRAN package. I know a lot of people who do moving

window estimation all the time and this fits. It does moving window

regressions, and using the formula X ~ 1, it also gives you rolling

window means and standard deviations.

For anyone who reads this post, and is interested in moving window

regressions, please do go back into the list archives and follow this

thread. The comments and suggestions by everyone have been incredibly

insightful. For the sake of completeness, my old code is at EOF here -

you will find it useful only in case you want to run

movingWindowRegression in my bugdemo above.

As an aside, I noticed there are roll() and rollOLS() functions in the

(commercial) Finmetrics library that's sold with S+. Their prototypes

are :

rollOLS(formula, data, subset, na.rm=F, contrasts=NULL, start=NULL,

end=NULL, width=NULL, incr=1, tau=1e-10, trace=T, ...)

roll(FUNCTION, data, width, incr=1, start=NULL, end=NULL,

na.rm=F, save.list=NULL, arg.data="data", trace=T, ...)

here FUNCTION is the name of a S function for which rolling estimation

will be performed. The function must take an option "data".

Their examples are:

stack.dat = data.frame(Loss=stack.loss, stack.x)

roll("OLS", stack.dat, 7, formula=Loss~Air.Flow+Water.Temp, save.list="coef")

rollOLS(Loss~Water.Temp, data=stack.dat, width=6, incr=2)

In the case of rollOLS, they say that an efficient addition and

deletion algorithm is used for fast exection.

-- Ajay Shah Consultant ajayshah@mayin.org Department of Economic Affairs http://www.mayin.org/ajayshah Ministry of Finance, New Delhi# Using moving windows, we do an OLS regression. # # The function returns a matrix with T rows. So there are plenty of # "NA" values in it. The columns are organised as: # beta1 s1 beta2 s2 beta3 s3 sigma R^2 # i.e. we have each regression coefficient followed by it's sigma, # and then at the end we have the residual sigma and the regression R^2. movingWindowRegression <- function(data, T, width, model, K) { results = matrix(nrow=T, ncol=2*K+2) for (i in width:T) { details <- summary.lm(lm(model, data[(i-width+1):i,])) n=1; for (j in 1:K) { results[i, n:(n+1)] = details$coefficients[j, 1:2] n = n + 2 } results[i, n] = details$sigma results[i, n+1] = details$r.squared } return(results) }

______________________________________________ R-help@stat.math.ethz.ch mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

**Next message:**Prof Brian Ripley: "Re: [R] importing text file with duplicate rows / indexing rows and columns"**Previous message:**Gabor Grothendieck: "Re: [R] How to restore and edit saved graphics?"**Next in thread:**Prof Brian Ripley: "Re: [R] Moving window regressions - how can I improve this code?"**Reply:**Prof Brian Ripley: "Re: [R] Moving window regressions - how can I improve this code?"**Reply:**Douglas Bates: "Re: [R] Moving window regressions - how can I improve this code?"**Maybe reply:**Gabor Grothendieck: "Re: [R] Moving window regressions - how can I improve this code?"

*
This archive was generated by hypermail 2.1.3
: Mon 31 May 2004 - 23:05:11 EST
*