Re: [R] recursive beta with cutoffs on large data set

From: Dirk Eddelbuettel <edd_at_debian.org>
Date: Sun, 15 Jun 2008 12:24:54 -0500

Ivo,

On 15 June 2008 at 12:50, ivo welch wrote:
| dear R experts: I have an academic question that borders on asking
| for consulting help, so I hope I am not too imposing. If I am, please
| ignore me.
|
| My data set is 100MB of daily stock returns. I want to
| compute rolling (recursive?) betas---either bivariate or
| multivariate---with respect to some other data time series. Many of
| these regressions are "take away the first observation, add one
| observation at the end," which means I really have only about 30,000
| unique regressions---still, quite a good number. Worse, I want to
| winsorize the rolling y-vector at different levels (99%&1%, 98%&2%,
| ...), so I want to repeat this procedure a few hundred times at
| different winsorization levels.
|
| The most important version of my task is bivariate regressions, which
| may mean that I don't even need MV overhead.
|
| I was even thinking of coding in C rather than R for speed sake, but I
| am now thinking that learning the intricacies of fast vector
| processing on x86 processors is so difficult, I would be done running
| in R faster before I would be done programming it.
|
| Has anyone done something like this? Any recommendations for what
| could help give me the high speed I probably need for a task like
| this? Any thoughts?

See    

    help(lm)

which says

    'lm' calls the lower level functions 'lm.fit', etc, see below, for
    the actual numerical computations.  For programming only, you may
    consider doing likewise.

which suggests calling lm.fit directly for this kind of bare-bones regression in R (e.g. in the context of bootstraps or extended simulations).
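A minimal sketch of what that looks like for a single bivariate regression. The data are simulated and the names (x, y, X) are purely illustrative; the one thing to remember is that lm.fit takes the design matrix itself, so the intercept column has to be added by hand:

```r
# Simulated data; all names here are illustrative assumptions.
set.seed(1)
n <- 250
x <- rnorm(n)                             # e.g. a market return series
y <- 0.5 + 1.2 * x + rnorm(n, sd = 0.1)   # e.g. a stock return series

# lm.fit() expects the design matrix, so build the intercept column by hand.
X <- cbind(Intercept = 1, x = x)

fit_fast <- lm.fit(X, y)   # skips formula parsing and model-frame construction
fit_full <- lm(y ~ x)      # the usual high-level interface, for comparison

beta_fast <- unname(fit_fast$coefficients["x"])
beta_full <- unname(coef(fit_full)["x"])
```

Both calls should give the same slope; lm.fit simply avoids the per-call overhead that adds up over tens of thousands of regressions.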

You have to think about where your bottlenecks really are. Maybe they are in the data preparation, i.e. in building all your rolling, winsorized windows. If that is the case, I'd stay in R. Otherwise, interfacing an OLS routine from LAPACK etc. is not too hard from C/C++, and you get plenty of examples in the R sources.
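One rough way to find out is to time the two stages separately with system.time. This is only a sketch on simulated data: the winsorize helper, the window width of 250, and the 1%/99% cutoffs are assumptions for illustration, not anything from your setup.

```r
# Hypothetical helper: clamp y at its p and (1 - p) sample quantiles.
winsorize <- function(y, p = 0.01) {
  q <- quantile(y, c(p, 1 - p), names = FALSE)
  pmin(pmax(y, q[1]), q[2])
}

set.seed(42)
n <- 2000
width <- 250                       # assumed rolling-window length
x <- rnorm(n)
y <- 1.2 * x + rnorm(n)
starts <- seq_len(n - width + 1)   # first index of each window

# Stage 1 alone: data preparation (winsorize the y-vector of every window).
t_prep <- system.time(
  ywin <- lapply(starts, function(s) winsorize(y[s:(s + width - 1)]))
)

# Stage 2 alone: one bare-bones regression per prepared window.
t_fit <- system.time(
  betas <- vapply(starts, function(s) {
    X <- cbind(1, x[s:(s + width - 1)])
    lm.fit(X, ywin[[s]])$coefficients[2]
  }, numeric(1))
)

# Elapsed seconds per stage; the larger one is where to spend effort.
print(c(prep = t_prep[["elapsed"]], fit = t_fit[["elapsed"]]))
```

If the preparation stage dominates, moving the regression to C buys little; Rprof gives a finer breakdown if you need one.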

| (I am right now working on getting blas-atlas to compile on my gentoo
| system. It just died in the compilation over something.)

[ On Debian, it has been only an 'apt-get install' away for almost six years now. Similarly, Ubuntu has had Atlas-enabled R ever since it started. ]

Hth, Dirk

-- 
Three out of two people have difficulties with fractions.

______________________________________________
R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Received on Sun 15 Jun 2008 - 17:29:22 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Sun 15 Jun 2008 - 17:30:40 GMT.
