Re: [R] loop over large dataset

From: Federico Calboli <f.calboli_at_imperial.ac.uk>
Date: Tue 05 Jul 2005 - 00:22:37 EST

On 4 Jul 2005, at 15:15, Peter Dalgaard wrote:
>
> Your original code got lost in the threading, but that order of
> magnitude suggests that you have N^2/2 behaviour somewhere. The
> typical culprit is code like
>
> x <- numeric(0)
> for (i in 1:N){
> newx <- <<....>>
> x <- c(x, newx)
> }
>
> in which the extension of x causes the whole thing to be reallocated
> and copied. Same thing with cbind and rbind constructs of course.
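
As an illustration of the pattern described in the quote above, here is a minimal sketch comparing a vector grown with c() against one allocated once at full length. The helper compute_value() is a hypothetical stand-in for the elided per-iteration computation, and N is an arbitrary size chosen for the example.

```r
# Hypothetical stand-in for the elided per-iteration computation
compute_value <- function(i) i^2

N <- 10000  # arbitrary size, for illustration only

# Growing pattern: every c() call allocates a new vector and copies the old one
grow <- function(N) {
  x <- numeric(0)
  for (i in seq_len(N)) {
    x <- c(x, compute_value(i))
  }
  x
}

# Preallocated pattern: the vector is created once and filled by index
prealloc <- function(N) {
  x <- numeric(N)
  for (i in seq_len(N)) {
    x[i] <- compute_value(i)
  }
  x
}

x_grow <- grow(N)
x_pre  <- prealloc(N)
```

Both functions return the same vector; wrapping each call in system.time() with a larger N should make the difference in cost visible.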

I changed my code a bit, and now the runtime is down to less than a minute (from more than 24 hours). I was copying a large dataset many times over; when I extracted the columns I need as independent vectors, runtime dropped like a stone.
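
The actual code is not included in this message, so the following is only a sketch of the kind of change described, under stated assumptions: the data frame dat, its columns a, b and unused, and the per-row computation are all made up for illustration. It contrasts assigning into a data frame inside a loop with pulling the needed columns out once and looping over plain vectors.

```r
# Made-up data standing in for the real dataset (column names are assumptions)
set.seed(1)
n <- 2000
dat <- data.frame(a = rnorm(n), b = rnorm(n), unused = rnorm(n))

# Slow pattern: writing into the data frame on every iteration,
# which can trigger repeated copying of the data
slow <- function(dat) {
  dat$out <- numeric(nrow(dat))
  for (i in seq_len(nrow(dat))) {
    dat$out[i] <- dat$a[i] * dat$b[i]
  }
  dat$out
}

# Faster pattern: extract the needed columns once as independent vectors,
# then loop over those vectors only
fast <- function(dat) {
  a <- dat$a
  b <- dat$b
  out <- numeric(length(a))
  for (i in seq_along(a)) {
    out[i] <- a[i] * b[i]
  }
  out
}

res_slow <- slow(dat)
res_fast <- fast(dat)
```

The two functions produce identical results; comparing system.time(slow(dat)) with system.time(fast(dat)) on a larger n is one way to see how much the data frame assignments cost.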

Cheers,

Federico

--
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St. Mary's Campus
Norfolk Place, London W2 1PG

Tel +44 (0)20 75941602   Fax +44 (0)20 75943193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com

Received on Tue Jul 05 00:27:03 2005
