Re: [R] efficient code. how to reduce running time?

From: Prof Brian Ripley <ripley_at_stats.ox.ac.uk>
Date: Mon 22 Jan 2007 - 16:06:12 GMT

On Mon, 22 Jan 2007, Charilaos Skiadas wrote:

> On Jan 21, 2007, at 8:11 PM, John Fox wrote:
>
>> Dear Haris,
>>
>> Using lapply() et al. may produce cleaner code, but it won't necessarily
>> speed up a computation. For example:
>>
>>> X <- data.frame(matrix(rnorm(1000*1000), 1000, 1000))
>>> y <- rnorm(1000)
>>>
>>> mods <- as.list(1:1000)
>>> system.time(for (i in 1:1000) mods[[i]] <- lm(y ~ X[,i]))
>> [1] 40.53 0.05 40.61 NA NA
>>>
>>> system.time(mods <- lapply(as.list(X), function(x) lm(y ~ x)))
>> [1] 53.29 0.37 53.94 NA NA
>>
> Interesting, in my system the results are quite different:
>
> > system.time(for (i in 1:1000) mods[[i]] <- lm(y ~ X[,i]))
> [1] 192.035 12.601 797.094 0.000 0.000
> > system.time(mods <- lapply(as.list(X), function(x) lm(y ~ x)))
> [1] 59.913 9.918 289.030 0.000 0.000
>
> Regular MacOSX install with ~760MB memory.

But MacOS X is infamous for having rather specific speed problems with its malloc, and so gives different timing results from all other platforms. We are promised a solution in MacOS 10.5.

Both of your machines seem very slow compared to mine:

> system.time(for (i in 1:1000) mods[[i]] <- lm(y ~ X[,i]))
   user  system elapsed
 11.011   0.250  11.311
> system.time(mods <- lapply(as.list(X), function(x) lm(y ~ x)))
   user  system elapsed
 13.463   0.260  13.812

and that on a 64-bit platform (AMD64 Linux FC5).
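
For anyone wanting to repeat the comparison on their own machine, a scaled-down version of the two timings might look like the sketch below. It assumes 200 columns rather than 1000 so that it finishes quickly. The seed, the preallocated list, the coefficient check, and the names n, mods_loop, mods_lapply, t_loop, t_lapply and same are additions for illustration and do not appear in the original posts. No timings are shown, since they depend on the platform.

```r
# Scaled-down version of the comparison in this thread.
# Assumption: 200 columns instead of 1000, purely to keep the run short.
set.seed(1)                       # assumption: seed added for reproducibility
n <- 200
X <- data.frame(matrix(rnorm(n * n), n, n))
y <- rnorm(n)

# Explicit for loop, filling a preallocated list.
t_loop <- system.time({
    mods_loop <- vector("list", n)
    for (i in seq_len(n)) mods_loop[[i]] <- lm(y ~ X[, i])
})

# lapply() over the columns of the data frame.
t_lapply <- system.time(
    mods_lapply <- lapply(X, function(x) lm(y ~ x))
)

# Both approaches fit the same models, so the slopes should agree.
same <- all.equal(
    unname(sapply(mods_loop,   function(m) coef(m)[2])),
    unname(sapply(mods_lapply, function(m) coef(m)[2]))
)
stopifnot(isTRUE(same))

# User, system and elapsed times side by side.
print(rbind(loop = t_loop, lapply = t_lapply)[, 1:3])
```

Since each iteration is dominated by the call to lm(), the choice between the loop and lapply() would be expected to matter little on most platforms, which is consistent with the Linux timings above.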

-- 
Brian D. Ripley,                  ripley@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Received on Tue Jan 23 03:20:51 2007
