Re: [Rd] Speed of for loops

From: Herve Pages <hpages_at_fhcrc.org>
Date: Tue 30 Jan 2007 - 22:29:12 GMT

Tom McCallum wrote:
> Hi Everyone,
>
> I have a question about for loops. If you have something like:
>
> f <- function(x) {
> y <- rep(NA,10);
> for( i in 1:10 ) {
> if ( i > 3 ) {
> if ( is.na(y[i-3]) == FALSE ) {
> # some calculation F which depends on one or more of the previously
> generated values in the series
> y[i] = y[i-1]+x[i];
> } else {
> y[i] <- x[i];
> }
> }
> }
> y
> }
>
> e.g.
>

>> f(c(1,2,3,4,5,6,7,8,9,10,11,12));

> [1] NA NA NA 4 5 6 13 21 30 40
>
> is there a faster way to process this than with a 'for' loop? I have
> looked at lapply as well but I have read that lapply is no faster than a
> for loop and for my particular application it is easier to use a for loop.
> Also I have seen 'rle' which I think may help me but am not sure as I have
> only just come across it, any ideas?

Hi Tom,

In the general case, you need a loop in order to propagate calculations and their results across a vector.

In _your_ particular case however, it seems that all you are doing is a cumulative sum on x (at least this is what's happening for i >= 6). So you could do:

f2 <- function(x)
{

    offset <- 3
    start_propagate_at <- 6
    y_length <- 10
    init_range <- (offset+1):start_propagate_at     y <- rep(NA, offset)
    y[init_range] <- x[init_range]
    y[start_propagate_at:y_length] <- cumsum(x[start_propagate_at:y_length])     y
}

and it will return the same thing as your function 'f' (at least when 'x' doesn't contain NAs) but it's not faster :-/

IMO, using sapply for propagating calculations across a vector is not appropriate because:

  (1) It requires special care. For example, this:

        > x <- 1:10
        > sapply(2:length(x), function(i) {x[i] <- x[i-1]+x[i]})

      doesn't work because the 'x' symbol on the left side of the <- in the
      anonymous function doesn't refer to the 'x' symbol defined in the global
      environment. So you need to use tricks like this:

        > sapply(2:length(x),
                 function(i) {x[i] <- x[i-1]+x[i]; assign("x", x, envir=.GlobalEnv); x[i]})

  (2) Because of this kind of tricks, then it is _very_ slow (about 10 times
      slower or more than a 'for' loop).

Cheers,
H.

>
> Many thanks
>
> Tom
>
>
>



R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel Received on Wed Jan 31 09:38:38 2007

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.1.8, at Wed 31 Jan 2007 - 00:31:16 GMT.

Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-devel. Please read the posting guide before posting to the list.