From: Bill Dunlap <bill_at_insightful.com>

Date: Mon, 14 Apr 2008 16:41:38 -0700 (PDT)

Bill Dunlap

Insightful Corporation

bill at insightful dot com

x

},

`f6` = # same as f5, but not compiled with jit.

},

`f7` =

}

)

expected <- list(

)

)

)))

R-devel_at_r-project.org mailing list

https://stat.ethz.ch/mailman/listinfo/r-devel Received on Mon 14 Apr 2008 - 23:44:11 GMT

Date: Mon, 14 Apr 2008 16:41:38 -0700 (PDT)

On Mon, 14 Apr 2008, Stephen Milborrow wrote:

> > Le sam. 12 avr. à 12:47, carlos martinez a écrit :

*> > Looking for a simple, effective a minimum execution time solution.
**> >
**> > For a vector as:
**> >
**> > c(0,0,1,0,1,1,1,0,0,1,1,0,1,0,1,1,1,1,1,1)
**> >
**> > To transform it to the following vector without using any loops:
**> >
**> > (0,0,1,0,1,2,3,0,0,1,2,0,1,0,1,2,3,4,5,6)
**>
**> Here is a fast solution using the Ra just-in-time compiler
**> www.milbo.users.sonic.net/ra.
**>
**> jit(1)
**> if (length(x) > 1)
**> for (i in 2:length(x))
**> if (x[i])
**> x[i] <- x[i-1] + 1
**>
**> The times in seconds for various solutions mailed to r-devel are listed
**> below. There is some variation between runs and with the contents of x. The
**> times shown are for
**>
**> set.seed(1066); x <- as.double(runif(1e6) > .5)
**>
**> This was tested on a WinXP 3 GHz Pentium D with Ra 1.0.7 (based on R 2.6.2).
**> The code to generate these results is attached.
**>
**> vin 24
**> greg 11
**> had 3.9
**> dan 1.4
**> dan2 1.4
**> jit 0.25 # code is shown above, 7 secs with standard R 2.6.2>
*

Stephen's solution is certainly easy to read and write.

Another solution, if I understand the scope of the problem, is

f7 <- function(x){ tmp<-cumsum(x);tmp-cummax((!x)*tmp)}

I made a script to run all the functions I noticed (except for the library(inline) one) on various 0-1 vectors of length one million and got the following timings on R 2.6.2 running on my Windows laptop (Lenovo T61, Core 2 Duo at 2 GHz). "error thrown" means the function call died and "incorrect" means it returned the wrong answer.

> source("z:/dumpdata.R")

Timing stopped at: 0.05 0 0.05 NA NA

Error in dots[[1L]][[1L]] : subscript out of bounds
Timing stopped at: 0.14 0.06 0.21 NA NA
Error in x[x == 1] <- unlist(lapply(ends - starts, function(n) 1:n)) :

incompatible types (from NULL to double) in subassignment type fix all.ones all.zeros few.long many.short f1 7.02 7.03 7.04 7.03 f2 0.13 0.13 0.13 2.52 f3 error thrown 35.47 incorrect incorrect f4 0.19 error thrown 0.21 1.20 f5 0.28 0.09 0.29 0.18 f6 5.40 0.78 5.42 3.14 f7 0.06 0.05 0.06 0.06

I've attached the script so you can figure out whose function is whose if you care to. The lapply/mapply solution, f3, required that there be 1's at both ends of the input vector. Perhaps I miscopied the code.

Bill Dunlap

Insightful Corporation

bill at insightful dot com

"All statements in this message represent the opinions of the author and do not necessarily reflect Insightful Corporation policy or position."

The test script:

len <- 1e6 # length of vectors in tests

funs <- list(

`f1` = function(x)Reduce( function(x,y) x*y + y, x, accumulate=TRUE ),

`f2` = function(x)x * unlist(lapply(rle(x)$lengths, seq_len)),

`f3` = function(x){

ind <- which(x == 0)

unlist(lapply(mapply(seq, ind, c(tail(ind, -1) - 1, length(x))),

function(y) cumsum(x[y]))) },

`f4` = function(x) {

d <- diff(c(0,x,0))

starts <- which(d == 1)

ends <- which(d == -1)

x[x == 1] <- unlist(lapply(ends - starts, function(n) 1:n))
x },

`f5` = function(x) {

if (existsFunction("jit")) jit(1) else stop("no jit available") if (length(x) > 1)

for (i in 2:length(x)) if (x[i]) x[i] <- x[i-1] + 1

x

},

`f6` = # same as f5, but not compiled with jit.

function(x) {

if (existsFunction("jit")) jit(0) if (length(x) > 1) for (i in 2:length(x)) if (x[i]) x[i] <- x[i-1] + 1 x

},

`f7` =

function(x) {

tmp<-cumsum(x) tmp-cummax((!x)*tmp)

}

)

data <- list(

all.ones = rep(1, len), all.zeros = rep(0, len), few.long = rep( c(rep(1,len/10-1),0), len=len), # 10 long runsmany.short = rep(c(1,0), len=len) # len/2 runs of length 1 )

expected <- list(

all.ones = 1:len, all.zeros = rep(0, len), few.long = rep( c(1:(len/10-1), 0), len=len),many.short = rep(c(1,0), len=len)

)

print(noquote(sapply(names(data),

function(d) sapply(names(funs),

function(f){ time<-try(unix.time(gcFirst=TRUE, val<-funs[[f]](data[[d]]))) if (is(time, "try-error")) "error thrown" else if (!isTRUE(report <- all.equal(val, expected[[d]]))) "incorrect" else sprintf("%7.2f", unname(time[1]+time[2])) }

)

)))

R-devel_at_r-project.org mailing list

https://stat.ethz.ch/mailman/listinfo/r-devel Received on Mon 14 Apr 2008 - 23:44:11 GMT

Archive maintained by Robert King, hosted by
the discipline of
statistics at the
University of Newcastle,
Australia.

Archive generated by hypermail 2.2.0, at Tue 15 Apr 2008 - 01:31:10 GMT.

*
Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-devel.
Please read the posting
guide before posting to the list.
*