On Mon, 14 Apr 2008, Stephen Milborrow wrote:
> > Le sam. 12 avr. à 12:47, carlos martinez a écrit :
> > Looking for a simple, effective a minimum execution time solution.
> >
> > For a vector as:
> >
> > c(0,0,1,0,1,1,1,0,0,1,1,0,1,0,1,1,1,1,1,1)
> >
> > To transform it to the following vector without using any loops:
> >
> > (0,0,1,0,1,2,3,0,0,1,2,0,1,0,1,2,3,4,5,6)
>
> Here is a fast solution using the Ra just-in-time compiler
> www.milbo.users.sonic.net/ra.
>
> jit(1)
> if (length(x) > 1)
> for (i in 2:length(x))
> if (x[i])
> x[i] <- x[i-1] + 1
>
> The times in seconds for various solutions mailed to r-devel are listed
> below. There is some variation between runs and with the contents of x. The
> times shown are for
>
> set.seed(1066); x <- as.double(runif(1e6) > .5)
>
> This was tested on a WinXP 3 GHz Pentium D with Ra 1.0.7 (based on R 2.6.2).
> The code to generate these results is attached.
>
> vin 24
> greg 11
> had 3.9
> dan 1.4
> dan2 1.4
> jit 0.25 # code is shown above, 7 secs with standard R 2.6.2>
Stephen's solution is certainly easy to read and write.
Another solution, if I understand the scope of the problem, is
f7 <- function(x){ tmp<-cumsum(x);tmp-cummax((!x)*tmp)}
I made a script to run all the functions I noticed (except for the library(inline) one) on various 0-1 vectors of length one million and got the following timings on R 2.6.2 running on my Windows laptop (Lenovo T61, Core 2 Duo at 2 GHz). "error thrown" means the function call died and "incorrect" means it returned the wrong answer.
> source("z:/dumpdata.R")
Timing stopped at: 0.05 0 0.05 NA NA
Error in dots[[1L]][[1L]] : subscript out of bounds
Timing stopped at: 0.14 0.06 0.21 NA NA
Error in x[x == 1] <- unlist(lapply(ends - starts, function(n) 1:n)) :
incompatible types (from NULL to double) in subassignment type fix
all.ones all.zeros few.long many.short
f1 7.02 7.03 7.04 7.03
f2 0.13 0.13 0.13 2.52
f3 error thrown 35.47 incorrect incorrect
f4 0.19 error thrown 0.21 1.20
f5 0.28 0.09 0.29 0.18
f6 5.40 0.78 5.42 3.14
f7 0.06 0.05 0.06 0.06
I've attached the script so you can figure out whose function is whose if you care to. The lapply/mapply solution, f3, required that there be 1's at both ends of the input vector. Perhaps I miscopied the code.
"All statements in this message represent the opinions of the author and do not necessarily reflect Insightful Corporation policy or position."
The test script:
len <- 1e6 # length of vectors in tests
funs <- list(
`f1` = function(x)Reduce( function(x,y) x*y + y, x, accumulate=TRUE ),
`f2` = function(x)x * unlist(lapply(rle(x)$lengths, seq_len)),
`f3` = function(x){
ind <- which(x == 0)
unlist(lapply(mapply(seq, ind, c(tail(ind, -1) - 1, length(x))),
function(y) cumsum(x[y]))) },
`f4` = function(x) {
d <- diff(c(0,x,0))
starts <- which(d == 1)
ends <- which(d == -1)
x[x == 1] <- unlist(lapply(ends - starts, function(n) 1:n))
x },
`f5` = function(x) {
if (existsFunction("jit")) jit(1) else stop("no jit available") if (length(x) > 1)
for (i in 2:length(x))
if (x[i])
x[i] <- x[i-1] + 1
function(x) {
if (existsFunction("jit")) jit(0)
if (length(x) > 1)
for (i in 2:length(x))
if (x[i])
x[i] <- x[i-1] + 1
x
function(x) {
tmp<-cumsum(x)
tmp-cummax((!x)*tmp)
data <- list(
all.ones = rep(1, len),
all.zeros = rep(0, len),
few.long = rep( c(rep(1,len/10-1),0), len=len), # 10 long runs
many.short = rep(c(1,0), len=len) # len/2 runs of length 1
)
all.ones = 1:len,
all.zeros = rep(0, len),
few.long = rep( c(1:(len/10-1), 0), len=len),
many.short = rep(c(1,0), len=len)
print(noquote(sapply(names(data),
function(d) sapply(names(funs),
function(f){
time<-try(unix.time(gcFirst=TRUE, val<-funs[[f]](data[[d]])))
if (is(time, "try-error"))
"error thrown"
else if (!isTRUE(report <- all.equal(val, expected[[d]])))
"incorrect"
else
sprintf("%7.2f", unname(time[1]+time[2]))
}
Archive maintained by Robert King, hosted by
the discipline of
statistics at the
University of Newcastle,
Australia.
Archive generated by hypermail 2.2.0, at Tue 15 Apr 2008 - 01:31:10 GMT.
Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-devel. Please read the posting guide before posting to the list.