Re: [Rd] Any interest in "merge" and "by" implementations specifically for so

From: tshort <tshort_at_eprisolutions.com>
Date: Mon 31 Jul 2006 - 15:57:43 GMT

> Hi Tom,
>
> > Now, try sorting and using a loop:
> >
> >> idx <- order(i)
> >> xs <- x[idx]
> >> is <- i[idx]
> >> res <- array(NA, 1e6)
> >> idx <- which(diff(is) > 0)
> >> startidx <- c(1, idx+1)
> >> endidx <- c(idx, length(xs))
> >> f1 <- function(x, startidx, endidx, FUN = sum) {
> > + for (j in 1:length(res)) {
> > + res[j] <- FUN(x[startidx[j]:endidx[j]])
> > + }
> > + res
> > + }
> >> unix.time(res1 <- f1(xs, startidx, endidx))
> > [1] 6.86 0.00 7.04 NA NA
>
> I wonder how much time the sorting, reordering and creation of
> startidx and endidx would add to this time?

Done interactively, sorting and indexing seemed fast. Here are some timings:

> unix.time({idx <- order(i)
+            xs <- x[idx]
+            is <- i[idx]
+            res <- array(NA, 1e6)
+            idx <- which(diff(is) > 0)
+            startidx <- c(1, idx+1)
+            endidx <- c(idx, length(xs))
+          })
[1] 1.06 0.00 1.09 NA NA
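
For anyone who wants to repeat the preprocessing timing on their own machine, here is a minimal sketch of the same steps wrapped in a function. The data are synthetic (the x and i from earlier in the thread are not reproduced here), the name prep and the sizes are made up for illustration, and system.time is used because unix.time was an alias for it.

```r
# Assumption: synthetic data standing in for the thread's x and i.
set.seed(1)
n <- 1e5
i <- sample.int(1e4, n, replace = TRUE)   # grouping key
x <- rnorm(n)                             # values to aggregate

# Hypothetical helper: sort by key and locate each group's first/last row.
prep <- function(x, i) {
  idx <- order(i)
  xs  <- x[idx]
  is  <- i[idx]
  brk <- which(diff(is) > 0)              # last row of every group but the final one
  list(xs       = xs,
       keys     = is[c(1, brk + 1)],      # one key per group, in sorted order
       startidx = c(1, brk + 1),
       endidx   = c(brk, length(xs)))
}

timing <- system.time(p <- prep(x, i))
print(timing)
```

The group boundaries come out in sorted key order, so p$keys lines up with the output of table(i) or tapply(x, i, ...) on the unsorted data.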

> That looks interesting. Does it only work for specific operating
> systems and processors? I will give it a try.

No, as far as I know, it works on all operating systems. It also gets a little faster if you call sum directly inside the function instead of passing it in through FUN:

> f4 <- function(x, startidx, endidx) {
+   for (j in 1:length(res)) {
+     res[j] <- sum(x[startidx[j]:endidx[j]])
+   }
+   res
+ }
> f5 <- cmpfun(f4)
> unix.time(res5 <- f5(xs, startidx, endidx))
[1] 2.67 0.03 2.95 NA NA
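
Note that the loop above picks up res from the workspace (it was created with array(NA, 1e6) during the setup step), so it only works after that step has been run. A self-contained variant might look like the sketch below. Assumptions: cmpfun is taken from the compiler package that ships with current R, which may not be the same build used for the timings above; group_sum and the small synthetic data are illustrative names and values, not part of the original test.

```r
library(compiler)   # assumption: the 'compiler' package bundled with current R

# Hypothetical self-contained version of the loop: res is allocated inside,
# and its length comes from startidx rather than from a global object.
group_sum <- function(x, startidx, endidx) {
  res <- numeric(length(startidx))
  for (j in seq_along(startidx)) {
    res[j] <- sum(x[startidx[j]:endidx[j]])
  }
  res
}
group_sum_c <- cmpfun(group_sum)   # byte-compiled copy

# Small synthetic data (not the thread's x and i).
set.seed(42)
i <- sample.int(50, 1000, replace = TRUE)
x <- runif(1000)

idx      <- order(i)
xs       <- x[idx]
is       <- i[idx]
brk      <- which(diff(is) > 0)
startidx <- c(1, brk + 1)
endidx   <- c(brk, length(xs))

res <- group_sum_c(xs, startidx, endidx)
ref <- as.vector(tapply(x, i, sum))   # reference result on the unsorted data
print(all.equal(res, ref))
```

Checking against tapply on a small input like this is a cheap way to confirm the start and end indices are right before timing the large case.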
-- 
View this message in context: http://www.nabble.com/Any-interest-in-%22merge%22-and-%22by%22-implementations-specifically-for-sorted-data--tf2009595.html#a5578580
Sent from the R devel forum at Nabble.com.

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Received on Tue Aug 01 02:02:09 2006
