[R] strange timings in convolve(x,y,type="open")

From: Art Owen <owen_at_stanford.edu>
Date: Tue, 18 Dec 2007 16:58:08 -0800

Dear R-ophiles,

I've found something very odd when I apply convolve to ever larger vectors. Here is an example below with vectors ranging from 2^11 to 2^17. There is a funny bump up at 2^12. Then it gets very slow at 2^16.

> for( i in 11:20 )print( system.time(convolve(1:2^i,1:2^i,type="o")))
   user system elapsed
  0.002 0.000 0.002
   user system elapsed
  0.373 0.002 0.375
   user system elapsed
  0.014 0.001 0.016
   user system elapsed
  0.031 0.002 0.034
   user system elapsed
  0.126 0.004 0.130
   user system elapsed
194.095 0.013 194.185

   user system elapsed
  0.345 0.011 0.356

This example is run on a fedora machine with 64 bits. I hit the same wall at 2^16 on a Macbook (Intel processor I think). The fedora machine is running R 2.5.0. That's a bit old (April 07) but I saw no mention of this speed
problem in some web searching, and it's not mentioned in the 2.6 what's new notes.

I've rerun it and found the same bump at 12 and wall at 16. The timing at 2^16 can change appreciably. In one other case it was about 270 user, 271 elapsed. The 2^18 case ran for hours without ever finishing.

At first I thought that this was a memory latency issue. Maybe it is. But that makes it hard to explain why 2^17 works better than 2^16. I've seen that three times now, so I'm almost ready to call it reproducible.
Also, one of the machines I'm using has lots of memory. Maybe it's a cache issue ... but that still does not explain why 2^12 is slower than 2^13 or 2^16 is slower than 2^17.

I've checked by running convolve without type="o" and I don't see the wall. Similarly fft does not have that problem.

Here's an example without type="open"
> for( k in 11:20)print(system.time( convolve( 1:2^k,1:2^k)))
   user system elapsed
  0.001 0.000 0.000
   user system elapsed
  0.001 0.000 0.001
   user system elapsed
  0.002 0.000 0.002
   user system elapsed
  0.004 0.000 0.004
   user system elapsed
  0.009 0.001 0.010
   user system elapsed
  0.017 0.001 0.018
   user system elapsed
  0.138 0.005 0.143
   user system elapsed
  0.368 0.012 0.389
   user system elapsed
  1.010 0.032 1.051
   user system elapsed
  1.945 0.069 2.015

This is more what I expected. Something like N or N log(N) , with the difference hard to discern in granularity and noise.

The convolve function is not very big (see below). When type is not specified, it defaults to "circular". So my guess is that something mysterious might be happening inside the first else clause below, at least on some architectures.

-Art Owen

> convolve

function (x, y, conj = TRUE, type = c("circular", "open", "filter")) {

    type <- match.arg(type)
    n <- length(x)
    ny <- length(y)
    Real <- is.numeric(x) && is.numeric(y)     if (type == "circular") {

        if (ny != n)
            stop("length mismatch in convolution")
    else {
        n1 <- ny - 1
        x <- c(rep.int(0, n1), x)
        n <- length(y <- c(y, rep.int(0, n - 1)))
    x <- fft(fft(x) * (if (conj)
    else fft(y)), inv = TRUE)
    if (type == "filter")
        (if (Real)
        else x)[-c(1:n1, (n - n1 + 1):n)]/n
    else (if (Real)

    else x)/n

R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Received on Wed 19 Dec 2007 - 01:02:46 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Wed 19 Dec 2007 - 07:30:19 GMT.

Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-help. Please read the posting guide before posting to the list.