Re: [R] strange fluctuations in system.time with kernapply

From: Ravi Varadhan <rvaradhan_at_jhmi.edu>
Date: Mon, 02 May 2011 09:41:28 -0400

Why not use zero padding to improve the efficiency, i.e. append zeros to the end of the data vector so that the length of the resulting vector is a power of 2? This is very common in signal processing, and it is legitimate because zero padding does not add any new information.
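
A minimal sketch of the zero-padding idea applied to kernapply(), assuming the default non-circular mode. The wrapper name padded_kernapply is hypothetical (it is not part of the stats package); nextn() and kernapply() are from stats. In non-circular mode kernapply() drops m values at each end, so the first n - 2*m values of the padded result should be the ones computed from the original data only.

```r
# Hypothetical wrapper: pad x with zeros up to the next power of 2,
# smooth the padded vector, then keep only the values that depend
# solely on the original (unpadded) data.
padded_kernapply <- function(x, k) {
  n <- length(x)
  n_pad <- nextn(n, factors = 2)        # smallest power of 2 that is >= n
  x_pad <- c(x, rep(0, n_pad - n))      # append zeros to the end
  y <- kernapply(x_pad, k)              # result has length n_pad - 2 * k$m
  y[seq_len(n - 2 * k$m)]               # discard values influenced by padding
}

tkern <- kernel("modified.daniell", c(5, 5))
test <- rep(1, 1110308)
system.time(res <- padded_kernapply(test, tkern))
```

Padding to a power of 2 can almost double the vector length. Calling nextn(n) with its default factors (2, 3 and 5) usually gives a length much closer to n, which may be a better trade-off.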

Ravi.



Ravi Varadhan, Ph.D.
Assistant Professor,
Division of Geriatric Medicine and Gerontology
School of Medicine
Johns Hopkins University

Ph. (410) 502-2619
email: rvaradhan_at_jhmi.edu

-----Original Message-----
From: r-help-bounces_at_r-project.org [mailto:r-help-bounces_at_r-project.org] On Behalf Of Uwe Ligges
Sent: Monday, May 02, 2011 5:31 AM
To: Alexander Senger
Cc: r-help_at_r-project.org
Subject: Re: [R] strange fluctuations in system.time with kernapply

On 29.04.2011 23:38, Alexander Senger wrote:
> Hello expeRts,
>
>
> here is something which strikes me as kind of odd and I would like to
> ask for some enlightenment:
>
> First let's do this:
>
> tkern <- kernel("modified.daniell", c(5,5))
> test <- rep(1,1000000)
> system.time(kernapply(test,tkern))

> user system elapsed
> 1.100 0.040 1.136
>
> That was easy. Now this:
>
> test <- rep(1,1100000)
> system.time(kernapply(test,tkern))
> user system elapsed
> 1.40 0.02 1.43
>
> Still fine. Now this:
>
> test <- rep(1,1110000)
> system.time(kernapply(test,tkern))
> user system elapsed
> 1.390 0.020 1.409
>
> Ok, by now it seems boring. But wait:
>
> test <- rep(1,1110300)
> system.time(kernapply(test,tkern))
> user system elapsed
> 12.270 0.030 12.319
>
> There is a sudden - and repeatable! - jump in the time needed to execute
> kernapply. At least from a naive point of view there should not be much
> difference between applying a kernel to a vector 1110000 or 1110300
> entries long. But maybe there is some limit here?
>
> So I tried this:
>
> test <- rep(1,1110400)
> system.time(kernapply(test,tkern))
> user system elapsed
> 1.96 0.01 1.97
>
> which doesn't fit into the pattern. But the best thing is still to come.
> When I try this
>
> test <- rep(1,1110308)
> system.time(kernapply(test,tkern))
>
> then the computer starts to run and does so for longer than 15 minutes
> until when I normally kill the process. As noted above this behaviour is
> repeatable and occurs every time I issue these commands.
>
> I really would like to know if there is some magic to the number 1110308
> I'm not aware of.

The magic is that the length of the vector, 1110308, is inefficient for the fft() used within kernapply(). You need integer powers of 2 for a really fast FFT.

You can also try smaller numbers to get longer runtimes, e.g.: 100003

As an example, compare:

system.time(fft(rep(1, 32768))) # roughly 0 seconds
system.time(fft(rep(1, 32771))) # almost 10 seconds
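
One way to see in advance whether a given length is likely to be slow is to look at its prime factors. The R help page for fft() states that the transform is fastest when the length is highly composite (has many factors). The sketch below uses a hypothetical helper, prime_factors (plain trial division, not part of base R), together with nextn() from stats, which returns the next length that factors into the given small primes.

```r
# Hypothetical helper: prime factors of n by trial division (base R only).
prime_factors <- function(n) {
  factors <- c()
  d <- 2
  while (d * d <= n) {
    if (n %% d == 0) {
      factors <- c(factors, d)   # d divides n: record it and reduce n
      n <- n / d
    } else {
      d <- d + 1                 # otherwise try the next candidate divisor
    }
  }
  if (n > 1) factors <- c(factors, n)  # whatever remains is itself prime
  factors
}

prime_factors(32768)    # fifteen 2s: a power of 2, so fft() is fast
prime_factors(32771)    # compare with the slow case above
prime_factors(1110308)  # the length from the original question

nextn(1110308)               # next length whose only prime factors are 2, 3, 5
nextn(1110308, factors = 2)  # next power of 2: 2097152
```

A large prime factor in the output suggests a slow transform, and nextn() gives a nearby length that avoids it.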

Uwe Ligges

>
>
> Last but not least, here is my
>
> sessionInfo()
> R version 2.10.1 (2009-12-14)
> x86_64-pc-linux-gnu
>
> locale:
> [1] LC_CTYPE=de_DE.utf8 LC_NUMERIC=C
> [3] LC_TIME=de_DE.utf8 LC_COLLATE=de_DE.utf8
> [5] LC_MONETARY=C LC_MESSAGES=de_DE.utf8
> [7] LC_PAPER=de_DE.utf8 LC_NAME=C
> [9] LC_ADDRESS=C LC_TELEPHONE=C
> [11] LC_MEASUREMENT=de_DE.utf8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> loaded via a namespace (and not attached):
> [1] tools_2.10.1

>
>
> Thank you,
>
> Alex
>



R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Received on Thu 05 May 2011 - 06:25:06 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Thu 05 May 2011 - 07:00:05 GMT.
