Re: [Rd] function call overhead

From: Paul Johnson <pauljohn32_at_gmail.com>
Date: Mon, 28 Feb 2011 17:37:18 -0600

Snipping down to bare minimum history before comment:

On Wed, Feb 16, 2011 at 4:28 PM, Olaf Mersmann <olafm_at_statistik.tu-dortmund.de> wrote:
> Dear Hadley, dear list,
>
> On Wed, Feb 16, 2011 at 9:53 PM, Hadley Wickham <hadley_at_rice.edu> wrote:
>
>>> system.time(replicate(1e4, base::print))
>>   user  system elapsed
>>  0.539   0.001   0.541
>>> system.time(replicate(1e4, print))
>>   user  system elapsed
>>  0.013   0.000   0.012

>> library("microbenchmark")
>> res <- microbenchmark(print, base::print, times=10000)
>> res
>> print(res, unit="eps")
> Unit: evaluations per second
>                    min          lq      median          uq        max
> print       17543859.65 15384615.38 14705882.35 14492753.62 20665.8538
> base::print    23944.64    23064.33    22584.32    20659.88   210.5329
>

I think it is important to say that this slowdown is not unique to R and is unrelated to the fact that R is interpreted. The same happens in compiled object-oriented languages like C++ or Objective-C. There is an inherent cost in the runtime system to find a function or method that is suitable for an object.

In agent-based modeling simulations, we call it the cost of "method lookup" because the runtime system has to check for the existence of a method each time it is called for a given object. One time-saving approach is to cache the result of the lookup and then call that result directly each time through the loop. Implementing this is pretty complicated, however, and it is discouraged unless you really need it. It is especially dangerous because the optimization throws away the runtime benefit of matching the correct method to the class of the object. (See http://www.mulle-kybernetik.com/artikel/Optimization/opti-3.html, which shows how one can even cache C library functions to avoid lookup overhead. I'm told that the Obj-C 2.0 runtime will try to optimize this automatically, but I have not tested that.)
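
In R terms, the same trade-off can be sketched with S3 dispatch. This is only an illustration, not code from the thread: the generic "describe" and the classes "cat" and "dog" are made up, and getS3method() stands in for the cached lookup.

```r
# Hypothetical S3 generic with one method per (made-up) class.
describe <- function(x, ...) UseMethod("describe")
describe.cat <- function(x, ...) "meow"
describe.dog <- function(x, ...) "woof"

cat_obj <- structure(list(), class = "cat")
dog_obj <- structure(list(), class = "dog")

# Normal dispatch: the method is looked up again on every call.
describe(cat_obj)   # "meow"
describe(dog_obj)   # "woof"

# Cached lookup: resolve the method for class "cat" once,
# then call the result directly and skip dispatch.
cached <- getS3method("describe", "cat")
cached(cat_obj)     # "meow"

# The danger: the cached method no longer looks at the class,
# so a "dog" object silently gets the "cat" behaviour.
cached(dog_obj)     # "meow"
```

The last call is the point of the warning: nothing fails, the wrong method simply runs.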

The R solution achieves the same kind of speed-up by saving the result of the function lookup in a local variable. The R approach, however, is much easier to implement than the Objective-C solution. There is an obvious danger: if the saved method is not appropriate for the object it is applied to, something unpredictable will happen.
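
A minimal sketch of the local-variable approach, assuming the goal is just to avoid re-evaluating base::print inside a loop. The names cached_print, n, t_qualified and t_cached are mine, and the timings will differ by machine and R version, so I show no numbers.

```r
# `::` is itself a function, so base::print is a function call
# each time it is evaluated; a bare name is only a symbol lookup.
stopifnot(is.function(`::`))

# Do the qualified lookup once and keep the result locally.
cached_print <- base::print

n <- 1e4

# Evaluate the qualified name n times (one call to `::` per iteration).
t_qualified <- system.time(for (i in seq_len(n)) base::print)[["elapsed"]]

# Evaluate the cached local variable n times (plain variable lookup).
t_cached <- system.time(for (i in seq_len(n)) cached_print)[["elapsed"]]

# The cached value is the same function object as the qualified one.
identical(cached_print, base::print)
```

Comparing t_qualified with t_cached on your own machine shows how much of the cost comes from the repeated lookup. Note that cached_print keeps pointing at the base function even if print is later masked by something else.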

The same is true in C++. Last year I was fiddling around with the C++ code that is included with the R package Siena (awesome package, incidentally) and noticed a similar slowdown with method lookup. I was surprised to find a slowdown inside a class when an instance variable was prefixed with "this->". For an IVAR, "this->x" and "x" are the same thing, but to the runtime system, well, there's a slowdown in finding "this" class and getting x, compared to just using x. To the programmer who is trying to be clear and careful, putting "this->" on the front of an IVAR is tidy, but it also slows down the runtime a lot.

Hope this is not more confusing than when I started :)

pj

-- 
Paul E. Johnson
Professor, Political Science
1541 Lilac Lane, Room 504
University of Kansas

______________________________________________
R-devel_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Received on Mon 28 Feb 2011 - 23:39:57 GMT
