[Rd] Rprof loses all system() time

From: Andrew Piskorski <atp_at_piskorski.com>
Date: Fri, 12 Jun 2009 17:31:48 -0400

Rprof seems to ignore all time spent inside system() calls. E.g., this simple example actually takes about 10 seconds, but Rprof thinks the total time is only 0.12 seconds:

> Rprof("sleep-system.out") ; system.time(system(command="sleep 10")) ; Rprof(NULL)

   user system elapsed
  0.000 0.004 10.015
> summaryRprof("sleep-system.out")$by.total

              total.time total.pct self.time self.pct
"gc"                0.12       100      0.12      100
"system.time"       0.12       100      0.00        0


I assume that what's going on here, is that anytime R blocks waiting for data from a child process, the profiler stops running entirley, and it then loses track of all that time spend blocking.

This Rprof behavior is frustrating, because I use a database access layer which essentially works by having R call system() to run a shell script. So, if I write a really slow query that takes 10 minutes to run, as for as Rprof is concerned, those 10 minutes never happened, the query consumed zero time. In complicated code, this makes it much harder to figure out what parts are slow due to slow R code vs. slow SQL queries.

Why does Rprof behave this way? Is there something I can do to work-around or alleviate this? What do you think it would take to fix Rprof to track the time spent waiting for system() to finish, and where in the R source should I look to attempt that?


Andrew Piskorski <atp_at_piskorski.com>

R-devel_at_r-project.org mailing list
Received on Fri 12 Jun 2009 - 21:35:59 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Fri 12 Jun 2009 - 23:35:37 GMT.

Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-devel. Please read the posting guide before posting to the list.

list of date sections of archive