Re: [R] Lengthy delay in sourcing a large function

From: Gregory R. Warnes <greg_at_warnes.net>
Date: Mon, 02 Jun 2008 23:02:23 -0400

For such a large chunk of code, you might profit greatly from building an R package containing your functions. That avoids re-parsing the entire set on every run, since the functions are "pre-parsed" when the package is installed.
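
A minimal sketch of the package route, assuming the function definitions can be kept in a file of their own. The file name functions.R, the package name mypkg, and the two toy functions are hypothetical stand-ins; package.skeleton() is the helper from the utils package.

```r
# Hypothetical stand-in for the real script: a file holding only function definitions.
work_dir <- tempfile("pkgdemo")
dir.create(work_dir)
code_file <- file.path(work_dir, "functions.R")
writeLines(c(
  "small_helper <- function(x) x + 1",
  "final_function <- function(x) small_helper(x) * 2"
), code_file)

# Build a package skeleton around that file ("mypkg" is an assumed name).
utils::package.skeleton(name = "mypkg", code_files = code_file, path = work_dir)

# Show what was generated (DESCRIPTION, R/, man/, and so on).
pkg_dir <- file.path(work_dir, "mypkg")
print(list.files(pkg_dir))
```

After editing the generated DESCRIPTION and help files as the skeleton's Read-and-delete-me file describes, the package could be installed from the shell with R CMD INSTALL mypkg, and the script would then start with library(mypkg) instead of defining the functions itself.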

-G

On Jun 2, 2008, at 5:45 PM, Duncan Murdoch wrote:

> On 02/06/2008 5:28 PM, Dennis Fisher wrote:
> > Colleagues,
> >
> > I have a script that contains ~ 10,000 lines of code. Most of it is
> > written as small functions. However, for various reasons, the final
> > function is ~1500 lines of code. I realize that this may not be
> > optimal but the code evolved that way and breaking it into smaller
> > pieces is complicated because of the passing of arguments. I have
> > "cat(date())" statements at various places in the code so that I can
> > track the actions as the script is executed.
> >
> > I am running version 2.7.0 on a quad processor Mac and I call the
> > script from the OS: R --slave < Script.R
> >
> > It takes ~ 5 seconds for R to read the first 8000 lines of code (as
> > indicated by the time difference between the first record of the file
> > and the date issued immediately before the large function). Then,
> > reading the large function (1500 lines) takes ~ 1 minute. I have
> > improved the delay by moving some of the code from the large function.
> >
> > I don't understand why the second portion of the code is read so much
> > slower than the first. In that the code is a function, I presume that
> > nothing within the function is executed until the function is called.
> >
> > Does anyone have any experience with this issue?
>
> I haven't seen this sort of thing. I just wrote a (very simple and
> repetitive) 4000 line function and R read it in 4 seconds. I think
> you'll need to post the actual function somewhere to see if your one
> minute timing is reproducible.
>
> It's possible that it happens because R is short of memory, and needs to
> do a lot of swapping and garbage collection for the big function; trying
> to load that function and do nothing else except print the timings might
> be informative.
>
> Duncan Murdoch
>
> ______________________________________________
> R-help_at_r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
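
A rough sketch of the timing experiment Duncan describes above: load one large function on its own and print the timings. The generated 4000-line function is only a stand-in for the real one (its name big_fun and its contents are assumptions), so the numbers it produces say nothing about the original script.

```r
# Stand-in for the real code: write a simple, repetitive 4000-line function to a file.
big_file <- tempfile(fileext = ".R")
body_lines <- sprintf("  x <- x + %d", seq_len(4000))
writeLines(c("big_fun <- function(x) {", body_lines, "  x", "}"), big_file)

# Time parsing alone, then parsing plus evaluating the definition via source().
parse_time <- system.time(exprs <- parse(file = big_file))
source_time <- system.time(source(big_file))

print(parse_time)
print(source_time)

# Memory in use after loading; large figures here would point at memory pressure.
print(gc())
```

If parsing the real function alone is fast, the delay more likely comes from memory pressure or from something else in the script than from the function's size.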

Gregory R. Warnes, Ph.D.
Associate Professor
Center for Biodefence Immune Modeling
    and
Department of Biostatistics and Computational Biology
University of Rochester

Received on Tue 03 Jun 2008 - 03:50:06 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Tue 03 Jun 2008 - 04:30:35 GMT.
