Re: [Rd] reference counting bug related to break and next in loops

From: William Dunlap <wdunlap_at_tibco.com>
Date: Wed, 10 Jun 2009 14:35:23 -0700

Thanks Luke.

I used codetools::walkCode to look for functions that returned the value of a loop and found a surprising number in base R. However, it looks like non of these functions were written to return anything useful (the side effects were important), so the change to loop-returns-NULL should let such functions use less memory (and time) and make them correspond better to their documentation.

E.g., fixup.libraries.URLs is not documented to return anything and it returns the value of for loop:

$`src/library/utils/R/windows/linkhtml.R`[[2]] function(lib.loc = .libPaths())
{

    for (lib in lib.loc) {

        if(lib == .Library) next
        pg <- sort(.packages(all.available = TRUE, lib.loc = lib))
        for(pkg in pg)
            try(fixup.package.URLs(file.path(lib,pkg)), silent = TRUE)
    }
}

Bill Dunlap
TIBCO Software Inc - Spotfire Division
wdunlap tibco.com

> -----Original Message-----
> From: luke_at_stat.uiowa.edu [mailto:luke_at_stat.uiowa.edu]
> Sent: Wednesday, June 10, 2009 1:05 PM
> To: William Dunlap
> Cc: r-devel_at_r-project.org
> Subject: Re: [Rd] reference counting bug related to break and
> next in loops
>
> Thanks for the report.
>
> It turns out that a similar issue arises in while() loops without
> break/next being involved because the test expression is evaluated
> after the final body evaluation. After some discussion we decided it
> was simplest both for implementation and documentation to have the
> value of a loop expression always be NULL. This is now implemented in
> R-devel.
>
> luke
>
> On Tue, 2 Jun 2009, William Dunlap wrote:
>
> > One of our R users here just showed me the following problem while
> > investigating the return value of a while loop. I added some
> > information
> > on a similar bug in for loops. I think he was using 2.9.0
> > but I see the same problem on today's development version of 2.10.0
> > (svn 48703).
> >
> > Should the semantics of while and for loops be changed
> slightly to avoid
> > the memory
> > buildup that fixing this to reflect the current docs would
> entail? S+'s
> > loops return nothing useful - that change was made long ago to avoid
> > memory buildup resulting from semantics akin the R's
> present semantics.
> >
> > Bill Dunlap
> > TIBCO Software Inc - Spotfire Division
> > wdunlap tibco.com
> >
> > --------------------Forwarded (and edited) message
> >
> below---------------------------------------------------------
> ----------
> > ----------
> >
> > I think I have found another reference counting bug.
> >
> > If you type in the following in R you get what I think is the wrong
> > result.
> >
> >> i = 1; y = 1:10; q = while(T) { y[i] = 42; if (i == 8) {
> break }; i =
> > i + 1; y}; q
> > [1] 42 42 42 42 42 42 42 42 9 10
> >
> > I had expected [1] 42 42 42 42 42 42 42 8 9 10 which is
> what you get
> > if you add 0 to y in the last statement in the while loop:
> >
> >> i = 1; y = 1:10; q = while(T) { y[i] = 42; if (i == 8) {
> break }; i =
> > i + 1; y + 0}; q
> > [1] 42 42 42 42 42 42 42 8 9 10
> >
> > Also,
> >
> >> i = 1; y = 1:10; q = while(T) { y[i] = 42; if (i == 8) { break };
> > i<-i+1 ; if (i<=8&&i>3)next ; cat("Completing iteration",
> i, "\n"); y};
> > q
> > Completing iteration 2
> > Completing iteration 3
> > [1] 42 42 42 42 42 42 42 42 9 10
> >
> > but if the last statement in the while loop is y+0 instead
> of y I get
> > the
> > expected result:
> >
> >> i = 1; y = 1:10; q = while(T) { y[i] = 42; if (i == 8) { break };
> > i<-i+1 ; if (i<=8&&i>3)next ; cat("Completing iteration", i, "\n");
> > y+0L}; q
> > Completing iteration 2
> > Completing iteration 3
> > [1] 42 42 3 4 5 6 7 8 9 10
> >
> > A background to the problem is that in R a while-loop
> returns the value
> > of the last iteration. However there is an exception if an
> iteration is
> > terminated by a break or a next. Then the value is the value of the
> > previously completed iteration that did not execute a break or next.
> > Thus in an extreme case the value of the while may be the
> value of the
> > very first iteration even though it executed a million iterations.
> >
> > Thus to implement that correctly one needs to keep a
> reference to the
> > value of the last non-terminated iteration. It seems as if
> the current R
> > implementation does that but does not increase the reference counter
> > which explains the odd behavior.
> >
> > The for loop example is
> >
> >> z<-{ tmp<-rep(pi,10);for(i in 1:10){ tmp[i]<-i^2;if(i==9)break ; if
> > (i<9&&i>3)next ; tmp } }
> >> z
> > [1] 1.000000 4.000000 9.000000 16.000000 25.000000 36.000000
> > 49.000000
> > [8] 64.000000 81.000000 3.141593
> >> z<-{ tmp<-rep(pi,10);for(i in 1:10){ tmp[i]<-i^2;if(i==9)break ; if
> > (i<9&&i>3)next ; tmp+0 } }
> >> z
> > [1] 1.000000 4.000000 9.000000 3.141593 3.141593 3.141593 3.141593
> > 3.141593
> > [9] 3.141593 3.141593
> >
> > I can think of a couple of ways to solve this.
> >
> > 1. Increment the reference counter. This solves the
> bug but may
> > have serious performance implications. In the while example above it
> > needs to copy y in every iteration.
> >
> > 2. Change the semantics of while loops by getting rid of the
> > exception described above. When a loop is terminated with a
> break the
> > value of the loop would be NULL. Thus there is no need to keep a
> > reference to the value of the last non-terminated iteration.
> >
> > Any opinions?
> >
> > ______________________________________________
> > R-devel_at_r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
> >
>
> --
> Luke Tierney
> Chair, Statistics and Actuarial Science
> Ralph E. Wareham Professor of Mathematical Sciences
> University of Iowa Phone: 319-335-3386
> Department of Statistics and Fax: 319-335-3017
> Actuarial Science
> 241 Schaeffer Hall email: luke_at_stat.uiowa.edu
> Iowa City, IA 52242 WWW: http://www.stat.uiowa.edu
>



R-devel_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel Received on Wed 10 Jun 2009 - 21:40:53 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Thu 11 Jun 2009 - 12:35:46 GMT.

Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-devel. Please read the posting guide before posting to the list.

list of date sections of archive