R-alpha: frames in R and S

Luke Tierney (luke@stat.umn.edu)
Thu, 30 May 1996 15:50:48 -0500 (CDT)


From: Luke Tierney <luke@stat.umn.edu>
Message-Id: <9605302050.AA04399@nokomis.stat.umn.edu>
Subject: R-alpha: frames in R and S
To: luke@nokomis.stat.umn.edu (Luke Tierney)
Date: Thu, 30 May 1996 15:50:48 -0500 (CDT)
In-Reply-To: <9605301723.AA04092@nokomis.stat.umn.edu> from "Luke Tierney" at May 30, 96 12:23:57 pm

At the end of an earlier long note I promised to rant about the evils
of frames, so here goes :-)

S allows explicit access to the evaluation frames. In particular,

	1) You can examine frames via sys.frames and friends, and

		a) you can access a variable's value in a specified frame
		with get

		b) you can change a variable's value in a specified frame
		with assign.

	2) You can change the structure of frames:

		a) You can add a new variable to a specified frame with
		assign

		b) you can remove a variable from a frame with remove

Let's look at these in turn.

1) Being able to examine and change variable values is very useful in
debugging.  The fact that you can make these changes from the S
language means that you can write useful debuggers in S itself -- like
the S browser.

It also has some uses outside of debugging, but I would argue that
these uses are more limited, or perhaps it would be better to say more
structured. I believe there are three main uses for frames in programs:

	substitute(x) for an argument x to get the unevaluated form of
	the argument

	eval(expr) to evaluate an expression in the current frame

	eval(expr,sys.parent()) to evaluate an expression in the caller's
	frame

Does anyone else have any other examples?

These three cases can be handled by access in every function to the
current frame (and call) and the parent frame (and call).
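
A minimal sketch of these three uses, written for current R rather than
S. The helper names (show_arg, eval_here, eval_in_caller, caller) are
made up for illustration, and parent.frame() is assumed here as the R
spelling of "the caller's frame":

```r
# Use 1: substitute() recovers the unevaluated form of an argument.
show_arg <- function(x) deparse(substitute(x))

# Use 2: eval(expr) evaluates an expression in the current frame.
eval_here <- function() {
  a <- 10
  eval(quote(a + 1))
}

# Use 3: evaluate an expression in the caller's frame.
# parent.frame() is evaluated in eval_in_caller's frame, so it
# refers to the frame of whoever called eval_in_caller.
eval_in_caller <- function(expr) eval(expr, parent.frame())

caller <- function() {
  a <- 100
  eval_in_caller(quote(a + 1))
}

arg_text     <- show_arg(u + v)  # "u + v"; u and v are never evaluated
here_value   <- eval_here()      # 11
caller_value <- caller()         # 101
```

None of the three needs anything beyond the current frame and the
parent frame.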

It would be useful if at least in code that does not need to be
debugged one could eliminate access to other frames, and perhaps even
restructure these two so that access must go via a more restricted
mechanism, e.g. sys.frame() and sys.parent.frame(), each with no
arguments.

There are two reasons for wanting to limit frame access:

INLINING: The two functions

	f1<-function(x) { g<-function(x) h(x); g(x) }
	f2<-function(x) { h(x) }

would be equivalent semantically *if* the calls to h didn't see
different frame structures. In the presence of possible frame access
one can only inline g (i.e. replace f1 by f2) if h is known not to
access frames.
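
To make the inlining problem concrete, here is a small R sketch with an
h that does look at the frame structure. The body of h is an assumption
chosen for illustration (it just reports the call depth via
sys.nframe()); f1 and f2 are as above:

```r
# A hypothetical h that observes the frame structure: it returns
# the number of function frames on the stack when it is called.
h <- function(x) sys.nframe()

f1 <- function(x) { g <- function(x) h(x); g(x) }
f2 <- function(x) { h(x) }

depth_f1 <- f1(0)  # h runs below f1 and g
depth_f2 <- f2(0)  # h runs below f2 only

# The two "equivalent" functions differ by exactly one frame.
extra_frames <- depth_f1 - depth_f2  # 1
```

Only the difference is meaningful here; the absolute depths depend on
how the code is run.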

TAIL RECURSION: When the last thing f does is call g, as in

	f<-function(x) {....; g(x) }

if frames do not need to be maintained then one does not need to allocate
a new frame for g -- one can reuse the one for f since that one won't
be needed any more. This is an optimization that can make recursive
functions much more efficient.
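
As a rough illustration of what is being paid for, this R sketch
measures how many frames are live at the bottom of a tail-recursive
function. The function name depth_at_bottom is hypothetical, and the
sketch assumes an evaluator that keeps one frame per call, as R does as
far as I know:

```r
# The recursive call is the last thing the function does, so in
# principle its frame could be reused. With frames maintained, each
# level of recursion adds one more frame instead.
depth_at_bottom <- function(n) {
  if (n == 0) sys.nframe() else depth_at_bottom(n - 1)
}

shallow <- depth_at_bottom(0)
deep    <- depth_at_bottom(50)

# 50 extra calls leave 50 extra frames on the stack.
frames_added <- deep - shallow  # 50
```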

Neither of these issues is a show stopper, but some restrictions on
frame access in production code could help to produce somewhat faster
code.

On the other hand, if frames must be available, they could be rather
more powerful than they are: conceptually at least they could be used
to implement first-class continuations. That is a whole can of worms
in its own right, but could be useful in a multi-threaded environment.
Enough of that.

2) This one is a much bigger issue. Look at the functions

	f<-function() { h(); y}
	g<-function(y) { h(); y}

Is y a local or a global variable in these functions? It depends on
what h does. For almost any h, y is global in f and local in g. But for

	h<-function() assign("y", 1, frame=sys.parent())
	[h<-function() assign("y", 1, sys.frame(sys.parent())) in R]

y is local in f, and for 

	h<-function() remove("y", frame = sys.parent())
	[h<-function() rm("y", envir=sys.frame(sys.parent())) in R]

y is global in g. This is a major problem for any kind of code
transformation.
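
The following R sketch runs through the three versions of h. It uses
parent.frame() in place of sys.frame(sys.parent()), which I am assuming
is equivalent for this purpose; the global value "global" and the
argument value "arg" are arbitrary:

```r
y <- "global"

f <- function() { h(); y }
g <- function(y) { h(); y }

# An ordinary h: y is global in f and local in g.
h <- function() NULL
f_plain <- f()       # "global"
g_plain <- g("arg")  # "arg"

# h adds y to its caller's frame: y is now local in f.
h <- function() assign("y", 1, envir = parent.frame())
f_assigned <- f()    # 1; the global y is untouched

# h removes y from its caller's frame: y is now global in g.
h <- function() rm("y", envir = parent.frame())
g_removed <- g("arg")  # "global"
```

Nothing in the text of f or g changed between these calls; only h did.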

Does anyone know of good reasons for allowing the addition or removal
of objects from evaluation frames? Obviously you need it for the
global frame, but is it really useful or even desirable for evaluation
frames?

I know of one example where, roughly,

	f<-function(g) g()

and there was a need to pass extra information to g, which was done by
squirreling things on some frame or other that g knew about. But in R
this is best done with a lexical closure, and even in S one can do it
in a more disciplined fashion by implementing a system for dynamically
scoped variables. (The example I am recalling vaguely used a lot of
rather impenetrable frame manipulation essentially to create a shallow
binding implementation of dynamic scoping.) That is the only case I
can recall, and in that case, as I recall it, one could have done a
better job.
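
For what it is worth, the lexical closure version in R can be sketched
as follows. The names make_g and extra, and the string values, are
hypothetical and only there to make the example runnable:

```r
f <- function(g) g()

# make_g is a hypothetical constructor: the extra information lives
# in the environment of the returned function, not in some frame
# that g has to go looking for.
make_g <- function(extra) {
  function() paste("got", extra)
}

result <- f(make_g("payload"))  # "got payload"
```

f does not need to know that g carries anything with it, and no frame
is modified.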

The upshot here (unless anyone can come up with a really convincing
counterargument, and maybe even then :-)) is that I would like to see
the ability to add variables to evaluation frames by any mechanism
other than an explicit use of "<-" removed from the language. The
ability to remove a variable from an evaluation frame should be
eliminated entirely. (At the very least it should be accepted that
this is something that might work in "debug mode" but not otherwise.)
To restate the reason why: If the structure of the frames is
determined by the static definition of the function, not by run time
considerations, then

	function(y) { ...; y ;...}
	function(x,y) { ...; x + y; ...}

can be converted to

	function(y) { ...; <access frame element 1>; ...}
	function(x, y) { ...; <primitive-add x y>; ...}

which, when combined with similar optimizations, could produce
considerably faster code. Without the ability to determine at compile
time what variables mean (and functions and operators like + are
variables too) none of this is guaranteed to produce equivalent code,
and basically no optimizations are possible.

I've said my piece -- flame away :-)

luke