Re: [Rd] Return function from function with minimal environment

From: Henrik Bengtsson <hb_at_maths.lth.se>
Date: Tue 04 Apr 2006 - 15:34:35 GMT

On 4/4/06, Prof Brian Ripley <ripley@stats.ox.ac.uk> wrote:
> On Tue, 4 Apr 2006, Roger D. Peng wrote:
>
> > In R 2.3.0-to-be, I think you can do
> >
> > foo <- function(huge) {
> > scale <- mean(huge)
> > g <- function(x) { scale * x }
> > environment(g) <- emptyenv()
> > g
> > }
>
> You can, but you really don't want to and you will get the same error. A
> 'minimal environment' is baseenv(), since you are not going to be able to
> do very much without the primitives such as * (and even "{"), and
> (relevant here) you will never save anything from it so it will cost you
> nothing. In this example, 'scale' is supposed to be picked up from the
> environment. So (removing all the empty statements to save memory)
>
> foo <- function(huge) {
> scale <- mean(huge)
> env <- new.env(parent=baseenv())
> assign("scale", scale, envir=env)
> bar <- function(x) { scale * x }
> environment(bar) <- env
> bar
> }
>
> is I think minimal baggage (and fcn saves in 153 bytes).

Thanks. For the record (mine and others'): the base environment is special in that it is always guaranteed to exist, and save() knows this too, since it does not warn that "package ... may not be available when loading".
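
To make the point about the primitives concrete, here is a small sketch of my own. The helper name make_scaler() and its argument parent_env are made up for illustration and are not from the thread. It builds the same closure once on top of emptyenv() and once on top of baseenv(). With emptyenv() the call fails because neither "{" nor "*" can be found:

```r
# Hypothetical helper (not from the thread): returns function(x) scale * x
# whose environment holds only 'scale' and has the given parent.
make_scaler <- function(huge, parent_env) {
  env <- new.env(parent = parent_env)
  assign("scale", mean(huge), envir = env)
  f <- function(x) { scale * x }
  environment(f) <- env
  f
}

f_empty <- make_scaler(1:10, emptyenv())
f_base  <- make_scaler(1:10, baseenv())

# Parent is emptyenv(): "{" and "*" are not visible, so the call errors.
res <- tryCatch(f_empty(2), error = function(e) "error")
print(res)

# Parent is baseenv(): the primitives are found and the call works.
print(f_base(2))  # mean(1:10) is 5.5, so this gives 11
```

In both cases the function's own environment holds nothing but 'scale'. Only the parent differs, and that alone decides whether the body can be evaluated.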

In the real example I'm trying to do, my returned function calls a function in the stats package. The naive approach is then to do:

foo <- function(huge) {
  mu <- mean(huge)
  parent <- pos.to.env(which("package:stats" == search()))
  env <- new.env(parent=parent)
  assign("mu", mu, envir=env)
  bar <- function(n) { rnorm(n, mean=mu) }
  environment(bar) <- env
  bar
}

fcn <- foo(1:10)
print(fcn(5))
env <- environment(fcn)
save(env, file="temp.RData")

However, then you get "Warning message: 'package:stats' may not be available when loading". To the best of my understanding right now, it is better to use "::" as below:

foo <- function(huge) {
  mu <- mean(huge)
  env <- new.env(parent=baseenv())
  assign("mu", mu, envir=env)
  bar <- function(n) { stats::rnorm(n, mean=mu) }
  environment(bar) <- env
  bar
}
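
A quick check of the "::" version, as a sketch only. The tempfile() path and the variables tmp and x are my additions, and the size is printed rather than asserted because it depends on the R version. As far as I can tell, the environment holds just 'mu', its parent is the base environment, and it should save without the warning above:

```r
foo <- function(huge) {
  mu <- mean(huge)
  env <- new.env(parent = baseenv())
  assign("mu", mu, envir = env)
  # "::" lives in base, so stats::rnorm resolves without stats on the search path
  bar <- function(n) { stats::rnorm(n, mean = mu) }
  environment(bar) <- env
  bar
}

fcn <- foo(1:10)
env <- environment(fcn)

print(ls(envir = env))                        # the only binding is "mu"
print(identical(parent.env(env), baseenv()))  # parent is the base environment

x <- fcn(5)
print(length(x))                              # five random draws

# Save to a temporary file and report its size in bytes.
tmp <- tempfile(fileext = ".RData")
save(env, file = tmp)
print(file.info(tmp)$size)
unlink(tmp)
```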

/Henrik

> > -roger
> >
> > Henrik Bengtsson wrote:
> >> Hi,
> >>
> >> this relates to the question "How to set a former environment?" asked
> >> yesterday. What is the best way to return a function with a
> >> minimal environment from a function? Here is a dummy example:
> >>
> >> foo <- function(huge) {
> >> scale <- mean(huge)
> >> function(x) { scale * x }
> >> }
> >>
> >> fcn <- foo(1:10e5)
> >>
> >> The problem with this approach is that the environment of 'fcn' does
> >> not only hold 'scale' but also the memory consuming object 'huge',
> >> i.e.
> >>
> >> env <- environment(fcn)
> >> ll(envir=env) # ll() from R.oo
> >> #   member data.class dimension object.size
> >> # 1   huge    numeric   1000000     4000028
> >> # 2  scale    numeric         1          36
> >>
> >> save(env, file="temp.RData")
> >> file.info("temp.RData")$size
> >> # [1] 2007624
> >>
> >> I generate quite a few of these and my 'huge' objects are of order
> >> 100Mb, and I want to keep memory usage as well as file sizes to a
> >> minimum. What I do now is to remove variables from the local
> >> environment of 'foo' before returning, i.e.
> >>
> >> foo2 <- function(huge) {
> >> scale <- mean(huge)
> >> rm(huge)
> >> function(x) { scale * x }
> >> }
> >>
> >> fcn <- foo2(1:10e5)
> >> env <- environment(fcn)
> >> ll(envir=env)
> >> #   member data.class dimension object.size
> >> # 1  scale    numeric         1          36
> >>
> >> save(env, file="temp.RData")
> >> file.info("temp.RData")$size
> >> # [1] 156
> >>
> >> Since my "foo" functions are complicated and contain many local
> >> variables, it becomes tedious to identify and remove all of them, so
> >> instead I try:
> >>
> >> foo3 <- function(huge) {
> >> scale <- mean(huge);
> >> env <- new.env();
> >> assign("scale", scale, envir=env);
> >> bar <- function(x) { scale * x };
> >> environment(bar) <- env;
> >> bar;
> >> }
> >>
> >> fcn <- foo3(1:10e5)
> >>
> >> But,
> >>
> >> env <- environment(fcn)
> >> save(env, file="temp.RData");
> >> file.info("temp.RData")$size
> >> # [1] 2007720
> >>
> >> When I try to set the parent environment of 'env' to emptyenv(), it
> >> does not work, e.g.
> >>
> >> fcn(2)
> >> # Error in fcn(2) : attempt to apply non-function
> >>
> >> but with the new.env(parent=baseenv()) it works fine. The "base"
> >> environment has the empty environment as a parent. So, I try to do
> >> the same myself, i.e. new.env(parent=new.env(parent=emptyenv())), but
> >> once again I get
> >>
> >> fcn(2)
> >> # Error in fcn(2) : attempt to apply non-function
> >>
> >> Apparently, I do not understand enough here. Please, enlighten me. In
> >> the meantime I stick with foo2().
> >>
> >> Best,
> >>
> >> Henrik
> >>
> >> ______________________________________________
> >> R-devel@r-project.org mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-devel
> >>
> >
> >
>
> --
> Brian D. Ripley, ripley@stats.ox.ac.uk
> Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
> University of Oxford, Tel: +44 1865 272861 (self)
> 1 South Parks Road, +44 1865 272866 (PA)
> Oxford OX1 3TG, UK Fax: +44 1865 272595
>
>



Received on Wed Apr 05 01:59:19 2006