Re: [R] Dynamic Dictionary Data Type?

From: Duncan Murdoch <murdoch_at_stats.uwo.ca>
Date: Fri 03 Jun 2005 - 04:40:38 EST

On 6/2/2005 2:14 PM, Kjetil Brinchmann Halvorsen wrote:

> Prof Brian Ripley wrote:
> 

>> On Thu, 2 Jun 2005, hadley wickham wrote:
>>
>>>> An environment is a hash table, and copying occurs only on
>>>> modification,
>>>> when any language would have to copy in this context.
>>>
>>
>> Caveat: the default is new.env(hash=FALSE), so an environment is a
>> hash table in one sense but not necessarily so in the strict sense.
>>
>>> Yes, I'm aware that copying only occurs on modification. However, it
>>> was my understanding that
>>>
>>> a <- list(a =1)
>>> a$b <- 4
>>>
>>> would create a new copy of a,
>>
>>
>> Thomas was talking about environments, not lists (and $ works
>> differently for them).
>>
>>> whereas in Java say
>>>
>>> HashMap a = new HashMap();
>>> a.put("a", 1);
>>> a.put("b", 2);
>>>
>>> wouldn't create a copy of a. (please bear in mind my java syntax is
>>> very rusty!)
>>>
>>> Caching data implies updating at each step, thus possibly creating n
>>> copies of the list. Is that wrong?
>>
>>
>> It depends what you mean by `copy'. If you expand a hash table you at
>> some point need to re-arrange the hashing of the entries. That's what
>> will happen with an R environment or an R list too. The possible
>> difference is that you might expect the table to have some room for
>> expansion, and in your list example you did not give any.
>>
>> R's operations on lists make more copies than are strictly necessary,
>> but it is not clear that this is more costly than the housekeeping
>> that would otherwise be necessary. In a$b <- 4, the wrapper VECSXP is
>> recreated and the pointers copied across, but the list elements are
>> not copied. For
>> a$a <- 4 it is probable that no copying is done (although if a2 <- a
>> had been done previously, the pending recursive copy would then be done).
>> (It is also possible that I have overlooked something in the rather
>> complex code used to do these subassignments.)
>>
> The original poster asked for caching of results. here is an example 
> using new.env():
> 
> memo <- function(fun) {
>    mem <- new.env() 
>    function(x) {
>        if (exists(as.character(x), envir=mem)) get(as.character(x), 
> envir=mem, inherits=FALSE) else {
>        val <- fun(x)
>        assign(as.character(x), value=val, envir=mem)
>        val }
>    }
> }
> 
>  > fib <- memo(function(n) if(n<=1) 1 else fib(n-1)+fib(n-2))
>  > system.time( fib(300) )
> [1] 0.01 0.00 0.02   NA   NA
> 
> ls(get("mem", env=environment(fib)))
>    *output supressed*
> 
> To compare:
> system.time( {fib2 <- function(n)if(n<=1)1 else 
> fib2(n-1)+fib2(n-2);fib2(30)})
> [1]  8.07  0.08 12.75    NA    NA
> 
> 
> (there is (at least) one problem with this solution: if the global 
> workspace contains a
>  variable `6`, it gives error. Why?)

Your environment mem has parent being the memo function environment, which has parent the global environment, so your `6` is being found there.

You can avoid the search by having "inherits=FALSE" in your call to exists(). Then it won't be necessary in your call to get().

In R up to 2.1.x, you can't avoid mem having a parent environment; hopefully in 2.2.x you can tell R to create mem with an empty parent.

Duncan Murdoch



R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html Received on Fri Jun 03 04:52:45 2005

This archive was generated by hypermail 2.1.8 : Fri 03 Mar 2006 - 03:32:22 EST