Re: [R] renaming objects

From: Prof Brian Ripley <ripley_at_stats.ox.ac.uk>
Date: Tue, 04 Mar 2008 07:06:01 +0000 (GMT)

On Mon, 3 Mar 2008, Nordlund, Dan (DSHS/RDA) wrote:

[..., quoting Hadley Wickham]

>>> > gc()
>>>          used (Mb) gc trigger (Mb) max used (Mb)
>>> Ncells 133095  3.6     350000  9.4   350000  9.4
>>> Vcells  87049  0.7     786432  6.0   478831  3.7
>>> > a <- runif(1e7)
>>> > gc()
>>>            used (Mb) gc trigger (Mb) max used (Mb)
>>> Ncells   133112  3.6     350000  9.4   350000  9.4
>>> Vcells 10087364 77.0   11458389 87.5 10087374 77.0
>>> > b <- a
>>> > gc()
>>>            used (Mb) gc trigger (Mb) max used (Mb)
>>> Ncells   133117  3.6     350000  9.4   350000  9.4
>>> Vcells 10087365 77.0   12111308 92.5 10087476 77.0
>>>
>>> R will only create a copy if either of a or b is modified.

> But, the OP should know that in the above scenario, if a or b is changed 
> the copy will be created, doubling the storage requirements.  Of course, 
> this can be prevented by removing vector a after the assignment.

Hadley was correct: it is not prevented by removing 'a', as R does not have reference counting. E.g.

rm(a)
b[1] <- 1
gc()

            used (Mb) gc trigger  (Mb) max used  (Mb)
Ncells   133947  3.6     350000   9.4   350000   9.4
Vcells 10087573 77.0 21337085 162.8 20087562 153.3

Note the 'max used' Vcells.
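
For anyone wanting to see the duplication on their own machine, here is a minimal sketch along the same lines. It is not from the original session: the names 'a1', 'before', 'after' and 'grew_by' are illustrative, the vector is smaller (1e6 rather than 1e7), and it assumes the matrix returned by gc() has a "Vcells" row and a "used" column. It keeps 'a' in place, since whether removing 'a' first avoids the copy may depend on the R version.

```r
# Sketch only: 'a1', 'before', 'after' and 'grew_by' are illustrative names.
a  <- runif(1e6)   # 1e6 doubles, i.e. roughly 1e6 Vcells of 8 bytes each
a1 <- a[1]         # remember the first element of 'a'
b  <- a            # binds 'b' to the same value; nothing is copied yet

before <- gc()["Vcells", "used"]
b[1] <- 1          # 'a' still refers to the value, so it has to be duplicated
after  <- gc()["Vcells", "used"]

grew_by <- after - before   # expect roughly 1e6 additional Vcells
print(grew_by)
print(identical(a[1], a1))  # 'a' itself is untouched by the change to 'b'
```

The increase in used Vcells should be about the length of the vector, which is the cost of the copy made when 'b' was modified.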

There's a fairly complete explanation of what happens in the 'R Internals' manual.

I think the most common source of confusion is over the term 'objects'. R does not have 'objects' in this sense: 'a' and 'b' are symbols with bindings to values. So you cannot change 'b', but you can change its binding. When you do b[1] <- 1 you may create a new C-level structure as the new value, or you may change the existing one. In this case it created a new structure (by copying the old one and altering that). Certain replacement functions are the only way to avoid making a new value: a <- a+0 for example always creates a new value (at a different address in memory) even though its contents will be identical.
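
A small sketch of the 'a <- a+0' point, under stated assumptions: 'old', 'same_contents' and 'different_address' are illustrative names, and tracemem() is used only as a convenient way to get at an address string. It is available only when R was built with memory profiling, hence the capabilities("profmem") guard.

```r
# Sketch only: 'old', 'same_contents', 'different_address' are illustrative.
a   <- c(1.5, 2.5, 3.5)
old <- a        # a second symbol bound to the same value
a   <- a + 0    # arithmetic builds a new value and rebinds 'a' to it

same_contents <- identical(a, old)   # the contents are the same
print(same_contents)

# tracemem() returns a string identifying the value's address, but only
# exists in builds with memory profiling, so guard its use.
if (capabilities("profmem")) {
  addr_old <- tracemem(old)
  addr_new <- tracemem(a)
  different_address <- !identical(addr_old, addr_new)
  untracemem(old)
  untracemem(a)
  print(different_address)           # the two values live at different addresses
}
```

Keeping 'old' bound to the original value is deliberate: it guarantees both values are alive at once, so the two addresses can be compared directly.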

Once we get away from the simplest vectors more sharing can be done: e.g. character vectors with duplicate elements will share storage for those elements.
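
One rough way to see the sharing for character vectors is to compare object.size() for a vector of repeated strings with one of distinct strings of similar length. This is a sketch, not a measurement from the original thread: 'dup', 'uniq', 'size_dup' and 'size_uniq' are illustrative names, and it relies on object.size() taking sharing among the elements of a single character vector into account, as its help page states.

```r
# Sketch only: 'dup', 'uniq', 'size_dup', 'size_uniq' are illustrative names.
n <- 1e4

# n copies of one string: the elements can all point at the same storage.
dup  <- rep("a fairly long string that is repeated", n)

# n distinct strings of similar length: each element needs its own storage.
uniq <- paste("a fairly long string that is distinct", seq_len(n))

size_dup  <- as.numeric(object.size(dup))
size_uniq <- as.numeric(object.size(uniq))

print(c(duplicated = size_dup, distinct = size_uniq))
print(size_dup < size_uniq)   # the vector of duplicates reports less storage
```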

-- 
Brian D. Ripley,                  ripley_at_stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

______________________________________________
R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Received on Tue 04 Mar 2008 - 07:11:58 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Tue 04 Mar 2008 - 09:30:18 GMT.
