Re: [Rd] garbage collection & memory leaks in 'R', it seems...

From: Peter Dalgaard <pdalgd_at_gmail.com>
Date: Sat, 17 Jul 2010 11:11:40 +0200

Mike Williamson wrote:
> Hello developers,
>
> I noticed that if I am running 'R', type "rm(list=objects())" and
> "gc()", 'R' will still be consuming (a lot) more memory than when I then
> close 'R' and re-open it. In my ignorance, I'm presuming this is something
> in 'R' where it doesn't really do a great job of garbage collection... at
> least not nearly as well as Windows or unix can do garbage collection.
> Am I right? If so, is there any better way to "clean up" the memory
> that 'R' is using? I have a script that runs a fairly large job, and I
> cannot keep it going on its own in a convenient way because of these
> remnants of garbage that pile up and eventually leave so little memory
> remaining that the script crashes.

In a word, no, R is not particularly bad at GC. The internal gc() does a rather good job of finding unused objects, as you can see from the report it returns. Whether that memory is then returned to the OS is a matter of the C-level services (malloc/free) that R's allocation routines use.
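
To look at that report, one can compare gc() output before and after dropping a large object. This is only a minimal sketch: the object name x and the size of 1e7 doubles are arbitrary choices for illustration, and the exact figures will differ between R versions and platforms.

```r
# gc() returns a matrix with rows "Ncells" and "Vcells"; the column
# named "used" counts the cells currently in use by R objects.
invisible(gc())                       # let the heap settle first
base <- gc()["Vcells", "used"]

x <- numeric(1e7)                     # illustrative large object (about 80 MB)
with_x <- gc()["Vcells", "used"]

rm(x)
after <- gc()["Vcells", "used"]

# Vcells in use rise by roughly 1e7 while x exists, then fall back.
cat("baseline:", base, " with x:", with_x, " after rm + gc:", after, "\n")
```

Note that these numbers describe what R considers in use. They say nothing about what the OS reports for the process, which is the separate question discussed next.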

As far as I recall, free() on Windows simply never returns memory to the OS. In general, whether memory can be returned at all depends on which part of the "heap" you have freed, since it can only be released from the end. (That is, a tiny object sitting at the end of the heap will force the entire range to be kept in memory.)
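
The "free from the end" point can be made concrete with a toy model. This is a sketch of the simplified picture described above, not of how any real malloc works; the 10-slot heap and the helper returnable() are made up for illustration.

```r
# Toy heap: TRUE = slot in use, FALSE = slot freed.
# In this model only free slots at the very end can be handed back.
returnable <- function(h) {
  if (!any(h)) return(length(h))   # nothing in use: everything can go
  length(h) - max(which(h))        # free slots after the last live one
}

object_at_end   <- c(rep(FALSE, 9), TRUE)   # tiny object at the end
object_at_start <- c(TRUE, rep(FALSE, 9))   # same usage, object at the start

returnable(object_at_end)     # 0: the whole range stays in memory
returnable(object_at_start)   # 9
```

Both heaps hold exactly one live slot; only its position decides how much could be given back.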

R itself will reuse freed-up areas of the heap as long as it can find a space that is big enough. However, memory always tends to fragment, so that you eventually end up with a pattern of many small objects with not-quite-big-enough holes between them.
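
A toy illustration of that pattern, under the same caveat that it is not R's actual allocator: the 12-slot heap and the helper largest_hole() are hypothetical and exist only to show the effect.

```r
# Toy heap: TRUE = slot occupied, FALSE = slot free.
# Small live objects are scattered with holes between them.
heap <- c(TRUE, FALSE, FALSE, TRUE, FALSE, FALSE,
          TRUE, FALSE, FALSE, TRUE, FALSE, FALSE)

# Hypothetical helper: length of the longest run of free slots.
largest_hole <- function(h) {
  runs <- rle(h)
  free <- runs$lengths[!runs$values]
  if (length(free) == 0) 0L else max(free)
}

total_free <- sum(!heap)          # 8 slots are free in total
biggest <- largest_hole(heap)     # but no single hole is longer than 2

request <- 3L                     # an object needing 3 contiguous slots
fits <- biggest >= request        # FALSE: plenty free overall, no hole big enough
```

Two thirds of this heap is free, yet a three-slot request cannot be satisfied without growing it.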

These issues affect most languages that do significant amounts of object allocation and destruction. You should not really compare this to OS-level memory management, because that is a different kettle of fish. In particular, user programs like R rely on having all objects mapped to a single linear address space, whereas the OS "just" needs to create a set of per-process virtual address spaces and has hardware help to do so.

-- 
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Phone: (+45)38153501
Email: pd.mes_at_cbs.dk  Priv: PDalgd_at_gmail.com

______________________________________________
R-devel_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Received on Sat 17 Jul 2010 - 09:14:02 GMT

