[R] gc() and memory efficiency

From: Doran, Harold <HDoran_at_air.org>
Date: Mon, 4 Feb 2008 20:45:33 -0500

I have a program that reads in a very large data set, performs some analyses, and then repeats this process with another data set. As soon as the first set of analyses is complete, I remove the very large object and clean up to try to make memory available for the second set of analyses. The process looks something like this:

  1. read in data set 1 and perform analyses
     rm(list=ls())
     gc()
  2. read in data set 2 and perform analyses
     rm(list=ls())
     gc()
     ...
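The steps above can be sketched as a loop. This is only an illustration under stated assumptions: `read_data()` and `analyse()` are hypothetical stand-ins for the real reading and analysis code, and the simulated matrices are far smaller than the data sets in question.

```r
# Hypothetical stand-ins for the real code (not from the original post).
read_data <- function(i) matrix(rnorm(1000 * i * 10), ncol = 10)
analyse   <- function(x) colMeans(x)

n_sets  <- 2
results <- vector("list", n_sets)

for (i in seq_len(n_sets)) {
  big <- read_data(i)            # read in data set i
  results[[i]] <- analyse(big)   # perform analyses, keep only the small result
  rm(big)                        # remove the very large object
  invisible(gc())                # run garbage collection
}
```

Unlike rm(list=ls()), removing only `big` keeps the loop counter and `results` alive between iterations.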

However, the memory consumed in step 1 does not appear to be released back to the OS: when the process repeats in step 2, R complains that it cannot allocate a vector of size X.

So, I close and reopen R and then drop in the code to run the second analysis. When this is done, I close and reopen R and run the third analysis.

This is terribly inefficient. I would rather just source in the R code and let the analyses run overnight.
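One way the manual close-and-reopen routine might be automated is to launch each analysis in its own fresh R process from a small driver script. The sketch below is a possibility, not a tested fix for the allocation error. It assumes a current R in which `system2()` and `Rscript` are available (`system2()` does not exist in R 2.6.1), and `run_in_fresh_r()` and the throwaway demo script are hypothetical names used only for illustration.

```r
# Hypothetical helper: run one script in a new R process and return its exit status.
run_in_fresh_r <- function(script) {
  rscript <- file.path(R.home("bin"), "Rscript")  # Rscript of the running R installation
  system2(rscript, shQuote(script))               # 0 indicates success
}

# Demo with a throwaway script standing in for one analysis.
demo_script <- tempfile(fileext = ".R")
writeLines("x <- rnorm(1e5); cat(length(x), '\\n')", demo_script)
status <- run_in_fresh_r(demo_script)
```

With one script per data set, the helper could be called on each script path in turn, so every analysis starts with a clean process and its memory goes back to the OS when that process exits.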

Is there a way that I can use gc() or some other function more efficiently rather than having to close and reopen R at each iteration?

I'm using Windows XP and R 2.6.1.



R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Received on Tue 05 Feb 2008 - 01:48:06 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Tue 05 Feb 2008 - 07:30:10 GMT.

