Re: [R] large data set, error: cannot allocate vector

From: Robert Citek <rwcitek_at_alum.calberkeley.org>
Date: Sat 06 May 2006 - 08:15:13 EST

On May 5, 2006, at 11:30 AM, Thomas Lumley wrote:
> In addition to Uwe's message it is worth pointing out that gc() reports
> the maximum memory that your program has used (the rightmost two columns).
> You will probably see that this is large.

Reloading the 10 MM dataset:

R > foo <- read.delim("dataset.010MM.txt")

R > object.size(foo)
[1] 440000376

R > gc()

           used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells 10183941 272.0   15023450 401.2 10194267 272.3
Vcells 20073146 153.2   53554505 408.6 50086180 382.2
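
As a quick check on the arithmetic, the Mb columns of the gc() output above can be summed per column and set next to the object.size() figure. This is only a sketch: the numbers are copied by hand from that output, and the names gc_mb, trigger, max_used and foo_mb are made up for the example.

```r
# Mb figures copied by hand from the gc() output above
# (gc_mb and its column names are illustrative, not anything gc() returns).
gc_mb <- rbind(
  Ncells = c(used = 272.0, trigger = 401.2, max_used = 272.3),
  Vcells = c(used = 153.2, trigger = 408.6, max_used = 382.2)
)

# Ncells + Vcells for each column, in Mb
totals <- colSums(gc_mb)
print(totals)

# object.size(foo) reported 440000376 bytes; convert to Mb (2^20 bytes)
foo_mb <- 440000376 / 2^20
print(round(foo_mb, 1))
```

The "used" total (about 425 Mb) is close to the size of foo itself (about 420 Mb), while the "max used" total (about 655 Mb) reflects the peak reached while the file was being read.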

Combined, Ncells and Vcells appear to take up about 700 MB of RAM, which is about 25% of the 3 GB available under Linux on a 32-bit architecture. Also, removing foo freed up the "used" memory, but did not change the "max used":

R > rm(foo)

R > gc()

         used (Mb) gc trigger  (Mb) max used  (Mb)
Ncells 186694  5.0   12018759 321.0 10194457 272.3
Vcells  74095  0.6   44173915 337.1 50085563 382.2
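
The "max used" columns are a high-water mark, so they are expected to stay put after rm(). A minimal sketch of resetting them, assuming your version of R has the reset argument to gc(); the vector x and the names before, after, peak_before and peak_after are invented for the illustration, and the last column of the returned matrix is taken to be the "max used" Mb figure:

```r
# Allocate a throwaway vector of about 38 Mb, then let gc() record the peak.
x <- numeric(5e6)
invisible(gc())

# Drop the vector; "used" falls but "max used" keeps the old peak.
rm(x)
before <- gc()

# Ask gc() to reset the max-used statistics to the current usage.
after <- gc(reset = TRUE)

# The last column holds the "max used" figure in Mb.
peak_before <- before["Vcells", ncol(before)]
peak_after  <- after["Vcells", ncol(after)]
print(c(before = peak_before, after = peak_after))
```

After the reset, "max used" only tracks the peak from that point on, which makes it easier to see how much memory a single step such as read.delim() needs.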

Regards,
- Robert
http://www.cwelug.org/downloads
Help others get OpenSource software. Distribute FLOSS for Windows, Linux, *BSD, and MacOS X with BitTorrent



R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Received on Sat May 06 08:25:34 2006

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.1.8, at Tue 09 May 2006 - 02:09:59 EST.
