Re: [R] Memory Management under Linux: Problems to allocate large amounts of data

From: Prof Brian Ripley <ripley@stats.ox.ac.uk>
Date: Wed 29 Jun 2005 - 23:18:05 EST

Let's assume this is a 32-bit Xeon and a 32-bit OS (there are 64-bit-capable Xeons). Then a user process like R gets a 4GB address space, 1GB of which is reserved for the kernel. So R has a 3GB address space, and it is trying to allocate a 2GB contiguous chunk. Because of memory fragmentation that is quite unlikely to succeed.
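For scale, a quick sketch of the arithmetic (both figures are taken
from the message quoted below; a numeric vector in R is stored as one
contiguous block of 8-byte doubles):

    2048000 * 1024 / 2^30    # ~1.95 GB requested as a single block
    158902553 * 8 / 2^30     # ~1.18 GB just for the raw doubles

A request of that size has to find one unbroken gap in the 3GB space,
which the loaded code, heaps and shared libraries have already
fragmented.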

We run 64-bit OSes on all our machines with 2GB or more RAM, for this reason.

On Wed, 29 Jun 2005, Dubravko Dolic wrote:

> Dear Group
>
> I'm still trying to load a large amount of data into R (see older
> postings). After solving some troubles with the database I do most of
> the work in MySQL. But it would still be nice to work on some of the
> data using R. For this I can use a dedicated server with Gentoo Linux
> as OS, hosting only R. This server is a nice machine with two CPUs and
> 4GB RAM, which should do the job:
>
> Dual Intel XEON 3.06 GHz
> 4 x 1 GB RAM PC2100 CL2
> HP Proliant DL380-G3
>
> I read the R online help on memory issues and the article on garbage
> collection from R News 1/2001 (Luke Tierney). The FAQ and some
> newsgroup postings were also very helpful for understanding memory
> issues in R.
>
> Now I am trying to read data from a database. The data I want to read
> consist of 158902553 rows and one field (column) of type bigint(20) in
> the database. I received the message that R could not allocate a
> vector of 2048000 Kb (almost 2GB). As I have 4GB of RAM I could not
> imagine why this happened. In my understanding R under Linux (32-bit)
> should be able to use the full RAM. As not much space is used by the
> OS and R itself ("free" shows approx. 670 MB in use after dbSendQuery
> and fetch), there should be 3GB left for R. Is that correct?

Not really. The R executable code and the Ncells are already in the address space, and this is a virtual memory OS, so the amount of RAM is not relevant (it would still be a 3GB limit with 12GB of RAM).
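
For what it's worth, a one-line check of which build of R is running
(standard R, nothing specific to this post):

    .Machine$sizeof.pointer    # 4 on a 32-bit build, 8 on a 64-bit build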

> After that I started R with nsize/vsize set explicitly:
>
> R --min-vsize=10M --max-vsize=3G --min-nsize=500k --max-nsize=100M
>
>> mem.limits()
>     nsize     vsize
> 104857600        NA
>
> and received the same message.
>
>
> A garbage collection delivered the following information:
>
>> gc()
>            used (Mb) gc trigger   (Mb) limit (Mb)  max used   (Mb)
> Ncells   217234  5.9     500000   13.4       2800     500000   13.4
> Vcells    87472  0.7  157650064 1202.8       3072  196695437 1500.7
>
>
> Now I'm at a loss. Maybe someone could give me a hint on where to read
> further, or what information could take me further.
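
As a cross-check on the gc() output above: on a 32-bit build an Ncell
(cons cell) occupies 28 bytes and a Vcell 8 bytes, so the (Mb) columns
are just the cell counts rescaled. A sketch using the quoted numbers:

    104857600 * 28 / 2^20    # Ncells limit: 2800 Mb (--max-nsize=100M)
    157650064 *  8 / 2^20    # Vcells gc trigger: ~1202.8 Mb
    196695437 *  8 / 2^20    # Vcells max used:   ~1500.7 Mb

The 2800 and 3072 Mb limit figures line up with the --max-nsize=100M
and --max-vsize=3G flags, so those settings did take effect; the
failure is in finding a contiguous gap in the address space, as
explained above.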

-- 
Brian D. Ripley,                  ripley@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html