Re: [R] Memory capacity question for a specific situation

From: Uwe Ligges <ligges_at_statistik.tu-dortmund.de>
Date: Fri, 20 May 2011 16:03:12 +0200

On 20.05.2011 15:33, Dimitri Liakhovitski wrote:
> Hello!
>
> I am trying to figure out if my latest R for 64 bits on a 64-bit
> Windows 7 PC, RAM = 6 GB could read in a dataset with:
>
> ~64 million rows
> ~30 columns about half of which contain integers (between 1 and 3
> digits) and half - numeric data (tens to thousands).
>
> Or is it too much data?
> And even if it could read it in - will there be any memory left to
> conduct, for example, cluster analysis on that data set...

Let us ask R:

 > 64e6 * (15*8 + 15*4)
[1] 1.152e+10

That is 15 numeric columns at 8 bytes per value plus 15 integer columns at 4 bytes per value, so you will need roughly 12 GB just to store the data in memory. To work with the data, you should have at least 3 times that amount of memory available. Hence a 32 GB machine is a minimal requirement unless you can restrict yourself to fewer observations or variables.
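The same estimate can be sketched in a few lines of R and cross-checked against object.size() on a small data frame of the same shape. This is only a rough sketch: the variable names, the 10,000-row sample and the random values are assumptions made up for illustration, not part of the original data.

```r
# Storage per value in R: 8 bytes for numeric (double), 4 bytes for integer.
n_rows    <- 64e6
n_numeric <- 15
n_integer <- 15

bytes_total <- n_rows * (n_numeric * 8 + n_integer * 4)
gb_total    <- bytes_total / 1e9    # decimal gigabytes, 11.52
gib_total   <- bytes_total / 2^30   # binary gibibytes, about 10.7

# Cross-check on a small, made-up data frame with the same column mix.
n_small <- 1e4
cols <- c(
  lapply(seq_len(n_numeric), function(i) runif(n_small)),
  lapply(seq_len(n_integer), function(i) sample.int(999L, n_small, replace = TRUE))
)
names(cols) <- paste0("V", seq_along(cols))
small <- as.data.frame(cols)

# Measured bytes per row; slightly above 180 because of per-object overhead.
bytes_per_row <- as.numeric(object.size(small)) / n_small

# Extrapolate the measured per-row cost to the full number of rows.
gb_extrapolated <- bytes_per_row * n_rows / 1e9

print(c(gb_total = gb_total, gb_extrapolated = gb_extrapolated))
```

The extrapolated figure should land close to the hand calculation. It covers storage only; any working copies made during an analysis come on top of it.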

Uwe Ligges

>
> Thanks a lot!
>



R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Received on Fri 20 May 2011 - 14:13:31 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Fri 20 May 2011 - 14:30:08 GMT.
