Re: [R] Reading huge chunks of data from MySQL into Windows R

From: Duncan Murdoch <murdoch_at_stats.uwo.ca>
Date: Mon 06 Jun 2005 - 23:49:22 EST

On 6/6/2005 9:30 AM, Dubravko Dolic wrote:
> Dear List,
>
> I'm trying to use R under Windows on a huge database in MySQL via ODBC
> (technical reasons for this...). Now I want to read tables with some
> 160,000,000 entries into R. I would be glad if anyone out there has
> some good hints on what to consider concerning memory management. I'm
> not sure about the best method for reading such huge files into R. For
> the moment I split the whole table into readable parts and stick them
> together in R again.
>
> Any hints welcome.

Most values in R are stored as 8-byte doubles, so 160,000,000 entries will take roughly 1.3 GB of storage. (Half that if they are integers or factors.) You are likely to run into problems manipulating something that big in Windows, because a user process is normally allowed only 2 GB of address space, and that space can become fragmented.
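A quick back-of-the-envelope check in R, using only the figures above (not measurements on your actual table):

160e6 * 8 / 1e9            # ~1.28 GB if all 160,000,000 entries are doubles
object.size(numeric(1e6))  # one million doubles: about 8 MB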

I'd suggest developing algorithms that work on the data a block at a time, so that you never need to hold the whole table in R at once. Alternatively, switch to a 64-bit platform and install lots of memory -- but there are still various 4 GB limits in R, so you may still run into trouble.
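For instance, here is a minimal sketch of the block-at-a-time idea using RODBC. The DSN name ("mysql_dsn"), table name ("big_table"), and column name ("x") are placeholders you would replace with your own; the loop keeps only a running sum and count, so a mean is computed without the full table ever being in memory:

library(RODBC)

ch <- odbcConnect("mysql_dsn")     # DSN set up in the Windows ODBC manager

block.size <- 1000000L             # rows fetched per query
offset     <- 0L
total      <- 0                    # running sum of x
n          <- 0                    # running count of non-missing x

repeat {
    qry   <- sprintf("SELECT x FROM big_table LIMIT %d OFFSET %d",
                     block.size, offset)
    block <- sqlQuery(ch, qry)
    # sqlQuery() returns a character vector on error, a data frame otherwise
    if (!is.data.frame(block) || nrow(block) == 0L) break
    total  <- total + sum(block$x, na.rm = TRUE)
    n      <- n + sum(!is.na(block$x))
    offset <- offset + nrow(block)
}

odbcClose(ch)
cat("mean of x =", total / n, "\n")

One caveat: MySQL has to scan past all the skipped rows for each OFFSET, so on a table this size it is usually faster to page on an indexed key (WHERE id > last_id ORDER BY id LIMIT ...) instead.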

Duncan Murdoch



R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Received on Tue Jun 07 00:27:04 2005
