Re: [R] importing a VERY LARGE database from Microsoft SQL into R

From: Prof Brian Ripley <ripley_at_stats.ox.ac.uk>
Date: Wed 25 Jan 2006 - 18:47:59 EST

On Tue, 24 Jan 2006, roger bos wrote:

> This question comes up a number of times, and what most people will tell you
> is that even if you get all your data into R you won't be able to do anything
> with it. By that I mean you need about 3 or 4 times as much memory as the
> size of your data object, because R will need to create copies of it.
>
> I can tell you what I do in case it helps. I also have a SQL Server
> database, and the good thing about having the data in that format is that you
> probably don't need all of the data all of the time. First of all, a Windows
> machine can handle up to 4GB of RAM, but most software cannot use all of

Or even up to 32Gb of RAM. But a single process is limited to 3Gb of user address space, and to 2Gb unless you tell the OS to allow more.
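The limits Prof Ripley describes can be inspected from within R itself. A minimal sketch, assuming a Windows build of R where the Windows-only `memory.size()`/`memory.limit()` functions are available (the 3000 MB figure is just an illustrative value):

```r
## Windows-only (sketch): report and adjust how much memory R may use
memory.size()              # MB currently in use by this R process
memory.size(max = TRUE)    # maximum MB obtained from the OS so far
memory.limit()             # current upper limit, in MB
memory.limit(size = 3000)  # request a higher limit (MB); the OS and the
                           # address-space flags below still cap what you get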

> it by default. I have 4GB and I also use the windows binary, so that means
> that whenever I download a new version of R, I have to modify the header of
> the executable to set it LARGEADDRESSAWARE. Using this trick, I can load up
> my big matrix into R to the point where task manager shows that R is using
> about 1.7GB of memory. Despite such large objects, I am able to do
> regressions and other things with the data, so I am quite happy. If you need
> more details just let me know.

This is the default in the current R 2.2.1, but you also need to set the /3GB flag in Windows. I found the limit was about 1.7Gb without doing that, more like 2.5Gb when I did.
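For reference, on Windows XP and Server 2003 the /3GB switch is set in the boot.ini file; a sketch of what such an entry might look like (the disk/partition path is illustrative and will differ per machine):

```ini
[operating systems]
multi(0)disk(0)rdisk(0)partition(1)\WINDOWS="Microsoft Windows XP Professional" /fastdetect /3GB
```

A reboot is needed for the switch to take effect, and the exact entry should be copied from the machine's existing boot.ini rather than typed from scratch.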

The details are in the rw-FAQ Q2.9 (and vary by R version, so please consult that in the version of R you use).
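Beyond the address-space settings, the memory pressure in the original question can often be avoided by not pulling the whole table over in one `sqlQuery` call. A minimal sketch using RODBC's incremental-fetch interface (`odbcQuery` plus repeated `sqlGetResults`); the DSN name "myDSN", the table name "mytable", and the chunk size are placeholders:

```r
## Sketch: fetch a large SQL Server table in chunks with RODBC,
## so no single intermediate copy has to fit in memory at once.
library(RODBC)

ch <- odbcConnect("myDSN")               # hypothetical ODBC data source
odbcQuery(ch, "SELECT * FROM mytable")   # submit the query without fetching
repeat {
    chunk <- sqlGetResults(ch, max = 50000)  # fetch up to 50,000 rows
    if (!is.data.frame(chunk) || nrow(chunk) == 0) break
    ## process or summarise each chunk here, keeping only the results
}
close(ch)
```

Alternatively, push the work into SQL Server itself (WHERE clauses, aggregation) so that only the rows and columns actually needed ever reach R.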

>
> On 1/24/06, r user <ruser2006@yahoo.com> wrote:
>>
>> I am using R 2.1.1 in a Windows Xp environment.
>>
>> I need to import a large database from Microsoft SQL
>> into R.
>>
>> I am currently using the "sqlQuery" function/command.
>>
>> This works, but I sometimes run out of memory if my
>> database is too big, or it takes quite a long time for
>> the data to import into R.
>>
>> Is there a better way to bring a large SQL database
>> into R?
>>
>> IS there an efficient way to convert the data into R
>> format prior to bringing it into R? (E.g. directly
>> from Microsoft SQL?)
>>
>

-- 
Brian D. Ripley,                  ripley@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Received on Wed Jan 25 18:56:25 2006

This archive was generated by hypermail 2.1.8 : Fri 03 Mar 2006 - 03:42:10 EST