Re: [R] importing a VERY LARGE database from Microsoft SQL into R

From: roger bos <roger.bos_at_gmail.com>
Date: Wed 25 Jan 2006 - 23:59:15 EST

That's good news that the 3GB switch is the default in R2.2.1, I will upgrade from R2.2.0 today. Its wasn't hard to modify the header, but there was at least one person I used to email the modified file to because he did not have access to the editbin software, so now we don't have to do that anymore.

Thanks for mentioning the windows setting, I forgot to mention that you also have to modify the Boot.ini file as described here: http://www.microsoft.com/whdc/system/platform/server/PAE/PAEmem.mspx

Does anyone have code that keeps generating random data until the memory is full and then tells you how much memory was successfully used? I could try writing it, but if someone has already done it, thats all the better!

Thanks,

Roger

On 1/25/06, Prof Brian Ripley <ripley@stats.ox.ac.uk> wrote:
>
> On Tue, 24 Jan 2006, roger bos wrote:
>
> > This question comes up a number of times what most people will tell you
> is
> > that even if you get all you data into R you won't be able to do
> anything
> > with it. By that I mean, you need about 3 or 4 times as much memory as
> the
> > size of your data object because R will need to create copies of it.
> >
> > I can tell you what I do in case it helps. I also have a SQL Server
> > database and the good thing about having the data in that format is that
> you
> > probably don't need all of the data all of the time. First of all, a
> windows
> > machine can handle up to 4GB of RAM, but most software cannot use all of
>
> Or even up to 32Gb of RAM. But a single process is limited to 3Gb of user
> address space, and to 2Gb unless you tell the OS to allow more.
>
> > it by default. I have 4GB and I also use the windows binary, so that
> means
> > that whenever I download a new version of R, I have to modify the header
> > file to the it LARGEADDRESSAWARE. Using this trick, I can load up my
> big
> > matrix into R to the point where task manager shows that R is using
> about
> > 1.7GB of memory. Despite such large objects, I am able to do
> regressions
> > and other things with the data, so I am quite happy. If you need more
> > details just let me know.
>
> This is the default in the current R 2.2.1, but you also need to set the
> /3GB flag in Windows. I found the limit was about 1.7Gb without doing
> that, more like 2.5Gb when I did.
>
> The details are in the rw-FAQ Q2.9 (and vary by R version, so please
> consult that in the version of R you use).
>
> >
> >
> >
> >
> >
> >
> >
> > On 1/24/06, r user <ruser2006@yahoo.com> wrote:
> >>
> >> I am using R 2.1.1 in a Windows Xp environment.
> >>
> >> I need to import a large database from Microsoft SQL
> >> into R.
> >>
> >> I am currently using the "sqlQuery" function/command.
> >>
> >> This works, but I sometimes run out of memory if my
> >> database is too big, or it take quite a long time for
> >> the data to import into R.
> >>
> >> Is there a better way to bring a large SQL database
> >> into R?
> >>
> >> IS there an efficient way to convert the data into R
> >> format prior to bringing it into R? (E.g. directly
> >> from Microsoft SQL?)
> >>
> >> ______________________________________________
> >> R-help@stat.math.ethz.ch mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide!
> >> http://www.R-project.org/posting-guide.html
> >>
> >
> > [[alternative HTML version deleted]]
> >
> > ______________________________________________
> > R-help@stat.math.ethz.ch mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide!
> http://www.R-project.org/posting-guide.html
> >
>
> --
> Brian D. Ripley, ripley@stats.ox.ac.uk
> Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
> University of Oxford, Tel: +44 1865 272861 (self)
> 1 South Parks Road, +44 1865 272866 (PA)
> Oxford OX1 3TG, UK Fax: +44 1865 272595
>

        [[alternative HTML version deleted]]



R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html Received on Thu Jan 26 00:13:23 2006

This archive was generated by hypermail 2.1.8 : Fri 03 Mar 2006 - 03:42:10 EST