Re: [R] How to handle large dataframes?

From: roger bos <roger.bos_at_gmail.com>
Date: Wed 15 Feb 2006 - 02:38:03 EST

Soren's suggestion is the right way to go, provided you don't need all the data all the time. Another thing to try: once the data is imported, convert the numeric part of the data frame to a matrix, as calculations on a matrix are much faster than calculations on a data frame.
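
A minimal sketch of the matrix conversion, assuming a made-up data frame `d` in place of the imported Stata data (the column names, sizes, and the row-mean calculation are illustrative only, not taken from the original post). How much time it saves will depend on the operation, the machine, and the R version, so the timings are printed rather than promised:

```r
# Hypothetical stand-in for the imported data: one character column plus
# 200 numeric columns. Names and dimensions are assumptions for illustration.
set.seed(1)
d <- data.frame(id = as.character(1:2000),
                matrix(rnorm(2000 * 200), nrow = 2000),
                stringsAsFactors = FALSE)

# Identify the numeric columns and convert that part to a matrix once.
is_num <- vapply(d, is.numeric, logical(1))
m <- as.matrix(d[, is_num, drop = FALSE])

# Run the same calculation on both representations and time each one.
t_df  <- system.time(r_df  <- rowMeans(d[, is_num, drop = FALSE]))
t_mat <- system.time(r_mat <- rowMeans(m))

# Both approaches must give the same numbers.
stopifnot(isTRUE(all.equal(unname(r_df), unname(r_mat))))

# Elapsed seconds for each representation (values vary by machine).
print(c(data.frame = t_df[["elapsed"]], matrix = t_mat[["elapsed"]]))
```

The non-numeric columns stay in `d`, so row `i` of `m` still corresponds to row `i` of the data frame as long as neither is reordered.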

I'm too lazy to convert my data frames to matrix form; I program during the day and let my computer do most of the calculations overnight.

HTH, Roger

On 2/14/06, Søren Højsgaard <Soren.Hojsgaard@agrsci.dk> wrote:
>
> I think it is well worth the effort to start using a database system such
> as MySql for such purposes.
>
> If you look at
> http://gbi.agrsci.dk/~sorenh/misc/R-SAS-MySql/R-SAS-MySql.html
> then you'll find a short - and rudimentary - description of how to use
> MySql in connection with R and SAS (on Windows).
>
> The time you'll have to spend to get it up and running (about 30 minutes)
> is well spent. I suppose you can take your Stata data and save it as a
> comma-separated file. Such a file is easy to load into a MySql database
> (although I haven't described how). Perhaps Stata can connect directly to MySql?
>
> Best regards
> Søren
>
> ________________________________
>
> From: r-help-bounces@stat.math.ethz.ch on behalf of Christian Bieli
> Sent: Tue 14-02-2006 15:24
> To: R help list
> Subject: [R] How to handle large dataframes?
>
>
>
> Dear all
>
> I imported a Stata .dta file with the read.dta-function from the
> foreign-package. The dataframe's dimensions are
>
> > dim(d.apc)
> [1] 15806 1300
>
> Importing needs up to 15 min and calculations with these data are rather
> slow (although I subset the data before starting analyses).
>
> My questions are:
> 1. Does anyone have experience importing Stata files (alternatives to
> read.dta)?
> 2. To my knowledge R should not have problems handling dataframes of
> this size. Is there something I can do after importing that makes data
> handling faster?
>
> My hardware is up-to-date (Intel P4, 3 GHz, 1 GB RAM) and I work on a
> Windows XP platform with R version 2.1 (all packages updated).
>
> Thanks for your answers.
> Christian
>
> --
> Christian Bieli, project assistant
> Institute of Social and Preventive Medicine
> University of Basel, Switzerland
> Steinengraben 49
> CH-4051 Basel
> Tel.: +41 61 270 22 12
> Fax: +41 61 270 22 25
> christian.bieli@unibas.ch
> www.unibas.ch/ispmbs
>
> ______________________________________________
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide!
> http://www.R-project.org/posting-guide.html
>


Received on Wed Feb 15 02:45:51 2006

This archive was generated by hypermail 2.1.8 : Wed 15 Feb 2006 - 14:35:56 EST