Re: [R] How to handle large dataframes?

From: Søren Højsgaard
Date: Wed 15 Feb 2006 - 02:24:45 EST

I think it is well worth the effort to start using a database system such as MySQL for such purposes.

If you look at then you'll find a short - and rudimentary - description of how to use MySQL in connection with R and SAS (on Windows).

The time needed to get it up and running (about 30 minutes) is well spent. I suppose you can take your Stata data and save it as a comma-separated file. Such a file is easy to load into a MySQL database (although I haven't written up how). Perhaps Stata can connect directly to MySQL?
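As a rough illustration of the comma-separated-file route, here is a minimal sketch in R using the DBI package. It is not taken from the description mentioned above. To keep it runnable without a database server, it assumes the RSQLite package and an in-memory SQLite database as a stand-in for MySQL; the table name "apc", the toy columns, and the temporary file are hypothetical.

```r
# Sketch only: SQLite (via RSQLite) stands in for MySQL so that the example
# runs without a database server. All table and column names are hypothetical.
library(DBI)

# Stand-in for the comma-separated file exported from Stata.
csv_file <- tempfile(fileext = ".csv")
toy <- data.frame(id  = 1:6,
                  age = c(34, 51, 28, 45, 62, 39))
write.csv(toy, csv_file, row.names = FALSE)

# Open a connection. For MySQL the driver would instead be RMySQL::MySQL(),
# together with the server's dbname, user and password arguments.
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Load the file into a database table once.
dbWriteTable(con, "apc", read.csv(csv_file, stringsAsFactors = FALSE))

# Later, fetch only the rows and columns an analysis actually needs.
subset_df <- dbGetQuery(con, "SELECT id, age FROM apc WHERE age > 40")

dbDisconnect(con)
```

The point of the design is that the full table stays in the database and R only ever holds the subset returned by the query.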

Best regards

From: on behalf of Christian Bieli
Sent: Tue 14-02-2006 15:24
To: R help list
Subject: [R] How to handle large dataframes?

Dear all

I imported a Stata .dta file with the read.dta function from the foreign package. The data frame's dimensions are

> dim(d.apc)
[1] 15806 1300

Importing takes up to 15 minutes, and calculations on these data are rather slow (even though I subset the data before starting analyses).

My questions are:
1. Does anyone have experience importing Stata files (alternatives to read.dta)?
2. To my knowledge R should not have problems handling data frames of this size. Is there something I can do after importing that makes data handling faster?

My hardware is up-to-date (Intel P4, 3 GHz, 1 GB RAM) and I work on a Windows XP platform with R version 2.1 (all packages updated).

Thanks for your answers.

Christian Bieli, project assistant
Institute of Social and Preventive Medicine
University of Basel, Switzerland
Steinengraben 49
CH-4051 Basel
Tel.: +41 61 270 22 12
Fax:  +41 61 270 22 25

______________________________________________ mailing list
PLEASE do read the posting guide!

Received on Wed Feb 15 02:28:43 2006

This archive was generated by hypermail 2.1.8 : Fri 03 Mar 2006 - 03:42:29 EST