[R] reading large data

From: HENRIKSON, JEFFREY <JEFHEN_at_safeco.com>
Date: Sat 03 Jul 2004 - 02:37:46 EST


I have trouble using read.table on flat files larger than about 300MB on Windows 2000. Is this a known issue? If not, any ideas on how to file a bug report?

I have three cuts of data in flat text files: a 1%, a 10% and a 100% sample. The 100% sample is about 350MB. Reading the 1% and 10% files is slow, but everything works. The RAM footprint after loading appears to be roughly 2x the size of the text file, and I have 1.5GB of RAM on my machine. The 10% file takes < 1.5 minutes to load, so I would expect the 100% file to load in < 15 minutes. Instead it grinds for about 15 minutes and then segfaults. I don't think there's anything very special about my data: just several columns by ~5M rows.
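For reference, a minimal sketch of the kind of read.table call in question, with the arguments that the read.table documentation suggests for reducing memory use and load time on large files (colClasses, nrows, comment.char). The file, column names and column types below are hypothetical stand-ins, not the real data, and nothing here is claimed to cure the crash.

```r
# Write a small tab-delimited sample so the sketch is self-contained.
# The columns (id, amount, state) are hypothetical, not the real data.
path <- tempfile(fileext = ".txt")
sample_data <- data.frame(
  id     = 1:1000,
  amount = round(runif(1000) * 100, 2),
  state  = sample(c("WA", "OR", "CA"), 1000, replace = TRUE)
)
write.table(sample_data, path, sep = "\t", row.names = FALSE, quote = FALSE)

# Declaring column types up front spares read.table from reading every
# column as character and converting afterwards.
dat <- read.table(path,
                  header       = TRUE,
                  sep          = "\t",
                  colClasses   = c("integer", "numeric", "character"),
                  nrows        = 1000,  # a mild over-estimate is also fine
                  comment.char = "")    # switch off comment scanning

str(dat)
```

Whether these settings are enough to get a 350MB file through on a 1.5GB machine is untested here; they only reduce the overhead of the read.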

The same thing happens when I read the 100% sample in from an RDBMS with RODBC. For the time being I have worked around it by pulling small cross sections of the 100% sample from the RDBMS as needed, and keeping a whole 10% sample in RAM. But in the future it would be nice if I could just use the RAM in my box.
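The same idea of working on one slice at a time can be sketched for the flat file, by reading it in chunks through an open connection so that only one chunk is in memory at once. This is an illustrative sketch only: the chunk size, column names and the running totals are hypothetical, and it assumes the per-chunk work can be reduced to a summary.

```r
# Self-contained stand-in for the large flat file (hypothetical columns).
path <- tempfile(fileext = ".txt")
write.table(data.frame(id = 1:2500, amount = (1:2500) / 10),
            path, sep = "\t", row.names = FALSE, quote = FALSE)

chunk_rows <- 1000                 # hypothetical chunk size
con <- file(path, open = "r")      # an open connection keeps its position

# Consume the header line once, then read the data rows chunk by chunk.
col_names <- strsplit(readLines(con, n = 1), "\t")[[1]]

total_rows   <- 0
total_amount <- 0
repeat {
  chunk <- tryCatch(
    read.table(con, header = FALSE, sep = "\t", nrows = chunk_rows,
               col.names = col_names,
               colClasses = c("integer", "numeric")),
    error = function(e) NULL)      # read.table errors once no lines remain
  if (is.null(chunk) || nrow(chunk) == 0) break

  # Per-chunk work goes here; this sketch only keeps running totals.
  total_rows   <- total_rows + nrow(chunk)
  total_amount <- total_amount + sum(chunk$amount)

  if (nrow(chunk) < chunk_rows) break   # a short chunk means end of file
}
close(con)
```

This trades convenience for memory: the full data frame never exists in R, so it only helps for analyses that can be computed incrementally.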

Jeff Henrikson

R-help@stat.math.ethz.ch mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Received on Sat Jul 03 02:41:15 2004
