Re: [R] naive question

From: Tony Plate <tplate_at_blackmesacapital.com>
Date: Thu 01 Jul 2004 - 01:34:25 EST

As far as I know, read.table() in S-plus performs similarly to read.table() in R with respect to speed. So I wouldn't hold out much hope of finding satisfaction there.

I do frequently read large tables in S-plus, and with a considerable amount of work I was able to speed things up significantly, mainly by using scan() with appropriate arguments. It's possible that some of the add-on modules for S-plus (e.g., the data-mining module) have faster I/O, but I haven't investigated those. I get the best read performance out of S-plus by using a homegrown binary file format, with each column stored in a contiguous block and the metadata (i.e., column types and dimensions) stored at the start of the file. The S-plus read function reads the columns one at a time using readRaw(). Something similar should be possible in R. If you have to read from a text file, then, as others have suggested, writing a C program wouldn't be that hard, as long as you make the format inflexible.
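To make the binary-format idea concrete, here is a minimal sketch of how something similar might look in R, using the base functions writeBin() and readBin(). The file layout, the type codes, and the function names write_columns() and read_columns() are assumptions made up for illustration; this is not the actual S-plus format described above. It handles only integer and double columns, does not store column names, and uses the machine's native byte order.

```r
# Hypothetical layout (an assumption for illustration only):
#   int32 ncol, int32 nrow,
#   one int32 type code per column (1 = integer, 2 = double),
#   then each column written as one contiguous block.

write_columns <- function(df, path) {
  con <- file(path, "wb")
  on.exit(close(con))
  # Type code for each column: 1 for integer, 2 for everything else (stored as double)
  types <- vapply(df, function(col) if (is.integer(col)) 1L else 2L, integer(1))
  # Metadata goes at the start of the file
  writeBin(as.integer(c(ncol(df), nrow(df))), con, size = 4L)
  writeBin(unname(types), con, size = 4L)
  # Each column follows as a single contiguous block
  for (col in df) {
    if (is.integer(col)) {
      writeBin(col, con, size = 4L)
    } else {
      writeBin(as.double(col), con, size = 8L)
    }
  }
  invisible(path)
}

read_columns <- function(path) {
  con <- file(path, "rb")
  on.exit(close(con))
  # Read the metadata first, so each column can be read with one call
  dims  <- readBin(con, what = "integer", n = 2L, size = 4L)
  types <- readBin(con, what = "integer", n = dims[1], size = 4L)
  cols <- lapply(types, function(type_code) {
    if (type_code == 1L) {
      readBin(con, what = "integer", n = dims[2], size = 4L)
    } else {
      readBin(con, what = "double", n = dims[2], size = 8L)
    }
  })
  # Column names are not stored in this sketch, so generate V1, V2, ...
  names(cols) <- paste0("V", seq_along(cols))
  as.data.frame(cols)
}

# Round trip on a small example
df   <- data.frame(id = 1:5, x = c(0.5, 1.5, 2.5, 3.5, 4.5))
path <- tempfile(fileext = ".bin")
write_columns(df, path)
back <- read_columns(path)
```

Because the types and dimensions are known before any data is read, each column comes in with a single readBin() call and no text parsing, which is where the speed would come from. A real version would also need to handle column names, character data, missing values, and byte order.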

At Tuesday 06:19 PM 6/29/2004, Igor Rivin wrote:

>I was not particularly annoyed, just disappointed, since R seems like
>a much better thing than SAS in general, and doing everything with a
>combination of hand-rolled tools is too much work. However, I do need
>to work with very large data sets, and if it takes 20 minutes to read
>them in, I have to explore other options (one of which might be S-PLUS,
>which claims scalability as a major, er, PLUS over R).
>
>______________________________________________
>R-help@stat.math.ethz.ch mailing list
>https://www.stat.math.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html



Received on Thu Jul 01 01:38:35 2004
