Re: R-alpha: read.table -- programming 'contest'

Martin Maechler
Wed, 5 Mar 97 15:25:45 +0100

Date: Wed, 5 Mar 97 15:25:45 +0100
Message-Id: <9703051425.AA01252@>
From: Martin Maechler <>
In-Reply-To: <> (message from Martyn Plummer)
Subject: Re: R-alpha: read.table -- programming 'contest' 

>>>>> "Martyn" == Martyn Plummer <> writes:

    Martyn> Looking at the code for read.table, I see that it reads the
    Martyn> whole dataset in as character data (using the scan() function),
    Martyn> before coercing it to numeric or factor data with the function
    Martyn> type.convert. Could this be the reason for the excessive memory
    Martyn> usage? Try using the scan() function instead.
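To make Martyn's point concrete: scan() accepts a 'what' template, so each
column can be read in its final mode directly, with no all-character
intermediate and no type.convert() pass afterwards.  This is only a minimal
sketch; the temp file and column names are invented for illustration, not
taken from the read.table sources.

```r
## Hypothetical two-column file: a numeric column and a grouping column.
tf <- tempfile()
writeLines(c("1.5 a", "2.5 b", "3.5 c"), tf)

## 'what' is a template list: one slot per column, each with the target mode,
## so scan() stores numbers as numbers from the start.
d <- scan(tf, what = list(x = numeric(0), g = character(0)), quiet = TRUE)
d <- as.data.frame(d)    # x is already numeric; only g needs converting
d$g <- factor(d$g)

unlink(tf)
```

The memory saving comes from never holding the numeric columns as character
vectors; the cost is that the caller (or a wrapper) must know, or guess, the
column types up front.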

This makes for our first "R programmers' contest": (;-)

Who writes the "best"   
	read.table  "drop-in replacement" in R (no C code)?

'best':= "Sum" of the following criteria:
	1) Must have the same functionality as 0.16.1, or better.
	2) CPU usage    when reading medium / large datasets.
	3) Memory usage when reading medium / large datasets.
	4) Elegance of code.

[and  "Who is the jury?"  ;-)]

Actually, I think it does NOT make sense to go for pure R code
(with no C code or  system(..) calls).
As long as we are using Unix (or Windows NT/95  with  GNU Unix tools ??),
the most efficient solution (w/o C code) will probably use
	system( ... sed / awk / perl ...).
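As a sketch of that system(... awk ...) idea: let a Unix tool extract or
pre-convert a column, and have R read only the result.  This assumes a
working awk on the PATH; the file and the choice of field are invented for
illustration.

```r
## Hypothetical whitespace-separated file; we only want the second field.
tf <- tempfile()
writeLines(c("1 10", "2 20", "3 30"), tf)

## awk prints field 2; R then converts the captured lines to numeric,
## so the full table is never held in R as character data.
out <- system(paste("awk '{print $2}'", tf), intern = TRUE)
x <- as.numeric(out)

unlink(tf)
```

This is exactly the portability trade-off noted below: fast and frugal
wherever sed/awk/perl exist, but useless on a platform without them.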

But then, read.table(.) will be much harder to port to Windows / Mac....

r-testers mailing list -- For info or help, send "info" or "help",
To [un]subscribe, send "[un]subscribe"
(in the "body", not the subject !)  To: