Re: R-beta: read.table and large datasets

Ross Ihaka (Tue, 10 Mar 1998 08:08:13 +1300 (NZDT))

Date: Tue, 10 Mar 1998 08:08:13 +1300 (NZDT)
From: Ross Ihaka <>
Message-Id: <>
Subject: Re: R-beta: read.table and large datasets

RW> From: Rick White <>
RW> Subject: R-beta: read.table and large datasets
RW> I find that read.table cannot handle large datasets. Suppose data is a
RW> 40000 x 6 dataset
RW> R -v 100
RW> x_read.table("data")  gives
RW> Error: memory exhausted
RW> but
RW> works fine.
RW> read.table is less typing, I can include the variable names in the first
RW> line, and in Splus it executes faster. Is there a fix for read.table on
RW> the way?

[ I wouldn't be too sure that read.table executes faster.  I think
  it just calls scan ... ]

This is a known R problem.  The real problem is that read.table reads
everything as character strings, and the implementation of character
strings is "suboptimal".  This is a low-level problem, and such problems
are fairly hard to fix because any changes affect almost every bit of
the code.
As a temporary fix you might try enlarging the memory used for "cons cells"
with the -n flag.  Try something like

	R -n 400000 -v 10
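In the meantime, a scan-based workaround along the lines the original post hints at may help.  This is only a sketch: the file name "data" and the column count come from the post, and the use of matrix() to reshape the result is an assumption about what was intended.

```r
## Sketch of the scan() workaround (assumptions noted above).
## scan() reads the numbers directly, avoiding read.table's
## intermediate character strings for every field.
x <- matrix(scan("data"), ncol = 6, byrow = TRUE)

## If the first line of the file holds variable names, one could skip
## it and read the names separately, e.g.:
##   nms <- scan("data", what = "", nlines = 1)
##   x <- matrix(scan("data", skip = 1), ncol = 6, byrow = TRUE)
```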

Longer term, something will be done about it, but don't hold your breath.

r-help mailing list -- Read
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject!)  To: