Re: [R] Row limit for read.table

From: Martin Becker
Date: Wed 17 Jan 2007 - 16:40:10 GMT

Frank McCown wrote:
> I have been trying to read in a large data set using read.table, but
> I've only been able to grab the first 50,871 rows of the total 122,269 rows.
> > f <-
> read.table("",
> header=TRUE, nrows=123000, comment.char="", sep="\t")
> > length(f$change_rate)
> [1] 50871
> From searching the email archives, I believe this is due to size limits
> of a data frame. So...
It is not due to size limits, see below.
> 1) Why doesn't read.table give a proper warning when it doesn't place
> every read item into a data frame?
In your case, read.table behaves as documented. The ' character is one of the standard quoting characters. A few of the entries contain a single ' character, so everything up to the next ' (sometimes more than ten thousand lines) is treated as a single entry. Try using quote="" to disable quoting, as documented on the help page:

> f<-read.table("", header=TRUE, nrows=123000, comment.char="", sep="\t",quote="")
> length(f$change_rate)
[1] 122271
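
To see the effect without the original file, here is a small self-contained sketch. The file contents and the column name "url" are made up for illustration; only change_rate comes from the original post. Two fields contain an apostrophe, so with the default quote setting everything between them should collapse into one entry:

```r
# Write a small tab-separated file in which two fields contain an apostrophe.
# (Hypothetical data, not the original file.)
tmp <- tempfile(fileext = ".txt")
writeLines(c(
  "url\tchange_rate",
  "a\t1",
  "it's\t2",
  "c\t3",
  "d\t4",
  "don't\t5",
  "f\t6"
), tmp)

# Default quoting: ' and " both act as quote characters, so the text
# between the two apostrophes is read as part of a single field.
with_quotes <- read.table(tmp, header = TRUE, sep = "\t", comment.char = "")

# quote = "" disables quoting: apostrophes are ordinary characters
# and every line becomes its own row.
no_quotes <- read.table(tmp, header = TRUE, sep = "\t", comment.char = "",
                        quote = "")

nrow(with_quotes)  # fewer rows than there are data lines
nrow(no_quotes)    # one row per data line
```

Comparing the two row counts on a sample of the real file is a quick way to check whether stray quote characters are the cause of "missing" rows.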

> 2) Why isn't there a parameter to read.table that allows the user to
> specify which columns s/he is interested in? This functionality would
> allow extraneous columns to be ignored which would improve memory usage.
There is (colClasses, works as documented). Try


> f<-read.table("",
+ header=TRUE, nrows=123000, comment.char="", sep="\t",quote="",colClasses=c("character","NULL","NULL","NULL","NULL"))
> dim(f)
[1] 122271 1
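
A minimal sketch of the same idea on a toy file follows. The three-column layout and the names "url" and "size" are assumptions for illustration only. A colClasses entry of "NULL" (as a character string) tells read.table to skip that column:

```r
# Toy tab-separated file with three columns (hypothetical data).
tmp <- tempfile(fileext = ".txt")
writeLines(c(
  "url\tsize\tchange_rate",
  "a\t10\t0.5",
  "b\t20\t0.25",
  "c\t30\t0.75"
), tmp)

# One colClasses entry per column in the file:
# keep column 1 as character, skip column 2, keep column 3 as numeric.
f <- read.table(tmp, header = TRUE, sep = "\t", quote = "", comment.char = "",
                colClasses = c("character", "NULL", "numeric"))

dim(f)    # 3 rows, 2 columns
names(f)  # "url" "change_rate"
```

Skipped columns are never stored in the resulting data frame, and giving explicit classes for the kept columns also saves read.table from guessing their types.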

> I've already made a work-around by loading the table into mysql and
> doing a select on the 2 columns I need. I just wonder why the above 2
> points aren't implemented. Maybe they are and I'm totally missing it.
Did you read the help page?

> Thanks,
> Frank


   Martin

Received on Thu Jan 18 03:44:40 2007

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.1.8, at Wed 17 Jan 2007 - 17:30:24 GMT.
