Re: [R] Importing Large Dataset from Excel

From: Patrick Connolly <>
Date: Sun, 16 Dec 2007 22:02:29 +1300

On Wed, 12-Dec-2007 at 11:35AM +0100, Peter Dalgaard wrote:

|> Philippe Grosjean wrote:
|> > The problem is often a misspecification of the comment.char argument.
|> > For read.table(), it defaults to '#'. This means that everywhere you
|> > have a '#' char in your Excel sheet, the rest of the line is ignored.
|> > This results in a different number of items per line.
|> >
|> > You should better use read.csv() which provides better default arguments
|> > for your particular problem.
|> > Best,
|> >
|> >
|> Or read.delim/read.delim2, which should be even better at TAB-separated
|> files.
|> In general, be very suspicious of read.table() with such files, not only
|> because of the '#' but also because it expects columns separated by
|> _arbitrary_ amounts of whitespace. I.e., n TABs counts as one, so empty
|> fields are skipped over.

I don't recall that happening with TABs, but a problem can arise when the last (rightmost) column has more than a few empty cells. Occasionally, I've had to resort to adding a dummy column on the right, but as Peter suggests, read.delim is usually less involved.
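
For anyone wanting to check their own file, here is a minimal sketch of the two effects described above. The file and its contents are made up for illustration, and I'm assuming a TAB-delimited export from Excel. count.fields() reports how many fields each line yields, first with the read.table() defaults (whitespace separator, comment.char = "#") and then with the TAB separator and no comment character, as read.delim() uses.

```r
# Hypothetical three-column, TAB-separated export; the file and its
# contents are invented for illustration.
tf <- tempfile(fileext = ".txt")
writeLines(c("id\tlabel\tvalue",
             "1\titem #4\t10",   # a '#' inside a cell
             "2\t\t20",          # an empty middle cell
             "3\tplain\t30"),
           tf)

# Fields per line as read.table() would split them by default
# (sep = "", comment.char = "#").
n_default <- count.fields(tf)

# Fields per line with a TAB separator and no comment character,
# matching the read.delim() defaults.
n_tab <- count.fields(tf, sep = "\t", quote = "", comment.char = "")

# read.delim() keeps both the '#' and the empty cell in place.
dat <- read.delim(tf, stringsAsFactors = FALSE)

print(n_default)
print(n_tab)
print(dat)
```

If the counts in n_default differ from line to line while those in n_tab agree, the '#' or the whitespace separator is the likely culprit rather than the data.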

   ___    Patrick Connolly   
 {~._.~}          		 Great minds discuss ideas    
 _( Y )_  	  	        Middle minds discuss events 
(:_~*~_:) 	       		 Small minds discuss people  
 (_)-(_)  	                           ..... Anon

Received on Sun 16 Dec 2007 - 09:10:00 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Sun 16 Dec 2007 - 10:30:19 GMT.
