Re: [R] count.fields vs read.table

From: Prof Brian Ripley <ripley_at_stats.ox.ac.uk>
Date: Mon 05 Dec 2005 - 19:02:19 EST

On Mon, 5 Dec 2005, Peter Dalgaard wrote:

> "Andrew C. Ward" <acward@tpg.com.au> writes:
>
>> Dear R-help,
>>
>> I am using R 2.1.1 on Windows XP.
>>
>> I have a tab-delimited data file that has been exported by SAS. The file is reasonably big so I
>> apologise that I can't give a good toy example. I do this:
>> table(count.fields("t1.txt", sep="\t", quote="\""))
>> 248
>> 809
>> So I have 809 lines, each with 248 fields.
>>
>> There's something wrong with me, my data or both, since when I try to read the data, I get this:
>> dim(read.table("t1.txt", sep="\t", quote="\"", header=TRUE))
>> [1] 425 248
>>
>> I wonder if someone could be kind enough to point out what I've done wrong or suggest some tips
>> for managing this, please? Thanks for your advice!
>
>
> Something around line 425 that causes the rest of the file to be
> gobbled? Quotes and comment characters could be the culprit, although
> the inconsistency with count.fields looks suspicious. Otherwise, I'd
> look at the data read and try to pinpoint the line where things go
> weird (e.g. the last handful of entries of the first column).

The help page for count.fields says explicitly that it counts the fields
in each line of the file, whereas read.table allows embedded newlines in
quoted fields. These days the two do not do exactly the same thing.
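
A minimal sketch of that difference, assuming a small made-up file (the
temporary file and the column names id and note are hypothetical, not
taken from the original t1.txt). One quoted field contains a newline, so
the file has more physical lines than records. What count.fields reports
for the split record may depend on the R version, so no output is shown
for it.

```r
# Hypothetical toy file: a header plus two records, where the first
# record has a newline embedded inside a double-quoted field.
tmp <- tempfile(fileext = ".txt")
writeLines(c("id\tnote",
             "1\t\"first\nsecond\"",
             "2\tplain"), tmp)

# Physical lines in the file: the quoted newline splits record 1 in two.
n_lines <- length(readLines(tmp))

# count.fields works through the file line by line.
cf <- count.fields(tmp, sep = "\t", quote = "\"")

# read.table treats the quoted newline as part of the field, so it
# returns one row per record rather than one row per physical line.
dat <- read.table(tmp, sep = "\t", quote = "\"", header = TRUE)

print(n_lines)
print(cf)
print(dim(dat))

unlink(tmp)
```

Comparing the number of physical lines with nrow(dat) + 1 (for the
header) is one way to check whether records span several lines.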

-- 
Brian D. Ripley,                  ripley@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
