Re: [R] Failing on reading a "slightly big" dataset

From: Prof Brian Ripley <ripley_at_stats.ox.ac.uk>
Date: Mon 05 Jul 2004 - 20:25:00 EST


You are asking read.table to interpret both quote and comment characters in your file. You do seem to have quotes -- are they always matched?

Please read through the Data Import/Export manual and check out all the options.
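
As a minimal sketch of what that might look like (not necessarily the exact fix for this file): read.table by default treats both " and ' as quote characters and # as a comment character, so an unmatched apostrophe or a stray # in a company name can swallow or truncate lines. The sample below reuses a few rows from the quoted message; the two rows marked "hypothetical" are invented for illustration, as are the names `path`, `n` and `firms`.

```r
# Write a small pipe-delimited sample to a temporary file.
sample_lines <- c(
  "All figures are for the year 20030331|||",
  "Company|GVA Less Interest (Rs. thousand)|Interest (Rs. thousand)|GVA (Rs. thousand)",
  "'R' INVEST PVT. LTD.|-510.45|0.18|-510.27",
  "20 MICRONS LTD.|60700|41200|101900",
  "EXAMPLE'S TRADERS LTD.|1|2|3",   # hypothetical: unmatched apostrophe
  "# 1 EXAMPLE LTD.|4|5|6"          # hypothetical: leading comment character
)
path <- tempfile(fileext = ".text")
writeLines(sample_lines, path)

# Diagnostic: fields per line as R sees them, with quote and
# comment processing switched off.
n <- count.fields(path, sep = "|", quote = "", comment.char = "")
print(table(n))

# Read the data: skip the title line, take the next line as the header,
# and disable quote and comment interpretation.
firms <- read.table(path, sep = "|", header = TRUE, skip = 1,
                    quote = "", comment.char = "",
                    stringsAsFactors = FALSE, check.names = FALSE)
print(nrow(firms))
str(firms)
```

Running count.fields a second time with the default quote and comment.char settings and comparing the two results is one way to locate the offending lines.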

On Mon, 5 Jul 2004, Ajay Shah wrote:

> I have a file with 4 columns per line, all pipe delimited.
>
> $ wc -l cmie_firm_data.text
> 89325 cmie_firm_data.text
> $ ls -al cmie_firm_data.text
> -rw-r--r-- 1 ajayshah ajayshah 4415637 Jul 5 15:25 cmie_firm_data.text
> $ awk -F\| '(NF != 4)' cmie_firm_data.text
> $ head cmie_firm_data.text
> All figures are for the year 20030331|||
> Company|GVA Less Interest (Rs. thousand)|Interest (Rs. thousand)|GVA (Rs. thousand)
> 'R' INVEST PVT. LTD.|-510.45|0.18|-510.27
> 20 MICRONS LTD.|60700|41200|101900
> 20TH CENTURY FOX CORPN. (INDIA) PVT. LTD.|50|0.33|50.33
> 21ST CENTURY AUTOMOTIVE INDIA LTD.|201.14|0.19|201.33
> 21ST CENTURY ENTERTAINMENT PVT. LTD.|-6.10|0|-6.10
> 21ST CENTURY EQUIPMENTS PVT. LTD.|-1599.53|1262.76|-336.77
> 21ST CENTURY INFRASTRUCTURE (INDIA) PVT. LTD.|140.48|1.74|142.22
> 21ST CENTURY PEST CONTROL SERVICES LTD.|50.21|7.13|57.34
>
> When I try to read this into R, I get a mysterious error, and then it
> reads only 38,244 observations. Any idea what might be going wrong?

-- 
Brian D. Ripley,                  ripley@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

Received on Mon Jul 05 20:29:12 2004
