Re: [R] How to read only specified columns from a data file

From: Sarah Goslee <sarah.goslee_at_gmail.com>
Date: Wed, 16 Mar 2011 08:13:58 -0400

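read.table() looks only at the first five lines of input when determining how many columns there are. If a later line has more fields than that (the overflow shows up as row 7 in your output) and you do not specify the number of columns in the read.table() call directly, for example through col.names, the extra fields are wrapped onto the next row.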
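
A minimal sketch of that behaviour and one way around it, assuming a made-up ragged file whose rows get wider further down. The file contents and the names tmp, ncol_max, wrapped, full, first5 and first5_cc are hypothetical, not the original data:

```r
# Hypothetical ragged input: line i has i + 2 fields (made up for illustration).
lines <- vapply(1:8,
                function(i) paste(c(i, "x", seq_len(i)), collapse = " "),
                character(1))
tmp <- tempfile(fileext = ".txt")
writeLines(lines, tmp)

# Default: the column count comes from the first five lines only,
# so fields beyond that width are wrapped onto extra rows.
wrapped <- read.table(tmp, fill = TRUE)

# Count the fields on every line and pass the true width via col.names.
ncol_max <- max(count.fields(tmp))
full <- read.table(tmp, fill = TRUE,
                   col.names = paste0("V", seq_len(ncol_max)))

# Keep the first five columns by subsetting after reading ...
first5 <- full[, 1:5]

# ... or skip the rest while reading: the string "NULL" (in quotes)
# in colClasses drops a column, NA lets read.table() guess its type.
cc <- rep("NULL", ncol_max)
cc[1:5] <- NA
first5_cc <- read.table(tmp, fill = TRUE,
                        col.names = paste0("V", seq_len(ncol_max)),
                        colClasses = cc)
```

For a real file, replace tmp with the file name and repeat any skip= argument in the count.fields() call, so that both functions see the same lines.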

This was discussed on the list within the last couple of weeks.

Sarah

On Wed, Mar 16, 2011 at 7:54 AM, Luis Ridao <luridao_at_gmail.com> wrote:
> David,
>
> Thanks for your tip, but it seems I'm having problems with the number
> of columns R manages to read in. Below is an example of the data read in:
>
>> inp[1:20,]
>        V1          V2        V3       V4     V5     V6     V7     V8     V9
> 1   1.0000 log_fy_coff -1.007600 0.119520 1.0000     NA            NA     NA
> 2   2.0000 log_fy_coff -0.935010 0.112840 0.8896 1.0000            NA     NA
> 3   3.0000 log_fy_coff -0.876260 0.107500 0.8219 0.8847 1.0000     NA     NA
> 4   4.0000 log_fy_coff -0.683090 0.103030 0.7656 0.8143 0.8747 1.0000     NA
> 5   5.0000 log_fy_coff -0.623500 0.100980 0.7206 0.7636 0.8086 0.8764 1.0000
> 6   6.0000 log_fy_coff -0.583330 0.098978 0.6819 0.7214 0.7615 0.8150 0.8762
> 7   1.0000                    NA       NA     NA     NA            NA     NA
> 8   7.0000 log_fy_coff -0.676790 0.096608 0.6521 0.6892 0.7254 0.7719 0.8148
> 9   0.8717      1.0000        NA       NA     NA     NA            NA     NA
> 10  8.0000 log_fy_coff -0.696060 0.093761 0.6297 0.6654 0.6988 0.7405 0.7750
> 11  0.8116      0.8643  1.000000       NA     NA     NA            NA     NA
> 12  9.0000 log_fy_coff -0.527060 0.089949 0.6003 0.6347 0.6667 0.7060 0.7367
>
> As you can see, there are only 9 columns in inp, and the rest of each
> line is read into the following row (see row 7).
> I just don't understand why this is happening (using fill=T does not
> help either).
>
> Best,
> Luis
>
> On Tue, Mar 15, 2011 at 5:15 PM, David Winsemius <dwinsemius_at_comcast.net> wrote:
>>
>> On Mar 15, 2011, at 1:11 PM, <rex.dwyer_at_syngenta.com> wrote:
>>
>>> I think you need to read an introduction to R.
>>> For starters, read.table returns its results as a value, which you are not
>>> saving.
>>> The probable answer to your question:
>>> Read the whole file with read.table, and select columns you need, e.g.:
>>> tab <- read.table(myfile, skip=2)[,1:5]
>>>
>>> -----Original Message-----
>>> From: r-help-bounces_at_r-project.org [mailto:r-help-bounces_at_r-project.org]
>>> On Behalf Of Luis Ridao
>>> Sent: Tuesday, March 15, 2011 11:53 AM
>>> To: r-help_at_r-project.org
>>> Subject: [R] How to read only specified columns from a data file
>>>
>>> R-help,
>>>
>>> I'm trying to read a data file with plenty of columns.
>>> I just need the first 5 but it does not work by doing something like:
>>>
>>>> mycols <- rep(NULL, 430) ; mycols[c(1:4)] <- NA
>>>> read.table(myfile, skip=2, colClasses=mycols)
>>
>> I would have suggested:
>>
>> mycols <- rep("NULL", 430) ; mycols[1:5] <- rep("numeric", 5)
>> inp <- read.table(myfile, skip=2, colClasses=mycols)
>> head(inp)
>>
>> --
>> David.
>>
>>>
>>> Any suggestions?
>>>

-- 
Sarah Goslee
http://www.functionaldiversity.org

______________________________________________
R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Received on Wed 16 Mar 2011 - 12:17:39 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Wed 16 Mar 2011 - 13:30:22 GMT.
