Re: [R] how do I read only specific columns using read.csv or other read function

From: Charles C. Berry <cberry_at_tajo.ucsd.edu>
Date: Sun, 13 Jul 2008 13:00:09 -0700

On Sun, 13 Jul 2008, Juliet Hannah wrote:

> I was not able to follow the solution posted. Could you demonstrate
> this technique on an example data set? Thanks!
>
> dat <- data.frame(a = letters[1:3], b = LETTERS[1:3], c = 1:3, d = 3:1)

Using your example:

> dat <- data.frame(a = letters[1:3], b = LETTERS[1:3], c = 1:3, d = 3:1)
> write.csv(dat,file="yourFrame.csv")
> col.pos <- match(c("b","d"), scan("yourFrame.csv",sep=',',what=character(0),nlines=1))
Read 5 items
> con <- pipe( paste( "cut -d, -f",paste(col.pos,collapse=','), " yourFrame.csv",sep=''))
> cols.b.d <- read.csv( con )
> cols.b.d
  b d
1 A 3
2 B 2
3 C 1
>
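
The steps above could be wrapped into one function. This is only a sketch: the name read_csv_columns is made up for illustration, and it assumes a unix-like system with 'cut' on the PATH and a file whose fields contain no embedded commas (cut splits on every comma, quoted or not). Here the file is written with row.names = FALSE, so there is no leading row-name column as in the session above.

```r
# Hypothetical helper (not from the thread): read only the named columns
# of a comma-separated file by letting 'cut' do the column selection.
read_csv_columns <- function(file, wanted) {
  # Read just the header line and split it on commas.
  header <- scan(file, what = character(0), sep = ",", nlines = 1, quiet = TRUE)
  col.pos <- match(wanted, header)
  if (anyNA(col.pos)) {
    stop("column(s) not found: ", paste(wanted[is.na(col.pos)], collapse = ", "))
  }
  # cut writes fields in file order, whatever order they are listed in.
  cmd <- paste0("cut -d, -f", paste(sort(col.pos), collapse = ","),
                " ", shQuote(file))
  read.csv(pipe(cmd))
}

dat <- data.frame(a = letters[1:3], b = LETTERS[1:3], c = 1:3, d = 3:1)
csv.file <- tempfile(fileext = ".csv")
write.csv(dat, file = csv.file, row.names = FALSE)
cols.b.d <- read_csv_columns(csv.file, c("b", "d"))
```

Stopping on an unmatched name avoids passing NA to cut, which would otherwise fail with a less helpful message.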

HTH, Chuck

>
> On Wed, Jul 2, 2008 at 1:13 PM, Charles C. Berry <cberry@tajo.ucsd.edu> wrote:
>> On Wed, 2 Jul 2008, Ben Tupper wrote:
>>
>>>
>>> On Jul 2, 2008, at 6:53 AM, Philip James Smith wrote:
>>>
>>>> Hi R people:
>>>>
>>>> I have huge files with as many as 5000 columns. I'd really like to read
>>>> only certain columns of those files. I know column names I want to read.
>>>>
>>>> I looked at the documentation of read.csv. Although there is a col.names
>>>> option, it lets users specify the names of the columns rather than
>>>> pick the columns of interest.
>>>>
>>>> Any suggestions on how to read only the columns I want, rather
>>>> than the entire file, would be greatly appreciated.
>>
>>
>> There is a unix utility called 'cut' that enables stuff like
>>
>> columns.1.3.5.to.7 <- read.csv( pipe( "cut -d, -f1,3,5-7 your.file" ) )
>>
>> and using
>>
>> col.pos <- match(names.of.variables.you.want,
>>                  scan("your.file", what=character(0), sep=",", nlines=1))
>>
>> will enable you to set up the call to pipe.
>>
>> HTH,
>>
>> Chuck
>>
>>>>
>>>
>>> Hello,
>>>
>>> I think you want to explicitly set the colClasses argument such that the
>>> columns you *don't* want are set to "NULL" and all others are set to
>>> appropriate classes.
>>>
>>> Cheers,
>>> Ben
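
A minimal sketch of the colClasses approach described above, reusing the example data frame from this thread. The variable names (wanted, header, classes) are illustrative assumptions. The string "NULL" tells read.csv to skip a column, and NA lets it guess the class as usual.

```r
dat <- data.frame(a = letters[1:3], b = LETTERS[1:3], c = 1:3, d = 3:1)
csv.file <- tempfile(fileext = ".csv")
write.csv(dat, file = csv.file, row.names = FALSE)

wanted <- c("b", "d")
# Read the header plus one data row, only to learn the column names.
header <- names(read.csv(csv.file, nrows = 1, check.names = FALSE))
# Skip every column by default, then let read.csv guess the wanted ones.
classes <- rep("NULL", length(header))
classes[header %in% wanted] <- NA
cols.b.d <- read.csv(csv.file, colClasses = classes)
```

This needs no external utility, so it should also work where 'cut' is unavailable, although read.csv still has to scan every line of the file.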
>>>
>>>> Phil Smith
>>>> Duluth, GA
>>>>
>>>> ______________________________________________
>>>> R-help_at_r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide
>>>> http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>> Ben Tupper
>>> PemaquidRiver_at_tidewater.net
>>
>> Charles C. Berry                            (858) 534-2098
>>                                             Dept of Family/Preventive Medicine
>> E mailto:cberry_at_tajo.ucsd.edu            UC San Diego
>> http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901
>>
>

Charles C. Berry                            (858) 534-2098
                                            Dept of Family/Preventive Medicine
E mailto:cberry_at_tajo.ucsd.edu            UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901

Received on Sun 13 Jul 2008 - 20:11:50 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Sun 13 Jul 2008 - 21:32:10 GMT.
