Re: [R] read file part way through based on start and end date (first column)

From: Gabor Grothendieck <ggrothendieck_at_gmail.com>
Date: Mon, 21 Mar 2011 00:16:41 -0400

On Sun, Mar 20, 2011 at 3:47 PM, algotr8der <algotr8der_at_gmail.com> wrote:
> Hello folks - I have been trying to figure this out. I have a set of very
> large files that are of this format
>
> , , , ,
> 1/4/1999,9:31:00 AM,blah, blah, blah
> 1/4/1999,9:32:00 AM,blah, blah, blah
> 1/4/1999,9:33:00 AM,blah, blah, blah
>
> I want to write R code that reads only that data between a start and an end
> date (data is presented from oldest at the top of the file to the most
> recent at the bottom of the file). I'm not sure if there is an R function
> that makes this easy.
>
> I know the read.csv function enables you to skip a user specified number of
> rows before the file is read but this doesn't exactly help me as my start and
> end dates can be anywhere in between.
>

Try reading the entire file into R first to be really sure that you are not just assuming it can't be done.
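
For what it's worth, reading everything and subsetting might look roughly like the sketch below. The small temporary file and the column names (date, time, a, b, c) are made up for illustration, since the real header was not shown, and it assumes the first column is in month/day/year form.

```r
# Hypothetical sample standing in for one of the large files
path <- tempfile(fileext = ".csv")
writeLines(c("date,time,a,b,c",
             "1/4/1999,9:31:00 AM,x,y,z",
             "1/5/1999,9:31:00 AM,x,y,z",
             "1/6/1999,9:31:00 AM,x,y,z",
             "1/7/1999,9:31:00 AM,x,y,z"), path)

# Read the whole file, then convert the first column to Date
dat <- read.csv(path, stringsAsFactors = FALSE)
dat$date <- as.Date(dat$date, format = "%m/%d/%Y")

# Keep only the rows between the start and end dates (inclusive)
start <- as.Date("1999-01-05")
end   <- as.Date("1999-01-06")
sub <- dat[dat$date >= start & dat$date <= end, ]
```

If this runs in acceptable time and memory on the real files, nothing more elaborate is needed.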

If it's true that it's too big to read in and subset, then try reading just the first column of the file (read about the colClasses= argument in ?read.table), figure out from that column which rows you need, and re-read the file, this time using the skip= and nrows= arguments so that it only reads in the rows you need.
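
A minimal sketch of that two-pass approach follows. It assumes a five-column file with a single header line, dates in month/day/year form, and rows sorted by date so that the wanted rows form one contiguous block. The sample file and its column names are invented for illustration.

```r
# Hypothetical sample standing in for one of the large files
path <- tempfile(fileext = ".csv")
writeLines(c("date,time,a,b,c",
             "1/4/1999,9:31:00 AM,x,y,z",
             "1/5/1999,9:31:00 AM,x,y,z",
             "1/5/1999,9:32:00 AM,x,y,z",
             "1/6/1999,9:31:00 AM,x,y,z",
             "1/7/1999,9:31:00 AM,x,y,z"), path)

# Pass 1: read only the first column; "NULL" in colClasses drops a column
first <- read.csv(path, header = TRUE,
                  colClasses = c("character", rep("NULL", 4)))
dates <- as.Date(first[[1]], format = "%m/%d/%Y")

# Work out which data rows fall in the wanted range
start <- as.Date("1999-01-05")
end   <- as.Date("1999-01-06")
keep <- which(dates >= start & dates <= end)
stopifnot(length(keep) > 0)

# Pass 2: skip the header line plus the rows before the first match,
# then read just the contiguous block of matching rows
hdr <- strsplit(readLines(path, n = 1), ",")[[1]]
dat <- read.csv(path, header = FALSE, skip = keep[1],
                nrows = length(keep), col.names = hdr,
                stringsAsFactors = FALSE)
```

Here skip = keep[1] works because the header occupies one line, so skipping keep[1] lines leaves the reader positioned at the first matching data row. If the dates were not sorted, the matching rows would not be contiguous and skip= and nrows= alone would not be enough.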

-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

______________________________________________
R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Received on Mon 21 Mar 2011 - 04:25:32 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.