Re: [R] how to skip certain rows when reading data

From: Henrik Bengtsson <hb_at_stat.berkeley.edu>
Date: Sat 29 Jul 2006 - 11:14:20 EST

Have a look at readTable() in the R.utils package. It can do quite a few things, such as reading subsets of rows and specifying colClasses by column name. It was implemented to keep memory usage as small as possible. Note the warning on the help page: "WARNING: This method is very much in an alpha stage. Expect it to change." It should work, though.

Examples:

# Read every fourth row
df <- readTable(pathname, rows=seq(from=1, to=1000, by=4));

# Read only columns 'chromosome' and 'position'.
df <- readTable(pathname, colClasses=c("chromosome"="character", "position"="double"), defColClass="NULL", header=TRUE, sep="\t");

# Read 'log2' data chromosome by chromosome
chromosome <- readTableIndex(pathname, indexColumn=3, header=TRUE, sep="\t")
for (cc in unique(chromosome)) {
  rows <- which(chromosome == cc);
  df <- readTable(pathname, rows=rows, colClasses=c("log2"="double"), defColClass="NULL", header=TRUE, sep="\t");
  ...
}
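
For comparison, here is a minimal base-R sketch of the same row selection that does not need R.utils. It is only an illustration: the inline data, the column names and the object names (lines, all_rows, kept, wanted, subset_rows) are made up for this example, and it does not show how readTable() works internally. The first approach reads everything and drops rows afterwards; the second picks the wanted lines before parsing, so only those rows are turned into a data frame.

```r
# Small tab-separated example; a real call would read from a file instead.
lines <- c("chromosome\tposition\tlog2",
           "1\t100\t0.5",
           "1\t200\t-0.2",
           "2\t150\t1.1",
           "2\t300\t0.0",
           "3\t120\t-0.7")

# Approach 1: read the whole table, then drop data rows 2 and 4
# with negative indexing.
all_rows <- read.table(text = lines, header = TRUE, sep = "\t")
kept <- all_rows[-c(2, 4), ]

# Approach 2: select the wanted lines first, so only those rows are parsed.
# Data row i sits on line i + 1 because line 1 holds the header.
wanted <- c(1, 3, 5)
subset_rows <- read.table(text = lines[c(1, wanted + 1)],
                          header = TRUE, sep = "\t")
```

Both approaches end up with the same three data rows. With a file on disk, the second one could use readLines(pathname) to get the lines, although that still pulls the raw text into memory.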

Cheers

Henrik

On 7/27/06, Prof Brian Ripley <ripley@stats.ox.ac.uk> wrote:
> On Thu, 27 Jul 2006, jz7@duke.edu wrote:
>
> > Dear all,
> >
> > I am reading the data using "read.table". However, there are a few rows I
> > want to skip. How can I do that in an easy way? Suppose I know the row
> > number that I want to skip. Thanks so much!
>
> The easy way is to read the whole data frame and use indexing (see `An
> Introduction to R') to remove the rows you do not want to retain.
> E.g. to remove rows 17 and 137
>
> mydf <- read.table(...)[-c(17, 137), ]
>
> --
> Brian D. Ripley, ripley@stats.ox.ac.uk
> Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
> University of Oxford, Tel: +44 1865 272861 (self)
> 1 South Parks Road, +44 1865 272866 (PA)
> Oxford OX1 3TG, UK Fax: +44 1865 272595
>
> ______________________________________________
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
>



Received on Sat Jul 29 11:22:01 2006

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.1.8, at Sat 29 Jul 2006 - 14:16:20 EST.
