Re: [R] Skipping specified rows in scan or read.table

From: Prof Brian Ripley <ripley_at_stats.ox.ac.uk>
Date: Wed, 09 Apr 2008 20:43:26 +0100 (BST)

On Wed, 9 Apr 2008, Ravi Varadhan wrote:

> Hi,
>
> I have a data file, certain lines of which are character fields. I would
> like to skip these rows, and read the data file as a numeric data frame. I
> know that I can skip lines at the beginning with read.table and scan, but is
> there a way to skip a specified sequence of lines (e.g., 1, 2, 10, 11, 19,
> 20, 28, 29, etc.) ?

Not within scan, but you can do it within the connection that scan reads.

If the file is small, just read it all with readLines(), select the
lines you want (mydata[-c(1,2,10,11...)]), and use the result as the
input to a textConnection(). If it is large, read it a line at a time,
discarding each line that is to be skipped and writing the others to an
anonymous file() connection. Then call read.table() on the anonymous
connection.
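
A minimal sketch of both approaches, under stated assumptions: the example file, its contents, the skip vector, and the helper name read_skipping() are all hypothetical and only there to make the code self-contained. Only readLines(), textConnection(), file(), writeLines() and read.table() come from base R.

```r
# Hypothetical example file: lines 1, 2 and 5 are text, the rest numeric.
path <- tempfile(fileext = ".txt")
writeLines(c("run A", "units: cm", "1 2.5", "2 3.5", "run B", "3 4.5"), path)
skip <- c(1, 2, 5)

# Small file: read everything, drop the unwanted lines, re-parse the rest.
all_lines <- readLines(path)
tc <- textConnection(all_lines[-skip])
small <- read.table(tc)
close(tc)  # the connection was already open, so we close it ourselves

# Large file: copy the wanted lines one at a time to an anonymous file()
# connection, then let read.table() read from that connection.
read_skipping <- function(path, skip, ...) {
  src <- file(path, open = "r")
  on.exit(close(src), add = TRUE)
  keep <- file()  # anonymous file connection, open for reading and writing
  on.exit(close(keep), add = TRUE)
  i <- 0L
  while (length(line <- readLines(src, n = 1L)) > 0L) {
    i <- i + 1L
    if (!(i %in% skip)) writeLines(line, keep)
  }
  read.table(keep, ...)
}
large <- read_skipping(path, skip)

str(small)  # columns should be numeric, not factors
```

Because the text lines never reach read.table(), no column should be turned into a factor in the first place.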

Or use perl/awk within a pipe() connection.
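
As a rough illustration of the pipe() route, assuming a Unix-like system with awk on the PATH (the file and the skipped line numbers below are made up for the example):

```r
# Hypothetical example file with unwanted text on lines 1, 2 and 5.
tmp <- tempfile(fileext = ".txt")
writeLines(c("header A", "header B", "1 2.5", "2 3.5", "note", "3 4.5"), tmp)

# awk prints only the lines whose line number (NR) is not in the skip list.
cmd <- sprintf("awk 'NR != 1 && NR != 2 && NR != 5' %s", shQuote(tmp))

# read.table() opens the unopened pipe() connection and closes it when done.
dat <- read.table(pipe(cmd))
str(dat)
```

The filtering then happens outside R, so R only ever sees the numeric lines.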

> If I read the entire data file, and then delete the character fields, the
> values are still kept as factors, with each value denoted by its level.
> Since I have continuous variables, there are as many levels as there are
> values. I am unable to coerce this to "numeric" mode. Is there a way to do
> this so that I can then manipulate the numeric data frame?

Why does FAQ Q7.10 not apply?
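
For reference, a small sketch of the usual factor-to-numeric conversion, on the assumption that this is the FAQ entry meant (the one on converting factors to numeric). The factor f and the data frame df are invented for the example.

```r
# Hypothetical factor holding numbers as labels.
f <- factor(c("3.2", "10.5", "3.2", "7"))

codes <- as.numeric(f)  # WRONG for this purpose: gives the internal level codes

# Two equivalent ways to recover the values the labels represent.
x1 <- as.numeric(as.character(f))
x2 <- as.numeric(levels(f))[f]  # converts each level once, then indexes

# Applying the same idea to every factor column of a data frame.
df <- data.frame(a = factor(c("1.5", "2.5")), b = factor(c("10", "20")))
df[] <- lapply(df, function(col) {
  if (is.factor(col)) as.numeric(as.character(col)) else col
})
str(df)
```

Any label that is not a valid number becomes NA with a warning, so the text rows still have to be dropped either before or after the conversion.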

>
> Thanks for any help.
>
> Best,
> Ravi.
>
> Ravi Varadhan, Ph.D.
> Assistant Professor, The Center on Aging and Health
> Division of Geriatric Medicine and Gerontology
> Johns Hopkins University
> Ph: (410) 502-2619
> Fax: (410) 614-9625
> Email: rvaradhan_at_jhmi.edu
> Webpage: http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html
>
> ______________________________________________
> R-help_at_r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

-- 
Brian D. Ripley,                  ripley_at_stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

Received on Wed 09 Apr 2008 - 19:46:10 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Wed 09 Apr 2008 - 20:30:40 GMT.