Re: [R] Ignoring initial rows in a text file import

From: David Winsemius <dwinsemius_at_comcast.net>
Date: Mon, 31 May 2010 20:13:13 -0400

On May 31, 2010, at 7:51 PM, Kevin Burnham wrote:

> I am trying to import a series of text files generated by stimulus
> presentation software. The problem that I am having is that the number of
> rows I need to skip is not fixed (depending on subject's pretest behavior)
> nor is the first row of the data I want always the same (the stimuli were
> presented in random order). So I need to bring in the .txt file (using
> readLines?), look for the row containing the text "Begin Main" (see exact
> row below) and start reading data to a table from that point.
>
> [13] "Main Group\t1000\tBegin Main\tBegin Main\tBegin Main\t\t \tPressed\t(any response)\tC\t25860\t\t\t\t\t"
>
> I would also like it to ignore the row:
> [173] "Main Group\t1000\tBreak\tBreak\tpause3\t\t \tPressed\t(any response)\tC\t47610\t\t\t\t\t"
>
> which will always be the same number of rows after the "Begin Main" row.

txt <- "blah
blahe
blah
blah
Main Group\t1000\tBegin Main\tBegin Main\tBegin Main\t\t
\tPressed\t(any response)\tC\t25860\t\t\t\t\t

more blah after blank line
uy
ytre
jhgf
Main Group\t1000\tBreak\tBreak\tpause3\t\t \tPressed\t(any response)\tC\t47610\t\t\t\t\t   uytr
hgfd"

# ___end setup input______________
 > bring.in <- readLines(textConnection(txt))
 > grep("\\tBegin Main", bring.in)
[1] 5
 > grep("Main Group\\t1000", bring.in)
[1] 5 12
 > length.vec <- grep("Main Group\\t1000", bring.in)
 > length.vec[2] - length.vec[1]
[1] 7
 >

# So a vectorized solution would be:
bring.in[grep("\\tBegin Main", bring.in):(
          grep("\\tBegin Main", bring.in)+length.vec[2] - length.vec[1]-1)]

[1] "Main Group\t1000\tBegin Main\tBegin Main\tBegin Main\t\t"
[2] "\tPressed\t(any response)\tC\t25860\t\t\t\t\t"
[3] ""
[4] "more blah after blank line"
[5] "uy"
[6] "ytre"
[7] "jhgf"
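
To get from the character vector to a table, something along these lines might work. It is only a sketch, not tested against the real files: read_main_block, start_pat and break_pat are names made up here, it assumes the data rows are tab-separated with no header row, and it drops the "Break" row by matching its text rather than by counting a fixed number of rows after "Begin Main". The marker row itself is kept as the first row, as in the subsetting above.

```r
# Hypothetical helper (not from the original post): keep everything from the
# "Begin Main" row onward, drop the "Break" row and blank lines, and parse
# what is left as tab-separated data.
read_main_block <- function(lines,
                            start_pat = "\tBegin Main",
                            break_pat = "\tBreak\t") {
  start <- grep(start_pat, lines, fixed = TRUE)
  if (length(start) == 0) stop("no row matching start_pat")
  block <- lines[start[1]:length(lines)]                  # from the marker row on
  block <- block[!grepl(break_pat, block, fixed = TRUE)]  # drop the Break row
  block <- block[nzchar(trimws(block))]                   # drop blank lines
  read.delim(text = paste(block, collapse = "\n"), header = FALSE,
             quote = "", stringsAsFactors = FALSE)
}

# Made-up input, only to show the shape of the result
demo <- c("pretest junk", "more junk",
          "Main Group\t1000\tBegin Main\tC\t25860",
          "Main Group\t1000\tstimA\tC\t300",
          "Main Group\t1000\tBreak\tC\t47610",
          "Main Group\t1000\tstimB\tC\t410")
res <- read_main_block(demo)

# For a series of files, one option (file names assumed to end in .txt):
# all.dat <- lapply(list.files(pattern = "\\.txt$"),
#                   function(f) read_main_block(readLines(f)))
```

If the marker row should not be part of the table, dropping the first row of the result (res[-1, ]) would do it.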
-- 


David Winsemius, MD
West Hartford, CT

______________________________________________
R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Received on Tue 01 Jun 2010 - 00:15:52 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
