Re: [R] Incremental ReadLines

From: Freds <>
Date: Wed, 13 Apr 2011 10:57:58 -0700 (PDT)

Hi there,

I am having a similar problem reading in a large text file with around 550,000 observations, each with 10 to 100 lines of description. I am trying to parse it in R, but I am having trouble with the size of the file: the loop seems to slow down dramatically at some point. I would be grateful for any suggestions. Here is my code, which works fine when I run it on a subsample of my dataset.

#Defining datasource

file <- "filename.txt"

#Creating placeholder for data and assigning column names
data <- data.frame(Id=NA)

#Starting the case counter at 0

case <- 0

#Opening a connection to data

input <- file(file, "rt")

#Going through cases

repeat {
  #Reading one line at a time; stopping at end of file
  line <- readLines(input, n=1)
  if (length(line)==0) break
  #A line containing "Id:" starts a new case
  if (length(grep("Id:",line)) != 0) {
    case <- case + 1 ; data[case,] <-NA
    split_line <- strsplit(line,"Id:")
    data[case,1] <- as.numeric(split_line[[1]][2])
  }
}

#Closing connection

close(input)

#Saving dataframe
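
A likely cause of the slowdown is that the loop above reads one line per call and grows the data frame one row at a time, which can force R to copy the whole data frame on each new case. The following is only a minimal sketch of one alternative, not a tested solution for the real file: it reads the file in chunks, finds the "Id:" lines with a vectorized grep, and builds the data frame once at the end. The function name read_ids and the parameter chunk_size are hypothetical names introduced for illustration, and the sketch assumes each "Id:" line contains a single numeric Id after the marker.

```r
# Hypothetical helper: read_ids() and chunk_size are illustrative names,
# not part of the original code.
read_ids <- function(path, chunk_size = 10000L) {
  con <- file(path, "rt")
  on.exit(close(con))            # close the connection even if an error occurs
  pieces <- list()               # one numeric vector of Ids per chunk

  repeat {
    # Read up to chunk_size lines at once instead of one line per call
    lines <- readLines(con, n = chunk_size)
    if (length(lines) == 0) break

    # Keep only the lines that contain the literal marker "Id:"
    id_lines <- grep("Id:", lines, fixed = TRUE, value = TRUE)

    # Drop everything up to and including the first "Id:" and convert
    # the remainder to numeric (assumes one numeric Id per line)
    ids <- as.numeric(sub("^.*?Id:", "", id_lines, perl = TRUE))

    # Store the chunk result in a list; no data frame is grown here
    pieces[[length(pieces) + 1L]] <- ids
  }

  # Build the data frame once, after all chunks have been read
  data.frame(Id = as.numeric(unlist(pieces)))
}
```

With the file name used above, this would be called as data <- read_ids("filename.txt"). Further fields from the description lines would need their own extraction step inside the loop, following the same pattern.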


Kind regards,


Received on Thu 14 Apr 2011 - 06:12:24 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.