Re: [R] a simple problem

From: David Winsemius <dwinsemius_at_comcast.net>
Date: Fri, 04 Mar 2011 11:03:58 -0500

On Mar 4, 2011, at 9:50 AM, Asan Ramzan wrote:

> Hello R-help
>
> I am working with a large data table that has the occasional label
> marking a particular time point in an experiment. E.g.:
>
> "Time (min)", "R1 R1", "R2 R1", "R3 R1", "R4 R1"
> .909, 1.117, 1.225, 1.048, 1.258
> 3.942, 1.113, 1.230, 1.049, 1.262
> 3.976, 1.105, 1.226, 1.051, 1.259
> 4.009, 1.114, 1.231, 1.053, 1.259
> 4.042, 1.107, 1.230, 1.048, 1.262
> 4.076, 1.108, 1.226, 1.045, 1.257
> 4.109, 1.109, 1.227, 1.047, 1.259
> 4.142, 1.108, 1.225, 1.052, 1.260
> 4.176, 1.105, 1.222, 1.046, 1.260
> 4.209, 1.106, 1.226, 1.050, 1.258
> 4.242, 1.105, 1.224, 1.047, 1.258
> 4.276, 1.104, 1.223, 1.048, 1.259
> 4.309, 1.106, 1.228, 1.050, 1.260
> 4.342, 1.103, 1.219, 1.049, 1.260
> 4.376, 1.107, 1.225, 1.052, 1.259
> 4.409, 1.105, 1.222, 1.047, 1.258
> 4.442, 1.106, 1.227, 1.048, 1.262
> 4.476, 1.105, 1.222, 1.049, 1.261
> 4.509, 1.102, 1.222, 1.047, 1.259
> 4.555, "Gly sar"
> 4.555, 1.107, 1.224, 1.048, 1.261
> 4.576, 1.109, 1.228, 1.053, 1.259
> 4.609, 1.103, 1.218, 1.046, 1.258
> 4.642, 1.105, 1.223, 1.048, 1.256
> 4.676, 1.108, 1.217, 1.048, 1.260
> 4.709, 1.124, 1.222, 1.047, 1.258
> When I try to read in the table, I get:
>> try<-read.table("200810_01.R",header=T,sep=",")
> Error in scan(file, what, nmax, sep, dec, quote, skip, nlines,
> na.strings, :
> line 136 did not have 5 elements
>
> Is there any way to tell R to ignore these labels or, better
> still, interpret them as labels for particular time
> points, so that when I draw a line graph it is annotated
> with these labels?

Option 1:
Prepare your data properly with an editor.

Option 2:
You could read the file with readLines, identify the offending lines with grep or grepl, then separate the offenders from the non-offenders.

lines <- readLines(textConnection('"Time (min)", "R1 R1", "R2 R1", "R3 R1", "R4 R1"
.909, 1.117, 1.225, 1.048, 1.258
3.942, 1.113, 1.230, 1.049, 1.262
3.976, 1.105, 1.226, 1.051, 1.259
4.009, 1.114, 1.231, 1.053, 1.259
4.042, 1.107, 1.230, 1.048, 1.262
4.076, 1.108, 1.226, 1.045, 1.257
4.109, 1.109, 1.227, 1.047, 1.259
4.142, 1.108, 1.225, 1.052, 1.260
4.176, 1.105, 1.222, 1.046, 1.260
4.209, 1.106, 1.226, 1.050, 1.258
4.242, 1.105, 1.224, 1.047, 1.258
4.276, 1.104, 1.223, 1.048, 1.259
4.309, 1.106, 1.228, 1.050, 1.260
4.342, 1.103, 1.219, 1.049, 1.260
4.376, 1.107, 1.225, 1.052, 1.259
4.409, 1.105, 1.222, 1.047, 1.258
4.442, 1.106, 1.227, 1.048, 1.262
4.476, 1.105, 1.222, 1.049, 1.261
4.509, 1.102, 1.222, 1.047, 1.259
4.555, "Gly sar"
4.555, 1.107, 1.224, 1.048, 1.261
4.576, 1.109, 1.228, 1.053, 1.259
4.609, 1.103, 1.218, 1.046, 1.258
4.642, 1.105, 1.223, 1.048, 1.256
4.676, 1.108, 1.217, 1.048, 1.260
4.709, 1.124, 1.222, 1.047, 1.258'))

  read.table(textConnection(
         lines[ c(TRUE, !grepl("[[:alpha:]]", lines)[-1]) ]),
              sep=",", skip=1)

  # skip=1 drops the header: the quotes and spaces don't work well with
  # R column naming conventions

      V1    V2    V3    V4    V5
1  0.909 1.117 1.225 1.048 1.258
2  3.942 1.113 1.230 1.049 1.262
3  3.976 1.105 1.226 1.051 1.259

snipped

23 4.642 1.105 1.223 1.048 1.256
24 4.676 1.108 1.217 1.048 1.260
25 4.709 1.124 1.222 1.047 1.258

So even more compact would be:

read.table(textConnection(
         lines[ !grepl("[[:alpha:]]", lines) ] ), sep=",")

Using the non-negated grepl expression should get you all the "labels" lines (plus the header line, which also contains letters).
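As a rough sketch of that last point, the label lines could be pulled into their own small data frame of times and labels, which can then be used to annotate a plot. The shortened sample data, the variable names (has_alpha, is_label, labels, dat, hdr) and the plotting calls in the trailing comments are my own assumptions for illustration, not something from the original file. It also uses the text= argument of read.table and trimws(), both of which need a reasonably recent R.

```r
# Hypothetical shortened sample in the same layout as the posted file.
lines <- c('"Time (min)", "R1 R1", "R2 R1", "R3 R1", "R4 R1"',
           '4.476, 1.105, 1.222, 1.049, 1.261',
           '4.509, 1.102, 1.222, 1.047, 1.259',
           '4.555, "Gly sar"',
           '4.555, 1.107, 1.224, 1.048, 1.261',
           '4.576, 1.109, 1.228, 1.053, 1.259')

# Lines containing letters: the header (line 1) plus any label lines.
has_alpha <- grepl("[[:alpha:]]", lines)
is_label  <- has_alpha & seq_along(lines) > 1

# Split each label line at its first comma into a time and a label.
label_lines <- lines[is_label]
labels <- data.frame(
  time  = as.numeric(sub(",.*$", "", label_lines)),
  label = gsub('"', "", trimws(sub("^[^,]*,", "", label_lines))),
  stringsAsFactors = FALSE
)

# Read the purely numeric lines, then take column names from the header.
dat <- read.table(text = lines[!has_alpha], sep = ",")
hdr <- gsub('"', "", trimws(strsplit(lines[1], ",")[[1]]))
names(dat) <- make.names(hdr)   # e.g. "Time (min)" becomes "Time..min."

# One possible way to annotate a line graph with the labels (base graphics):
# matplot(dat[[1]], as.matrix(dat[-1]), type = "l", xlab = hdr[1], ylab = "")
# abline(v = labels$time, lty = 2)
# text(labels$time, max(as.matrix(dat[-1])), labels$label, pos = 4)
```

Splitting at the first comma only means a label that itself contains a comma would still come through in one piece.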

David Winsemius, MD
Heritage Laboratories
West Hartford, CT



R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Received on Fri 04 Mar 2011 - 16:08:40 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Fri 04 Mar 2011 - 16:10:19 GMT.