Re: [Rd] Bug in read.table?

From: Charles C. Berry <cberry_at_tajo.ucsd.edu>
Date: Fri, 05 Nov 2010 17:17:57 -0700

On Fri, 5 Nov 2010, jgarcia_at_ija.csic.es wrote:

> Hi,
>
> I'm writing to this list as I'm puzzled about the behaviour of
> read.table(). It is hard to believe that there is a bug in this utils'
> function, but for my:
>
> R version 2.12.0 alpha (2010-09-28 r53056)
>
> I'm using scan and read.table to read a number of files, which look like this:
>

There are line wraps here, so we can't just cut-and-paste.

> ---
>
> Project: Murta Sonda
> Program: GrafNav Version 8.30.1007
> Profile: javier
> Source: GPS Epochs(Combined)
> ProcessInfo: Run (1) by Unknown on 11/04/2010 at 19:05:17
>
> Datum: WGS84, (processing datum)
> Master 1: Name LaMurta, Status ENABLED
> Antenna height 2.066 m, to L1-PC (NOV702GG, MeasDist 1.980 m
> to mark/ARP)
> Position 37 49 38.15069, -1 12 27.55445, 368.197 m (WGS84,
> Ellipsoidal hgt)
> Remote: Antenna height 1.781 m, to L1-PC (NOV702GG, MeasDist 1.695 m
> to mark/ARP)
> UTC Offset: 15 s
> Local time: +2.0 h, CEST [Central European Savings Time]
> Geoid: EGM2008-World.wpg (Absolute correction)
>
> Latitude Longitude LonTextLoTextLongitudTextL
> LatTextLaTextLatitudeTextL H-Ell H-MSL LocalUTCDa
> LocalUTC
> (Deg) (Deg) (DeMi (Sec) (DeMi (Sec) (m)
> (m) (DMY) (HMS)
> 37.8275120694 -1.2077972583 00112'28.07013"W 03749'39.04345"N
> 368.998 318.059 25/10/2010 16:59:00
> 37.8275121083 -1.2077974806 00112'28.07093"W 03749'39.04359"N
> 368.994 318.055 25/10/2010 16:59:15
> 37.8275118539 -1.2077974338 00112'28.07076"W 03749'39.04267"N
> 368.997 318.058 25/10/2010 16:59:30
> 37.8275119923 -1.2077974626 00112'28.07087"W 03749'39.04317"N
> 368.998 318.060 25/10/2010 16:59:45
> 37.8275323099 -1.2078075891 00112'28.10732"W 03749'39.11632"N
> 368.869 317.930 25/10/2010 17:00:00
> 37.8275323374 -1.2078077002 00112'28.10772"W 03749'39.11641"N
> 368.866 317.927 25/10/2010 17:00:15
> 37.8275325076 -1.2078075314 00112'28.10711"W 03749'39.11703"N
> 368.859 317.920 25/10/2010 17:00:30
> 37.8275325306 -1.2078075056 00112'28.10702"W 03749'39.11711"N
> 368.861 317.922 25/10/2010 17:00:45
> 37.8275323639 -1.2078075917 00112'28.10733"W 03749'39.11651"N
> 368.853 317.914 25/10/2010 17:01:00
> 37.8275326222 -1.2078076861 00112'28.10767"W 03749'39.11744"N
> 368.857 317.918 25/10/2010 17:01:15
> ---
>

Uh, what about those quotes??

Using quote = '' yields 'dat' sans duplicates.

I'll leave it to others to decide if this is a bug.
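
For anyone who wants to try this without the original file, here is a minimal sketch. It uses three records copied from the post and unwrapped by hand, and it passes them through the text= argument instead of a file; both choices are assumptions made only to keep the example self-contained. By default read.table uses quote = "\"'", so the ' and " in the degree/minute/second fields are treated as string delimiters; quote = "" turns that off.

```r
# Three data records from the post, one per line (the " is escaped for R).
recs <- c(
  "37.8275120694 -1.2077972583 00112'28.07013\"W 03749'39.04345\"N 368.998 318.059 25/10/2010 16:59:00",
  "37.8275121083 -1.2077974806 00112'28.07093\"W 03749'39.04359\"N 368.994 318.055 25/10/2010 16:59:15",
  "37.8275118539 -1.2077974338 00112'28.07076\"W 03749'39.04267\"N 368.997 318.058 25/10/2010 16:59:30"
)

# quote = "" disables quote processing, so ' and " are ordinary characters
# and every line splits into 8 whitespace-separated fields.
dat <- read.table(text = recs, header = FALSE, quote = "", as.is = TRUE)

print(dim(dat))           # rows and columns actually read
print(dat$V3)             # the longitude text field, quotes kept verbatim
print(any(duplicated(dat)))
```

The same quote = "" argument can be added to the read.table() call quoted below without changing anything else in it.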

> with a number of different records for each file.
>
> To read the data I'm using:
>
> ---
> dat.names <- scan(file.path("path_and_filename"),
> what="character",
> skip = 16, nlines=1)
> if(length(dat.names) != 8){
> stop("Input file seems to be wrong!")}
>
> dat <- read.table(file.path("path_and_filename"),
> header=FALSE, col.names=dat.names,
> skip = 18, as.is=TRUE, blank.lines.skip=FALSE)
> ---
> and systematically, I'm obtaining a number of repeated records at the
> start of the input table (6 in this example). It is easily seen by
> looking at the field "LocalUTC":

Or looking at duplicated(dat)
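
A small sketch of that check, on a made-up five-row data frame (the values are borrowed from the post, but the frame itself is only an illustration, not the poster's data). duplicated() flags a row when it repeats an earlier row, so it marks the later copies; fromLast = TRUE marks the earlier ones instead.

```r
# Illustrative frame: rows 4 and 5 repeat rows 1 and 2.
dat <- data.frame(
  LocalUTC = c("17:00:00", "17:00:15", "16:59:00", "17:00:00", "17:00:15"),
  H.Ell    = c(368.869, 368.866, 368.998, 368.869, 368.866),
  stringsAsFactors = FALSE
)

dup <- duplicated(dat)                        # FALSE FALSE FALSE TRUE TRUE
print(which(dup))                             # 4 5
print(which(duplicated(dat, fromLast = TRUE)))  # 1 2
print(nrow(unique(dat)))                      # 3 distinct records
```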

HTH, Chuck

>
>> dat
> Latitude Longitude LonTextLoTextLongitudTextL
> LatTextLaTextLatitudeTextL H.Ell H.MSL LocalUTCDa LocalUTC
> 1 37.82753 -1.207808 00112'28.10732"W
> 03749'39.11632"N 368.869 317.930 25/10/2010 17:00:00
> 2 37.82753 -1.207808 00112'28.10772"W
> 03749'39.11641"N 368.866 317.927 25/10/2010 17:00:15
> 3 37.82753 -1.207808 00112'28.10711"W
> 03749'39.11703"N 368.859 317.920 25/10/2010 17:00:30
> 4 37.82753 -1.207808 00112'28.10702"W
> 03749'39.11711"N 368.861 317.922 25/10/2010 17:00:45
> 5 37.82753 -1.207808 00112'28.10733"W
> 03749'39.11651"N 368.853 317.914 25/10/2010 17:01:00
> 6 37.82753 -1.207808 00112'28.10767"W
> 03749'39.11744"N 368.857 317.918 25/10/2010 17:01:15
> 7 37.82751 -1.207797 00112'28.07013"W
> 03749'39.04345"N 368.998 318.059 25/10/2010 16:59:00
> 8 37.82751 -1.207797 00112'28.07093"W
> 03749'39.04359"N 368.994 318.055 25/10/2010 16:59:15
> 9 37.82751 -1.207797 00112'28.07076"W
> 03749'39.04267"N 368.997 318.058 25/10/2010 16:59:30
> 10 37.82751 -1.207797 00112'28.07087"W
> 03749'39.04317"N 368.998 318.060 25/10/2010 16:59:45
> 11 37.82753 -1.207808 00112'28.10732"W
> 03749'39.11632"N 368.869 317.930 25/10/2010 17:00:00
> 12 37.82753 -1.207808 00112'28.10772"W
> 03749'39.11641"N 368.866 317.927 25/10/2010 17:00:15
> 13 37.82753 -1.207808 00112'28.10711"W
> 03749'39.11703"N 368.859 317.920 25/10/2010 17:00:30
> 14 37.82753 -1.207808 00112'28.10702"W
> 03749'39.11711"N 368.861 317.922 25/10/2010 17:00:45
> 15 37.82753 -1.207808 00112'28.10733"W
> 03749'39.11651"N 368.853 317.914 25/10/2010 17:01:00
> 16 37.82753 -1.207808 00112'28.10767"W
> 03749'39.11744"N 368.857 317.918 25/10/2010 17:01:15
>
> Thanks,
>
> Javier
> ---
>
> ______________________________________________
> R-devel_at_r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

Charles C. Berry                            Dept of Family/Preventive Medicine
cberry_at_tajo.ucsd.edu                     UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901



Received on Sat 06 Nov 2010 - 00:22:19 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Sun 07 Nov 2010 - 22:00:18 GMT.