R-alpha: bug and fix for read.table

Gregory R. Warnes (warnes@biostat.washington.edu)
Fri, 6 Sep 1996 12:18:18 -0700 (PDT)


Date: Fri, 6 Sep 1996 12:18:18 -0700 (PDT)
From: "Gregory R. Warnes" <warnes@biostat.washington.edu>
To: r-testers <r-testers@stat.math.ethz.ch>
Subject: R-alpha: bug and fix for read.table


First, thanks to Robert Gentleman + Ross Ihaka for R.  The frequency of 
my bug reports is an indication of the usefulness of R, not of my 
disapproval!

OK, there is a bug in read.table when called as
read.table("file",header=T,skip=n): the skip parameter is not supplied to
the call to scan() that reads the column names, hence the column names 
are read from the first line of the file, rather than from line skip+1 of 
the file.  This is solved by addint the skip parameter to the appropriate 
call.  The corrected R source is appended at the end of this message.

Greg

-------------------------------------------------------------------------------
    Gregory R. Warnes          | It is high time that the ideal of success
warnes@biostat.washington.edu  |  be replaced by the ideal of service.
                               |                       Albert Einstein
-------------------------------------------------------------------------------

"read.table" <-
function (file, header = FALSE, sep = "", row.names, col.names, as.is = F
	, na.strings = "NA", skip = 0) 
{
	col.lens <- count.fields(file, sep, skip)
	nlines <- length(col.lens)
	rlabp <- 0
	if (nlines > 1 && (col.lens[2] - col.lens[1]) == 1) {
		if (missing(header)) 
			header <- TRUE
		rlabp <- 1
	}
	if (header) {
		col.names <- scan(file, sep, nlines = 1, quiet = TRUE, skip=skip)
		skip <- skip + 1
		col.lens <- col.lens[-1]
		nlines <- nlines - 1
	}
	else if (missing(col.names)) 
		col.names <- paste("Var", 1:col.lens[1])
	cols <- unique(col.lens)
	if (length(cols) != 1) 
		stop("all rows must have the same length", 
paste(col.lens, sep = ",")
		)
	what <- rep(list(""), cols)
	if (rlabp == 1) 
		col.names <- c("rownames", col.names)
	names(what) <- col.names
	data <- scan(file = file, what = what, sep = sep, n = nlines * 
cols, skip=skip
		, na.strings = na.strings, quiet = TRUE)
	if (missing(row.names)) {
		if (rlabp == 1) {
			row.names <- data[[1]]
			data <- data[-1]
		}
		else row.names <- as.character(1:nlines)
	}
	else if (is.null(row.names)) 
		row.names <- as.character(1:nlines)
	else if (is.character(row.names)) {
		if (length(row.names) == 1) {
			rowvar <- (1:cols)[match(col.names, row.names, 0) 
== 1]
			row.names <- data[[rowvar]]
			data <- data[-rowvar]
		}
	}
	else if (is.numeric(row.names) && length(row.names) == 1) {
		row.names <- data[[row.names]]
		data <- data[-row.names]
	}
	else stop("invalid row.names specification")
	cols <- length(data)
	if (length(as.is) == 1) 
		as.is <- rep(as.is, cols)
	if (length(as.is) != cols) 
		stop("as.is is the wrong length")
	for (i in 1:cols) {
		if (!as.is[i]) 
			data[[i]] <- numeric.or.factor(data[[i]], 
na.strings = na.strings)
	}
	row.names(data) <- row.names
	return(data)
}

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
r-testers mailing list -- To (un)subscribe, send
subscribe	or	unsubscribe
(in the "body", not the subject !)  To: r-testers-request@stat.math.ethz.ch
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-