[Rd] read.table() and NULL for colClasses

From: Henrik Bengtsson <hb_at_maths.lth.se>
Date: Thu 29 Jul 2004 - 05:11:29 EST


Hi,

Is there a reason for not supporting NULL or "NULL" values for argument colClasses in read.table(), much like you can use NULL values for argument 'what' in scan()? This would help quite a bit when reading large data files where only a few columns are of interest.
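For reference, a minimal sketch of the scan() behaviour I am referring to. The inline data and the variable names are made up for illustration; the point is only that a NULL entry in 'what' makes scan() skip that field:

```r
# Three whitespace-separated fields per line; we only want the first and third.
con <- textConnection(c("1 a 2.5", "2 b 3.5"))

# A NULL entry in 'what' tells scan() to skip the corresponding field.
cols <- scan(con, what = list(integer(0), NULL, numeric(0)), quiet = TRUE)
close(con)

ids    <- cols[[1]]   # integer field
values <- cols[[3]]   # numeric field; the middle field was never stored
```

The skipped field is not stored in the result, which is what makes this attractive for wide files.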

I've modified read.table() so that it calls scan(what=...) with NULLs for the fields to be skipped. Here's the diff of readtable.R (from the R-1.9.1.tgz; 9,591,217 bytes):

diff readtable.new.R readtable.R
117,123d116
<     # Skip NULL columns in scan()
<     void <- sapply(colClasses, FUN=identical, "NULL") |
<             sapply(colClasses, FUN=is.null)
<     # If all (data) columns are NULL, return empty data frame.
<     if (sum(!void) <= 1*rlabp)
<         return(data.frame())
<     what[void] <- list(NULL)
131c124
<     nlines <- length(data[[which(!void)[1]]])
---
>     nlines <- length(data[[1]])
161c154
<     for (i in (1:cols)[!known & !void]) {
---
>     for (i in 1:cols) {
171,178d163
<     # Skipped row names equals row.names=NULL.
<     if (rlabp) {
<         if (void[1]) {
<             row.names <- NULL
<             data <- data[-1]
<         }
<         void <- void[-1]
<     }
201,202d185
<     # Remove NULL columns
<     data[void] <- NULL
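The column-skipping steps in the diff above can be tried outside read.table() roughly as follows. This is only a sketch: the colClasses list, the all-character 'what' template and the inline data are assumptions for illustration, not the actual internals of read.table():

```r
# Hypothetical column specification: columns 2 and 3 are to be skipped,
# once as the string "NULL" and once as a real NULL in a list.
colClasses <- list("integer", "NULL", NULL, "numeric")

# Flag the columns to skip, as in the patch.
void <- sapply(colClasses, FUN = identical, "NULL") |
        sapply(colClasses, FUN = is.null)

# Read every kept field as character; skipped fields get NULL in 'what'.
what <- rep(list(""), length(colClasses))
what[void] <- list(NULL)

con <- textConnection(c("1 a x 2.5", "2 b y 3.5"))
data <- scan(con, what = what, quiet = TRUE)
close(con)

# Remove the NULL columns from the result.
data[void] <- NULL
```

Only the two kept columns remain in 'data' afterwards, still as character vectors awaiting conversion.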
and a diff for read.table.Rd:

diff read.table.new.Rd read.table.Rd
102,104c102
< \code{NA} when \code{\link{type.convert}} is used. Columns for
< which the value is \code{"NULL"} (or \code{NULL} in a list) are
< skipped. NB: \code{as} is
---
> \code{NA} when \code{\link{type.convert}} is used. NB: \code{as} is
181,183c179
< the five atomic vector classes. Skipping columns with \code{"NULL"}
< (or \code{NULL} will also require less memory.
<
---
> the five atomic vector classes.

Note that setting a colClasses entry to "NULL" already has an effect, which I assume is unintentional. During the data conversion, which happens *after* scan() has read the data anyway, "NULL" will NULL a column via as(x, "NULL"), but unfortunately the wrong column. If the above modifications are not adopted, maybe add a warning for the latter?

Best wishes

Henrik Bengtsson

Dept. of Mathematical Statistics @ Centre for Mathematical Sciences
Lund Institute of Technology/Lund University, Sweden (+2h UTC)
+46 46 2229611 (off), +46 708 909208 (cell), +46 46 2224623 (fax)
h b @ m a t h s . l t h . s e, http://www.maths.lth.se/~hb/

______________________________________________
R-devel@stat.math.ethz.ch mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-devel
Received on Thu Jul 29 05:26:14 2004
