Re: [Rd] (PR#9796) write.dcf/read.dcf cycle converts missing entry

From: <ripley_at_stats.ox.ac.uk>
Date: Wed, 18 Jul 2007 09:43:25 +0200 (CEST)


Bill,

Thanks.

I am seeing some problems here, for example when all the fields are missing, or all the fields in a row are missing. I've fixes for those, and will commit to R-devel shortly.
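
The second case can be seen with a minimal sketch that replays only the record-separator arithmetic from the proposed patch on a made-up matrix whose middle row is entirely NA. The matrix `x` and the helper name `sep_flags` are hypothetical, chosen for illustration; this is not the fix being committed to R-devel.

```r
## Hypothetical 3-record matrix; record 2 has every field missing.
x <- matrix(c("testA", "0.1-1",
              NA,      NA,
              "testC", "1.3.1"),
            ncol = 2, byrow = TRUE,
            dimnames = list(NULL, c("Package", "Version")))

## Same index arithmetic as the proposed patch: flag a surviving field
## as "end of record" when the next surviving field belongs to the
## record numbered exactly one higher.
sep_flags <- function(x) {
    tx <- t(x)
    not.na <- c(!is.na(tx))
    rec <- c(col(tx))[not.na]   # record number of each surviving field
    c(diff(rec), 0) == 1
}

flags <- sep_flags(x)
print(flags)

## The record number jumps from 1 to 3, so the difference is 2 and no
## flag is set: records 1 and 3 would be written with no blank line
## between them.
any(flags)
```

Because the test is `== 1`, a gap of more than one record number is never flagged, which is why a row with all fields missing needs separate handling.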

On Tue, 17 Jul 2007, bill_at_insightful.com wrote:

> Full_Name: Bill Dunlap
> Version: 2.5.0
> OS: Red Hat Enterprise Linux WS release 3 (Taroon Update 6)
> Submission from: (NULL) (24.17.60.30)
>
>
> If you read a dcf file with read.dcf(file,fields=c("Field",...))
> and the file does not contain the desired field "Field",
> read.dcf puts a character NA for that entry in its output
> matrix. If you then call write.dcf, passing it the output
> of read.dcf(), it will write the entry "Field: NA". A subsequent
> read.dcf() on write.dcf's output file will then have a "NA",
> not a character NA, in the entry for "Field". I think that
> write.dcf() should not write lines in the output file where
> the input matrix contains a character NA.
>
> Here is a test function to demonstrate the problem. It returns
> TRUE when a write.dcf/read.dcf cycle does not change the data.
>
> test.write.dcf <- function () {
>     origFile <- tempfile()
>     copyFile <- tempfile()
>     on.exit(unlink(c(origFile, copyFile)))
>     writeLines(c("Package: testA", "Version: 0.1-1", "Depends:", "",
>                  "Package: testB", "Version: 2.1" , "Suggests: testA", "",
>                  "Package: testC", "Version: 1.3.1", ""),
>                origFile)
>     orig <- read.dcf(origFile,
>                      fields=c("Package","Version","Depends","Suggests"))
>     write.dcf(orig, copyFile, width = 72)
>     copy <- read.dcf(copyFile,
>                      fields=c("Package","Version","Depends","Suggests"))
>     value <- all.equal(orig, copy)
>     if (!identical(value, TRUE)) {
>         attr(value, "orig") <- orig
>         attr(value, "copy") <- copy
>     }
>     value
> }
> Currently we get
> > test.write.dcf()
> [1] "'is.NA' value mismatch: 0 in current 4 in target"
> attr(,"orig")
>      Package Version Depends Suggests
> [1,] "testA" "0.1-1" ""      NA
> [2,] "testB" "2.1"   NA      "testA"
> [3,] "testC" "1.3.1" NA      NA
> attr(,"copy")
>      Package Version Depends Suggests
> [1,] "testA" "0.1-1" ""      "NA"
> [2,] "testB" "2.1"   "NA"    "testA"
> [3,] "testC" "1.3.1" "NA"    "NA"
> With the attached write.dcf() it returns TRUE.
>
> The diff would be
> 19,22c19,24
> < eor <- character(nr * nc)
> < eor[seq.int(1, nr - 1) * nc] <- "\n"
> < writeLines(paste(formatDL(rep.int(colnames(x), nr), c(t(x)),
> < style = "list", width = width, indent = indent), eor,
> ---
>> tx <- t(x)
>> not.na <- c(!is.na(tx))
>> eor <- character(sum(not.na))
>> eor[ c(diff(c(col(tx))[not.na]),0)==1 ] <- "\n"
>> writeLines(paste(formatDL(rep.int(colnames(x), nr), c(tx),
>> style = "list", width = width, indent = indent)[not.na], eor,
>
> and the entire function would be
>
> `write.dcf` <-
> function (x, file = "", append = FALSE, indent = 0.1 * getOption("width"),
>           width = 0.9 * getOption("width"))
> {
>     if (!is.data.frame(x))
>         x <- data.frame(x)
>     x <- as.matrix(x)
>     mode(x) <- "character"
>     if (file == "")
>         file <- stdout()
>     else if (is.character(file)) {
>         file <- file(file, ifelse(append, "a", "w"))
>         on.exit(close(file))
>     }
>     if (!inherits(file, "connection"))
>         stop("'file' must be a character string or connection")
>     nr <- nrow(x)
>     nc <- ncol(x)
>     tx <- t(x)
>     not.na <- c(!is.na(tx))
>     eor <- character(sum(not.na))
>     eor[ c(diff(c(col(tx))[not.na]),0)==1 ] <- "\n"
>     writeLines(paste(formatDL(rep.int(colnames(x), nr), c(tx),
>                               style = "list", width = width, indent = indent)[not.na], eor,
>                      sep = ""), file)
> }
>
> ______________________________________________
> R-devel_at_r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>
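
For reference, the way the proposed function decides where a record ends can be traced by hand on the `orig` matrix from the report. The sketch below rebuilds that matrix directly (rather than through read.dcf) and repeats the `not.na` and `eor` steps from the quoted code; the variable `kept` is introduced here only for display.

```r
## The matrix read.dcf() returned in the report, rebuilt by hand.
orig <- matrix(c("testA", "0.1-1", "", NA,
                 "testB", "2.1",   NA, "testA",
                 "testC", "1.3.1", NA, NA),
               ncol = 4, byrow = TRUE,
               dimnames = list(NULL,
                               c("Package", "Version", "Depends", "Suggests")))

tx <- t(orig)                       # one column per record
not.na <- c(!is.na(tx))             # fields that will be written
eor <- character(sum(not.na))       # one end-of-record slot per written field
## A written field ends its record when the next written field belongs
## to the following record.
eor[c(diff(c(col(tx))[not.na]), 0) == 1] <- "\n"

## Field names that survive the NA filter, in output order.
kept <- rep.int(colnames(orig), nrow(orig))[not.na]
print(data.frame(field = kept, ends_record = eor == "\n"))
```

Eight of the twelve fields survive, and the separator lands after "Depends" in the first record and after "Suggests" in the second, so each record is followed by a blank line except the last.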

-- 
Brian D. Ripley,                  ripley_at_stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

Received on Wed 18 Jul 2007 - 20:56:14 GMT
