[Rd] Behaviour of read.table with empty columns

From: John Fox <jfox_at_mcmaster.ca>
Date: Wed, 09 May 2007 11:10:29 -0400


Dear r-devel list members,

I stumbled across the following behaviour of read.table() recently: Suppose that I have the data

a " " ""
"" "" ""

in a file or copied to the clipboard, and issue the command

> DF <- read.table("clipboard")
> DF

  V1 V2 V3
1  a NA NA
2    NA NA

> is.na(DF)

        V1   V2   V3
[1,] FALSE TRUE TRUE
[2,] FALSE TRUE TRUE

I was surprised by the NAs. Note that they occur only when a column consists entirely of empty strings or strings composed of blanks.
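
For anyone who wants to reproduce this without a file or the clipboard, here is a minimal sketch that feeds the same two lines to read.table() through textConnection(). The names txt and con are only illustrative, and the details of the printed output may differ between R versions.

```r
# The two data lines from the example above, one string per line.
txt <- c('a " " ""', '"" "" ""')

# Read them through a text connection instead of "clipboard".
con <- textConnection(txt)
DF <- read.table(con)
close(con)

str(DF)     # shows the class read.table() chose for each column
is.na(DF)   # V2 and V3 come back as NA; V1 does not
```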

On the other hand

> data.frame(A=c("", "", ""))
  A
1
2
3

works as I would have expected.

A work-around for me was

> DF[is.na(DF)] <- ""
> DF

  V1 V2 V3
1  a
2
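
Another possible work-around, sketched here on the assumption that the NAs come from read.table()'s automatic type conversion of all-blank columns (something this message does not establish), is to request character columns up front with colClasses. The names txt, con and DF2 are only illustrative.

```r
# Same two data lines as in the example above.
txt <- c('a " " ""', '"" "" ""')

con <- textConnection(txt)
# colClasses = "character" is recycled across all columns, so no column
# is passed through automatic type conversion.
DF2 <- read.table(con, colClasses = "character")
close(con)

any(is.na(DF2))   # checks whether any NAs remain
```

This avoids the after-the-fact replacement, at the cost of reading every column as character, so numeric columns would then need converting by hand.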

But, as I said, I found the behaviour of read.table() puzzling.

All this is with R 2.5.0 on a Windows XP Pro SP 2 system.

Comments?

Thanks,
 John



John Fox, Professor
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox

R-devel_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Received on Wed 09 May 2007 - 15:18:37 GMT
