[Rd] Embedded nuls in strings

From: Herve Pages <hpages_at_fhcrc.org>
Date: Tue, 07 Aug 2007 14:06:56 -0700



     'rawToChar' converts raw bytes either to a single character string
     or a character vector of single bytes.  (Note that a single
     character string could contain embedded nuls.)

Allowing embedded nuls in a string might be an interesting experiment, but it seems to cause trouble for most of the string manipulation functions.

A string with an embedded 0:

  raw0 <- as.raw(c(65:68, 0 , 70))
  string0 <- rawToChar(raw0)

> string0
[1] "ABCD\0F"

nchar() should return 6:

> nchar(string0)
[1] 4

In addition, this embedded nul seems to break almost all string manipulation/searching functions:

  grep("F", string0)
  strsplit(string0, split=NULL, fixed=TRUE)[[1]]
  tolower(string0)
  chartr("F", "x", string0)
  substr(string0, 6, 6)

Not very surprisingly, they all seem to treat string0 as if it were "ABCD"!
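
For comparison, here is a minimal sketch of doing the same kind of lookups on the raw vector itself, where the nul is just another byte. This is only an illustration, not something the functions above do themselves: it assumes a single embedded nul, and the variable names (n_bytes, nul_pos, f_pos, head_part, tail_part) are made up for the example.

```r
# Same bytes as in the example above: "ABCD", a nul, then "F".
raw0 <- as.raw(c(65:68, 0, 70))

# The raw vector reports its full length, nul included.
n_bytes <- length(raw0)

# Locate the embedded nul and the byte for "F" by comparing raw values.
nul_pos <- which(raw0 == as.raw(0))
f_pos   <- which(raw0 == charToRaw("F"))

# A single non-nul byte can be converted to a string without trouble.
last_char <- rawToChar(raw0[6])

# Cut the vector at the nul (assumed to occur exactly once, not at either end)
# and convert each nul-free piece separately.
head_part <- rawToChar(raw0[seq_len(nul_pos - 1)])
tail_part <- rawToChar(raw0[(nul_pos + 1):n_bytes])
```

Working at the byte level like this keeps everything after the nul reachable, at the cost of giving up the usual string functions until the pieces have been converted.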


R-devel_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Received on Wed 08 Aug 2007 - 04:45:28 GMT

