Re: [R] gsub issue in R 2.11.1, but not present in 2.9.2

From: Bert Gunter <gunter.berton_at_gene.com>
Date: Tue, 29 Jun 2010 11:07:32 -0700

Jason:

I think it's actually even a bit worse than what Duncan said, which was:



"You need to double the backslashes to enter them in an R string. So

gsub("N\\A", "NA", original, fixed=TRUE)

should work if original contains a single backslash, and

gsub("N\\\\A", "NA", original, fixed=TRUE)

should work if it contains a double one. Two things add to the confusion here: First, a single backslash will be displayed doubled by print(). ..."


Well, let's see (on R version 2.11.1, 2010-05-31, for Windows):

> astring <- "n\a"
> print(astring)

[1] "n\a"

So Duncan's last sentence appears to be incorrect. The "\" is not displayed doubled. However ...

> bstring <- "N\A"

Error: '\A' is an unrecognized escape in character string starting "N\A"

What's going on? Well, the "\a" in astring is a _single_ escape sequence (the alert/bell character; on Windows, at least, cat("\a") should make a sound). The string therefore contains no backslash character at all, and print() correctly shows the escape undoubled. The "\A" in bstring, however, does _not_ correspond to any escape sequence, so the string cannot be parsed and an error is thrown. But:

> bstring <- "N\\A"
> print(bstring)

[1] "N\\A" ## is fine

## ... Noting that

> nchar("\\A")

[1] 2

So whether a "\" needs to be doubled or not depends on whether the parser can interpret it as part of a legitimate escape sequence, whence

gsub("\a","","\a") ## works but
gsub("\A","","\A") ## does not.
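The same point can be checked at the console. This is only a sketch: the variable names (single, double, fixed_single, fixed_double) are made up for illustration, and fixed = TRUE is used, as in Duncan's advice, so that the pattern is matched literally rather than as a regular expression.

```r
# "\a" is ONE character (the bell, ASCII 7); the string holds no backslash.
nchar("n\a")       # 2
utf8ToInt("\a")    # 7

# In source code, "\\" stands for ONE backslash in the string itself.
single <- "N\\A"      # three characters: N, backslash, A
double <- "N\\\\A"    # four characters: N, backslash, backslash, A
nchar(single)         # 3
nchar(double)         # 4

# writeLines() shows the actual contents; print() shows the escaped form.
writeLines(single)    # N\A
print(single)         # [1] "N\\A"

# With fixed = TRUE the pattern is taken literally, so doubling each
# backslash in the source is all that is needed.
fixed_single <- gsub("N\\A", "NA", single, fixed = TRUE)
fixed_double <- gsub("N\\\\A", "NA", double, fixed = TRUE)
```

Both fixed_single and fixed_double should come out as "NA".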

To avoid such confusion, I think Duncan's advice to double backslashes should be heeded as much as possible. Unfortunately, I don't think it's always possible, because sometimes the escape sequence itself is what you want, e.g. "\n" for a newline:

> newlineString <- "first line\nsecond line\n"
> print(newlineString)

[1] "first line\nsecond line\n"
> cat(newlineString)

first line
second line
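To make the contrast concrete, here is a small sketch comparing the escape sequence with its doubled form. The names escaped, literal and converted are invented for this example.

```r
escaped <- "first line\nsecond line"    # \n is a single newline character
literal <- "first line\\nsecond line"   # a backslash followed by the letter n

nchar(escaped)    # 22
nchar(literal)    # 23

writeLines(escaped)   # prints two lines
writeLines(literal)   # prints: first line\nsecond line

# Turning the two-character sequence backslash + n into a real newline;
# fixed = TRUE keeps the pattern literal.
converted <- gsub("\\n", "\n", literal, fixed = TRUE)
identical(converted, escaped)   # TRUE
```

So doubling the backslash here changes what the string contains, which is why the advice to double cannot be applied blindly.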

Cheers,
Bert

Bert Gunter
Genentech Nonclinical Statistics

> -----Original Message-----
> From: r-help-bounces_at_r-project.org [mailto:r-help-bounces_at_r-project.org]
> On Behalf Of Uwe Ligges
> Sent: Tuesday, June 29, 2010 4:11 AM
> To: Jason Rupert
> Cc: r-help_at_r-project.org
> Subject: Re: [R] gsub issue in R 2.11.1, but not present in 2.9.2
>
>
>
> On 29.06.2010 12:47, Jason Rupert wrote:
> > Previously in R 2.9.2 I used the following to convert from an improperly formatted NA string into one that is a bit more consistent.
> >
> >
> > gsub("N\A", "NA", "N\A", fixed=TRUE)
> >
> > This worked in R 2.9.2, but now in R 2.11.1 it doesn't seem to work and throws the following error.
> > Error: '\A' is an unrecognized escape in character string starting "N\A"
> >
> > I guess my questions are the following:
> > (1) Is this expected behavior?
> > (2) If it is expected behavior, what is the proper way to replace "N\A" with "NA" and "N\\A" with "NA"?
>
>
> If your original text "thestring" contains "N\A", then the R
> representation is "N\\A", and hence
>
> gsub("N\\A", "NA", thestring)
>
> If you want to try explicitly, you need to write
>
> gsub("N\\A", "NA", "N\\A")
>
> If your original text contains two backslashes, both have to be escaped as in
>
> gsub("N\\\\A", "NA", thestring)
>
> Uwe Ligges
>
>
> > Thank you again for all the help and insight.
> >
> > ______________________________________________
> > R-help_at_r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>



Received on Tue 29 Jun 2010 - 18:09:27 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Tue 29 Jun 2010 - 19:10:43 GMT.
