Re: [R] gsub issue in R 2.11.1, but not present in 2.9.2

From: Nordlund, Dan (DSHS/RDA) <NordlDJ_at_dshs.wa.gov>
Date: Tue, 29 Jun 2010 11:55:46 -0700

> -----Original Message-----
> From: r-help-bounces_at_r-project.org [mailto:r-help-bounces_at_r-project.org] On Behalf Of Bert Gunter
> Sent: Tuesday, June 29, 2010 11:08 AM
> To: 'Jason Rupert'; 'Duncan Murdoch'
> Cc: r-help_at_r-project.org
> Subject: Re: [R] gsub issue in R 2.11.1, but not present in 2.9.2
>
> Jason:
>
> I think it's actually even a bit worse than what Duncan said, which
> was:
>
> -----------
> "You need to double the backslashes to enter them in an R string. So
>
> gsub("N\\A", "NA", original, fixed=TRUE)
>
> should work if original contains a single backslash, and
>
> gsub("N\\\\A", "NA", original, fixed=TRUE)
>
> should work if it contains a double one. Two things add to the
> confusion here: First, a single backslash will be displayed doubled
> by print(). .. "
> ------
>
> Well, let's see: (On R version 2.11.1, 2010-5-31 for Windows)
>
> > astring <- "n\a"
> > print(astring)
> [1] "n\a"
>
> So Duncan's last sentence appears to be incorrect. The "\" is not
> displayed doubled. However ...

But Duncan's statement is correct. In your example above, there is no backslash character in the variable astring. It contains the letter 'n' and the control character '\a', which is a single character (print() shows the backslash only to indicate the control character). If there were actually a backslash character in the string, print() would have displayed it doubled.
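One way to check this at the console is to count the characters and look at their code points. This is only a sketch: the name `bs` is mine, and I am assuming utf8ToInt() from base R is available in your version.

```r
# "n\a" holds two characters: 'n' and the single control character BEL (\a)
astring <- "n\a"
nchar(astring)        # 2
utf8ToInt(astring)    # 110 7  (7 is BEL; a backslash would be 92)

# A string that really contains a backslash has to be typed with "\\"
bs <- "n\\a"
nchar(bs)             # 3
utf8ToInt(bs)         # 110 92 97
print(bs)             # print() re-escapes the backslash, so it shows doubled
cat(bs, "\n", sep = "")   # cat() writes the raw characters, so it shows once
```

The point is that print() shows an escaped form of the string, while nchar(), utf8ToInt() and cat() work on the actual contents.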

>
> > bstring <- "N\A"
> Error: '\A' is an unrecognized escape in character string starting "N\A"
>
> What's going on? Well, the "\a" in astring is a _single_ escape sequence
> (for a beep/bell sound, on Windows anyway: cat("\a") should make a sound).
> So the "\" in "\a" is correctly printed undoubled. However, since the "\A"
> in bstring does _not_ correspond to any escape sequence, the expression
> "\A" cannot be parsed and an error is thrown. But:
>
> > bstring <- "N\\A"
> > print(bstring)
> [1] "N\\A" ## is fine
>
> ## ... Noting that
>
> > nchar("\\A")
> [1] 2
>
> So whether a "\" needs to be doubled or not depends on whether the
> parser can interpret it as part of a legitimate escape sequence, whence
>
> gsub("\a","","\a") ## works but
> gsub("\A","","\A") ## does not.

Whether "\" needs to be doubled depends on what you want the string value to be. If you want the single control character '\a', then you don't want to double it. If you want the string to contain the two characters '\' and 'a', then you must enter '\\a'.
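Applied to the gsub() call from earlier in the thread, the distinction looks roughly like this. It is a sketch only: the values of `original` and `ctrl` are made up for illustration, since I don't have the poster's actual data.

```r
# Hypothetical input: typed with two backslashes, so it holds N, \, A
original <- "N\\A"
nchar(original)                                   # 3

# With fixed = TRUE the pattern is a literal string, not a regular expression.
# "N\\A" in source code is the three characters N, \, A, so it matches.
cleaned <- gsub("N\\A", "NA", original, fixed = TRUE)
cleaned                                           # "NA"

# A string holding the control character instead has no backslash to match
ctrl <- "N\a"
unchanged <- gsub("\\", "", ctrl, fixed = TRUE)   # nothing is removed
stripped  <- gsub("\a", "", ctrl, fixed = TRUE)   # removes the BEL character
stripped                                          # "N"
```

So the pattern you type has to describe the characters actually stored in the string, which is not necessarily what print() displays.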

>
> To avoid such confusion, I think Duncan's advice to double backslashes
> should be heeded as much as possible. Unfortunately, I don't think it's
> always possible:

In this case, if you actually want a newline character, then you don't want to use a double backslash.
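To see what doubling would do to a newline, compare the two spellings side by side. Again this is just a sketch, and the variable names are mine.

```r
with_newline <- "first line\nsecond line"    # \n is one newline character
with_literal <- "first line\\nsecond line"   # a backslash followed by 'n'

# The doubled version is exactly one character longer
nchar(with_literal) - nchar(with_newline)    # 1

cat(with_newline, "\n", sep = "")   # printed on two lines
cat(with_literal, "\n", sep = "")   # printed on one line, showing \n literally

# Splitting on a real newline only finds one in the first string
n_parts_newline <- length(strsplit(with_newline, "\n", fixed = TRUE)[[1]])  # 2
n_parts_literal <- length(strsplit(with_literal, "\n", fixed = TRUE)[[1]])  # 1
```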

>
> > newlineString <- "first line\nsecond line\n"
> > print(newlineString)
> [1] "first line\nsecond line\n"
> > cat(newlineString)
> first line
> second line
>
> Cheers,

Hope this is helpful,

Dan

Daniel J. Nordlund
Washington State Department of Social and Health Services
Planning, Performance, and Accountability
Research and Data Analysis Division
Olympia, WA 98504-5204



R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Received on Tue 29 Jun 2010 - 18:58:05 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Tue 29 Jun 2010 - 19:00:43 GMT.
