Re: [R] regular expression in gsub() for strings with leading backslash

From: Mike Miller <mbmiller+l_at_gmail.com>
Date: Fri, 29 Apr 2011 20:46:06 -0500 (CDT)

On Fri, 29 Apr 2011, Duncan Murdoch wrote:

> On 29/04/2011 7:41 PM, Miao wrote:
>
>> Can anyone help on gsub() in R? I have a string like something below, and
>> wanted to delete all the strings with leading backslash, including "\xa0On",
>> "\023", "\xab", and many others. How should I write a regular expression
>> pattern in gsub()? I don't care how many characters following backslash.
>
>
> If those are R strings, none of them contain a backslash. In R, a backslash
> would always be printed as \\.
>
> \x is the introduction to a hexadecimal encoding for a character; the next
> two characters show the hex digits. So your first string contains a single
> character \xa0, the third one contains \xab, and so on.
>
> The \023 is an octal encoding for a single character.
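
Duncan's point can be checked by looking at the bytes. This is only a sketch: it assumes the OP's strings are equivalent to the R literals typed below, which the original message does not confirm. It counts bytes rather than characters because "\xa0" on its own is not valid UTF-8, and character-based functions may complain about it in a UTF-8 locale.

```r
# The three example strings from the question, typed as R literals (an assumption).
x <- c("\xa0On", "\023", "\xab")

# Each escape contributes a single byte, not a backslash followed by digits.
n_bytes <- nchar(x, type = "bytes")
print(n_bytes)   # 3 1 1

# Raw bytes of each string; a literal backslash would show up as 5c.
bytes <- lapply(x, charToRaw)
print(bytes)

# TRUE only for strings that really contain a backslash byte (0x5c).
has_backslash <- vapply(bytes, function(b) any(b == as.raw(0x5c)), logical(1))
print(has_backslash)   # FALSE FALSE FALSE
```

None of the three strings contains byte 5c, so a pattern that looks for a backslash has nothing to match.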

If we really were dealing with strings that begin with a backslash, I guess this would blank them out:

gsub("^\\\\.*", "", txt)

R displays the backslash doubled when it prints the string, but the string itself contains only a single backslash. So if the string were saved using write.table, say, only a single backslash would be stored.

> a <- "\\This is a string."
> a
[1] "\\This is a string."
> gsub("^\\\\", "", a)
[1] "This is a string."
> a
[1] "\\This is a string."
> gsub("^\\\\.*", "", a)
[1] ""
> gsub("^\\\\.*", "", c(a,"Another string","\\more"))
[1] ""               "Another string" ""
> write.table(a, file="a.txt", quote=FALSE, row.names=FALSE, col.names=FALSE)

$ cat a.txt
\This is a string.
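
The same thing can be checked without writing a file. A small sketch follows; the variable names are mine, and startsWith() is an assumption about the R version in use (it is only available in more recent releases, R 3.3.0 and later as far as I know).

```r
a <- "\\This is a string."

# One backslash plus the 17 characters after it.
n_chars <- nchar(a)
print(n_chars)   # 18

# cat() writes the string itself rather than its escaped form.
cat(a, "\n", sep = "")

# Four backslashes in the source become two in the pattern that the
# regex engine sees, and those two match one literal backslash.
pattern <- "^\\\\.*"
n_pat <- nchar(pattern)
print(n_pat)     # 5 characters: ^ \ \ . *

# Alternative without a regex: blank every element that starts with a backslash.
v <- c(a, "Another string", "\\more")
v[startsWith(v, "\\")] <- ""
print(v)
```

The startsWith() version avoids the second layer of escaping, since its prefix argument is a plain string and not a regular expression.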

Apparently this is not what the OP really wanted. The OP probably wanted to remove characters outside the printable ASCII set: control characters such as \023 and high-bit characters such as \xa0 and \xab.
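
If that guess is right, something along these lines might do it. This is a sketch under assumptions, not a tested answer to the OP's data: it treats "printable ASCII" as bytes 0x20 through 0x7e, and the helper name keep_printable_ascii is made up for illustration. Working on bytes sidesteps locale and encoding problems with strings like "\xa0On", at the cost of also dropping every byte of any multibyte UTF-8 character.

```r
# Hypothetical helper: keep only bytes in the printable ASCII range 0x20-0x7e.
keep_printable_ascii <- function(s) {
  b <- as.integer(charToRaw(s))
  rawToChar(as.raw(b[b >= 0x20 & b <= 0x7e]))
}

# The example strings from the question, typed as R literals (an assumption).
x <- c("\xa0On", "\023", "\xab")
cleaned <- vapply(x, keep_printable_ascii, character(1), USE.NAMES = FALSE)
print(cleaned)   # "On" "" ""

# A gsub() form of the same idea, matching byte-wise with the PCRE engine.
cleaned_gsub <- gsub("[^\\x20-\\x7e]", "", x, perl = TRUE, useBytes = TRUE)
print(cleaned_gsub)
```

Tabs and newlines fall outside 0x20-0x7e as well, so the range would need widening if those should be kept.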

Mike

--
Michael B. Miller, Ph.D.
Minnesota Center for Twin and Family Research
Department of Psychology
University of Minnesota

______________________________________________
R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Received on Sat 30 Apr 2011 - 01:50:50 GMT