Re: [R] Parsing regular expressions differently - feature request

From: Wacek Kusnierczyk <Waclaw.Marcin.Kusnierczyk_at_idi.ntnu.no>
Date: Sat, 08 Nov 2008 21:16:37 +0100

Duncan Murdoch wrote:

> On 08/11/2008 11:03 AM, Gabor Grothendieck wrote:
>> On Sat, Nov 8, 2008 at 9:41 AM, Duncan Murdoch <murdoch_at_stats.uwo.ca>
>> wrote:
>>> On 08/11/2008 7:20 AM, John Wiedenhoeft wrote:

>>>> Hi there,
>>>>
>>>> I rejoiced when I realized that you can use Perl regex from within R.
>>>> However, as the FAQ states "Some functions, particularly those involving
>>>> regular expression matching, themselves use metacharacters, which may
>>>> need to be escaped by the backslash mechanism. In those cases you may
>>>> need a quadruple backslash to represent a single literal one."
>>>>
>>>> I was wondering if that is really necessary for perl=TRUE? wouldn't it
>>>> be possible to parse a string differently in a regex context, e.g.
>>>> automatically insert \\ for each \ , such that you can use the perl
>>>> syntax directly? For example, if you want to input a newline as a
>>>> character, you would use \n anyway. At the moment one says \\n to make
>>>> it clear to R that you mean \n to make clear that you mean newline...
>>>> this is pretty annoying. How likely is it that you want to pass a real
>>>> newline character to PCRE directly?
>>>
>>> No, that's not possible.  At the level where the parsing takes place R
>>> has no idea of its eventual use, so it can't tell that some strings are
>>> going to be interpreted as Perl, and others not.

Here's a quick hack to achieve the impossible:

mygrep = function(pattern, text, perl=FALSE, ...) {
   # with perl=TRUE, double every backslash so each one is matched literally
   if (perl) pattern = gsub("\\\\", "\\\\\\\\", pattern)
   grep(pattern, text, perl=perl, ...)
}

(text = "lemme \\ it")
# [1] "lemme \\ it"

nchar(text)
# [1] 10

(pattern = "\\")
# [1] "\\"
nchar(pattern)
# [1] 1

grep(pattern, text, perl=TRUE)
# can't go, impossible!

mygrep(pattern, text, perl=TRUE, value=TRUE)
# [1] "lemme \\ it"
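
For readers of the archive, the two layers of escaping discussed in this thread can be seen separately. This is only a minimal sketch using base R; the variable names (p, has_newline, and so on) are made up for illustration and are not from the original posts. The parser consumes one level of backslashes when it reads the string literal, and the regex engine consumes a second level when it interprets the pattern.

```r
# Layer 1: the R parser turns the source text "\\n" into two characters.
p = "\\n"
nchar(p)
# [1] 2

# Layer 2: the regex engine reads backslash-n as "match a newline".
has_newline = grepl(p, "line one\nline two", perl=TRUE)
has_newline
# [1] TRUE

# A literal backslash must be escaped at both layers: four in the source.
has_backslash = grepl("\\\\", "lemme \\ it", perl=TRUE)
has_backslash
# [1] TRUE

# fixed=TRUE skips the regex layer, so only the parser's escaping remains.
has_backslash_fixed = grepl("\\", "lemme \\ it", fixed=TRUE)
has_backslash_fixed
# [1] TRUE
```

Note that mygrep above works on the string after the parser has already processed it, so it changes only what the regex engine sees; what has to be typed in the source is unaffected.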

vQ



R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Received on Sat 08 Nov 2008 - 20:25:00 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Sat 08 Nov 2008 - 22:30:24 GMT.
