Re: [R] Regex for Special Characters under Grep

From: Prof Brian Ripley <ripley_at_stats.ox.ac.uk>
Date: Fri, 13 Jun 2008 07:29:58 +0100 (BST)

On Thu, 12 Jun 2008, Henrik Bengtsson wrote:

> A regular set is given by "[<set>]". The complementary set is given
> by "[^<set>]" where <set> is a set of symbols. I don't think you have
> to escape symbols in <set> (but I might be wrong).

This is covered in ?regexp. The metacharacters in character classes (the official name for your 'regular set') are ^]-\.
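
As a small illustration of the positional rules described in ?regexp, the sketch below puts ']', '^' and '-' into one character class without any backslashes: ']' goes first, '^' anywhere but first, and '-' last. The vector `x` is made up for the example.

```r
# Hypothetical test strings; only the first three contain a class metacharacter.
x <- c("a]b", "a^b", "a-b", "abc")

# ']' first, '^' not first, '-' last: all three are taken literally.
has_special <- grepl("[]^-]", x)
print(has_special)
```

Only the first three elements should be flagged, since "abc" contains none of the three characters.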

> In any case, this does what you want:
>
>> lines <- c("abc", "!abc", "#abc", "^abc", " #abc")
>> pattern <- "^[^!#^]";
>> grep(pattern, lines, value=TRUE)
> [1] "abc" " #abc"
>
> /Henrik
>
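
One difference worth noting, as a hedged aside: the negated class "^[^!#^]" needs at least one character to match, so empty lines are dropped along with the header lines. Negating a positive match keeps them. The vector `x` below is invented for illustration.

```r
# Hypothetical input with an empty line in the second position.
x <- c("abc", "", "!abc", "#abc", "^abc")

# Negated class: requires a first character, so "" does not match.
by_negated_class <- grep("^[^!#^]", x, value = TRUE)

# Positive match, then negate: "" is kept because it has no header character.
by_negated_match <- x[!grepl("^[!#^]", x)]

print(by_negated_class)
print(by_negated_match)
```

Which behaviour is wanted depends on whether blank lines in the input file carry meaning.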
>
> On Thu, Jun 12, 2008 at 8:06 PM, Marc Schwartz
> <marc_schwartz_at_comcast.net> wrote:
>> on 06/12/2008 08:42 PM Gundala Viswanath wrote:
>>>
>>> Hi all,
>>>
>>> I am trying to capture lines of a file that DO NOT
>>> start with the following header: !, #, ^
>>>
>>> But somehow my regex used under grep doesn't
>>> work.
>>>
>>> Please advise what's wrong with my code below.
>>>
>>> __BEGIN__
>>> in_fname <- paste("mydata.txt", ".soft", sep="")
>>> data_for_R <- paste("data_for_R/", args[3], ".softR", sep="")
>>>
>>> # my regex construction
>>> cat(temp[-grep("^[\^\!\#]",temp,perl=TRUE)], file=data_for_R, sep="\n")
>>>
>>>
>>> dat <- read.table(data_for_R)
>>> ___END__
>>>
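
For reference, the filtering step can also be done without the intermediate file. The following is a minimal sketch, assuming the header-free lines are whitespace-separated columns; `raw_lines` stands in for the result of readLines() on the input file and its contents are invented for this example.

```r
# Stand-in for readLines(in_fname); the contents are hypothetical.
raw_lines <- c("!Series_title x", "#ID = probe", "^SAMPLE = s1",
               "1 2.5", "2 3.5")

# Keep only lines that do not start with '!', '#' or '^'.
kept <- raw_lines[!grepl("^[!#^]", raw_lines)]

# read.table() can parse the character vector directly via 'text'.
dat <- read.table(text = kept)
print(dat)
```

Writing the filtered lines to a file first, as in the original code, works just as well; the 'text' argument only saves the round trip through disk.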
>>
>> You need to double the escape character when it is used to escape
>> meta-characters in a regex, because R's string parser consumes one
>> backslash before the regex engine sees the pattern. Note also that the
>> only meta-character in your sequence is the caret ('^').
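
To see why the doubling is needed, it may help to look at the string R actually hands to the regex engine. A minimal sketch (the variable names and the vector `x` are made up here):

```r
# "\\^" in R source is two characters: a backslash and a caret.
pat <- "^[!#\\^]"
cat(pat, "\n")          # shows the pattern as the regex engine receives it
n_chars <- nchar("\\^")

# Inside a character class the caret is only special in first position,
# so the unescaped form selects the same lines here.
x <- c("! a", "# b", "^ c", "d")
escaped   <- grepl("^[!#\\^]", x)
unescaped <- grepl("^[!#^]", x)
```

Both patterns flag the first three elements of `x` and leave the fourth alone.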
>>
>> Lines <- c("! Not This Line", "# Not This Line", "^ Not This Line",
>> "This Line")
>>
>>> Lines
>> [1] "! Not This Line" "# Not This Line" "^ Not This Line"
>> [4] "This Line"
>>
>>> grep("^[!#\\^]", Lines)
>> [1] 1 2 3
>>
>>> Lines[-grep("^[!#\\^]", Lines)]
>> [1] "This Line"
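
A caveat with the -grep(...) idiom, offered as a sketch: when no element matches, grep() returns integer(0), and indexing with -integer(0) selects nothing rather than everything. Using !grepl(), or grep() with invert = TRUE, avoids this. The vector `clean` is hypothetical.

```r
# Hypothetical input in which no line starts with '!', '#' or '^'.
clean <- c("This Line", "That Line")

dropped_all <- clean[-grep("^[!#^]", clean)]                      # character(0)
kept_grepl  <- clean[!grepl("^[!#^]", clean)]                     # both lines
kept_invert <- grep("^[!#^]", clean, value = TRUE, invert = TRUE) # both lines
```

The logical form is usually the safer default when the input may contain no header lines at all.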
>>
>>
>> HTH,
>>
>> Marc Schwartz
>>
>> ______________________________________________
>> R-help_at_r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>

-- 
Brian D. Ripley,                  ripley_at_stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

Received on Fri 13 Jun 2008 - 08:25:33 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Fri 13 Jun 2008 - 09:31:07 GMT.

