Re: [Rd] grep with fixed=TRUE and ignore.case=TRUE

From: Petr Savicky <savicky_at_cs.cas.cz>
Date: Fri, 11 May 2007 17:33:37 +0200

On Wed, May 09, 2007 at 06:41:23AM +0100, Prof Brian Ripley wrote:

> I suggest you collaborate with the person who replied that he thought this 
> was a good idea to supply patches against the R-devel sources for 
> scrutiny.

A possible solution is to use strncasecmp instead of strncmp in function fgrep_one in R-devel/src/main/character.c.
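To illustrate the idea only (this is not the code from character.c), here is a minimal sketch of a fixed-string search that switches between strncmp and strncasecmp. The name fixed_find and the igcase flag are hypothetical, the sketch handles single-byte strings only, and strncasecmp is assumed to be available from the POSIX header <strings.h>.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <strings.h>   /* strncasecmp (POSIX) */

/* Hypothetical helper, not taken from character.c.
   Returns the 0-based byte offset of the first occurrence of pat in
   target, or -1 if there is none. When igcase is nonzero, the
   comparison at each offset uses strncasecmp instead of strncmp.
   Single-byte strings only; multibyte handling is not attempted. */
static int fixed_find(const char *pat, const char *target, int igcase)
{
    assert(pat != NULL && target != NULL);

    size_t plen = strlen(pat);
    size_t tlen = strlen(target);

    if (plen == 0) return 0;       /* empty pattern matches at the start */
    if (plen > tlen) return -1;    /* pattern cannot fit */

    for (size_t i = 0; i + plen <= tlen; i++) {
        int diff = igcase ? strncasecmp(pat, target + i, plen)
                          : strncmp(pat, target + i, plen);
        if (diff == 0) return (int) i;
    }
    return -1;
}
```

With this sketch, fixed_find("d.g", "D.G cat", 1) finds a match at offset 0, while the same call with igcase set to 0 finds none. The "." is compared literally in both cases, so "dog cat" never matches.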

The correspondingly modified character.c is at

  http://www.cs.cas.cz/~savicky/ignore_case/character.c

and a diff against the original character.c (downloaded today) is at

  http://www.cs.cas.cz/~savicky/ignore_case/diff.txt

This seems to work in my installation of R-devel:

> x <- c("D.G cat", "d.g cat", "dog cat")
> z <- "d.g"
> grep(z, x, ignore.case = F, fixed = T)
  [1] 2
> grep(z, x, ignore.case = T, fixed = T) # this is the new behavior
  [1] 1 2
> grep(z, x, ignore.case = T, fixed = F)
  [1] 1 2 3
>

Since fgrep_one is called many times in character.c, adding igcase_opt as an additional argument would require extensive changes to the file. So I introduced a new function, fgrep_one_igcase, which is called only once in the file. Other solutions are possible.
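One of the other possible solutions could look roughly like the sketch below: a shared worker takes the comparison function as a pointer, and two thin wrappers keep the existing call sites unchanged. All names here (cmp_fn, find_with, find_fixed, find_fixed_igcase, find_at_call_site) are hypothetical and the signatures are not those used in character.c. As before, this assumes single-byte strings and POSIX strncasecmp.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <strings.h>   /* strncasecmp (POSIX) */

/* strncmp and strncasecmp share this signature. */
typedef int (*cmp_fn)(const char *, const char *, size_t);

/* Shared worker (hypothetical): 0-based offset of the first match,
   or -1 if there is none. */
static int find_with(cmp_fn cmp, const char *pat, const char *target)
{
    assert(cmp != NULL && pat != NULL && target != NULL);

    size_t plen = strlen(pat);
    size_t tlen = strlen(target);

    if (plen == 0) return 0;
    if (plen > tlen) return -1;

    for (size_t i = 0; i + plen <= tlen; i++)
        if (cmp(pat, target + i, plen) == 0) return (int) i;
    return -1;
}

/* Case-sensitive wrapper: existing callers keep their two-argument call. */
static int find_fixed(const char *pat, const char *target)
{
    return find_with(strncmp, pat, target);
}

/* Case-insensitive wrapper: needed only where the option is honoured. */
static int find_fixed_igcase(const char *pat, const char *target)
{
    return find_with(strncasecmp, pat, target);
}

/* The single call site that knows about the option dispatches here. */
static int find_at_call_site(const char *pat, const char *target,
                             int igcase_opt)
{
    return igcase_opt ? find_fixed_igcase(pat, target)
                      : find_fixed(pat, target);
}
```

The trade-off is the same as with a separate fgrep_one_igcase: no existing caller has to change, and the scanning loop is written only once.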

I do not understand the handling of multibyte chars well, so I did not test the function with real multibyte chars, although the code for this option is used.

The ignore.case option is not meaningful in gsub. It could be meaningful in regexpr; however, this function does not accept an ignore.case option, so I made no changes to it.

All the best, Petr.



R-devel_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Received on Fri 11 May 2007 - 15:36:58 GMT
