Re: [Rd] grep and PCRE fun from Simon Urbanek on 2011-10-01 (R devel archive)

From: Simon Urbanek <simon.urbanek_at_r-project.org>
Date: Fri, 30 Sep 2011 11:04:21 -0400

Jeff,

This is really a bug in PCRE: the length passed (0) is a multiple of 3, as documented, so PCRE should not be writing anything. Anyway, this has now been fixed (by Brian).

Cheers,
Simon
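
For readers following the thread, here is a minimal sketch of the calling convention under discussion: the caller passes a vector and its length, the length must be a multiple of 3, and with a length of 0 the callee should write nothing. The function toy_exec below is a hypothetical stand-in written for illustration only; it is not the PCRE API, it only finds a literal substring, and its return values are loosely modeled on the "negative means no match" convention that the quoted do_grep code relies on.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a pcre_exec-style matcher; this is NOT PCRE.
 * It searches subject for the literal needle and reports the match
 * offsets through ovector, touching at most ovecsize ints. */
int toy_exec(const char *needle, const char *subject,
             int *ovector, int ovecsize)
{
    /* The contract discussed in the thread: a non-negative multiple of 3. */
    assert(ovecsize >= 0 && ovecsize % 3 == 0);

    const char *hit = strstr(subject, needle);
    if (hit == NULL)
        return -1;                  /* no match */
    if (ovecsize == 0)
        return 0;                   /* matched, but nothing may be written */

    ovector[0] = (int)(hit - subject);              /* start offset */
    ovector[1] = ovector[0] + (int)strlen(needle);  /* end offset */
    return 1;                       /* one offset pair stored */
}
```

Under these assumptions, calling toy_exec with a three-element array fills in the first two slots, while calling it with a length of 0 leaves the caller's memory untouched, which is the behavior the documentation leads one to expect.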

On Sep 29, 2011, at 5:00 PM, Jeffrey Horner wrote:

> Hello,
>
> I think I've found a bug in the C function do_grep located in
> src/main/grep.c. It seems to affect both the latest revisions of
> R-2-13-branch and trunk when compiling R without optimizations and
> with its own version of pcre located in src/extra, at least on Ubuntu
> 10.04.
>
> According to the pcre_exec API (I presume the later versions), the
> ovecsize argument must be a multiple of 3, and the ovector argument
> must point to a location that can hold at least ovecsize integers. All
> the pcre_exec calls made by do_grep, save one, honor this. That one
> call seems to overwrite areas of the stack it shouldn't. Here's the
> smallest example I found that tickles the bug:
>
> > grep("[^[:blank][:cntrl]]","\\n",perl=TRUE)
> Error in grep("[^[:blank][:cntrl]]", "\\n", perl = TRUE) :
> negative length vectors are not allowed
>
> As described above, this error occurs on Ubuntu 10.04 when R is
> compiled without optimizations (I typically use CFLAGS="-ggdb"
> CXXFLAGS="-ggdb" FFLAGS="-ggdb" ./configure --enable-R-shlib), and the
> pcre_exec call executed from do_grep overwrites the integer nmatches
> and sets it to -1. This makes do_grep try to allocate a results
> vector of length -1, which of course causes the error message
> above.
>
> I'd be interested to know if this bug happens on other platforms.
>
> Below is my simple fix for R-2-13-branch (a similar fix works for
> trunk as well).
>
> Jeff
>
> $ svn diff main/grep.c
> Index: main/grep.c
> ===================================================================
> --- main/grep.c (revision 57110)
> +++ main/grep.c (working copy)
> @@ -723,7 +723,7 @@
> {
> SEXP pat, text, ind, ans;
> regex_t reg;
> - int i, j, n, nmatches = 0, ov, rc;
> + int i, j, n, nmatches = 0, ov[3], rc;
> int igcase_opt, value_opt, perl_opt, fixed_opt, useBytes, invert;
> const char *spat = NULL;
> pcre *re_pcre = NULL /* -Wall */;
> @@ -882,7 +882,7 @@
> if (fixed_opt)
> LOGICAL(ind)[i] = fgrep_one(spat, s, useBytes, use_UTF8, NULL) >= 0;
> else if (perl_opt) {
> - if (pcre_exec(re_pcre, re_pe, s, strlen(s), 0, 0, &ov, 0) >= 0)
> + if (pcre_exec(re_pcre, re_pe, s, strlen(s), 0, 0, ov, 3) >= 0)
> INTEGER(ind)[i] = 1;
> } else {
> if (!use_WC)
>
> ______________________________________________
> R-devel_at_r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>
>
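
The failure mode in the quoted report (a match counter clobbered to -1, then used as a vector length) can be pictured with a small sketch. The names alloc_int_vector and match_indices are hypothetical and do not come from grep.c; the sketch only assumes that the final step turns per-element match flags into a vector of 1-based indices, and that a negative length is refused in the spirit of the "negative length vectors are not allowed" error.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: allocate room for n ints, refusing a negative
 * length instead of passing it on to the allocator. */
int *alloc_int_vector(int n)
{
    if (n < 0)
        return NULL;                /* a clobbered count such as -1 */
    /* Allocate at least one element so that n == 0 still succeeds. */
    return malloc((size_t)(n > 0 ? n : 1) * sizeof(int));
}

/* Hypothetical helper: build the 1-based indices of the flagged elements.
 * The count is derived from the flags themselves, so it cannot be
 * negative even if some other counter was overwritten earlier. */
int *match_indices(const int *ind, int n, int *nmatches)
{
    assert(ind != NULL && nmatches != NULL);

    int count = 0;
    for (int i = 0; i < n; i++)
        if (ind[i])
            count++;

    int *ans = alloc_int_vector(count);
    if (ans == NULL)
        return NULL;                /* allocation failed */

    for (int i = 0, j = 0; i < n; i++)
        if (ind[i])
            ans[j++] = i + 1;       /* 1-based, as grep() reports */

    *nmatches = count;
    return ans;                     /* caller frees with free() */
}
```

This does not replace the fix in the diff above, which addresses the overwrite itself by giving pcre_exec a properly sized vector; it only shows why a count of -1 reaching the allocation step produces an error rather than a result.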


Received on Fri 30 Sep 2011 - 15:06:53 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Fri 30 Sep 2011 - 16:20:37 GMT.