Re: [Rd] invert argument in grep

From: Gabor Grothendieck <ggrothendieck_at_gmail.com>
Date: Sun 12 Nov 2006 - 16:02:15 GMT

invert= would be consistent with the fact that egrep (-v), sed (!), vi (:v) and awk (!~) all have special facilities as indicated to handle such negation/inversion.
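
As a rough illustration of those facilities, here is a minimal shell sketch. The three-line sample input and the variable names are made up for the example; only `grep -v`, sed's `!` address negation and awk's `!~` operator are the tools' own features.

```shell
# Hypothetical sample input: three colour names, one per line.
colors='pink
darkblue
hotpink'

# grep -v prints only the lines that do NOT match the pattern.
inv_grep=$(printf '%s\n' "$colors" | grep -v 'pink')

# sed: -n turns off automatic printing; '!' after the address negates it,
# so 'p' runs only on lines that do not match /pink/.
inv_sed=$(printf '%s\n' "$colors" | sed -n '/pink/!p')

# awk: '!~' is the "does not match" operator.
inv_awk=$(printf '%s\n' "$colors" | awk '$0 !~ /pink/')

printf '%s\n' "$inv_grep" "$inv_sed" "$inv_awk"
```

Each of the three commands keeps only the line without "pink", which is the behaviour the proposed invert= argument would give grep() in R.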

On 11/12/06, Romain Francois <rfrancois@mango-solutions.com> wrote:
> Duncan Murdoch wrote:
> > On 11/10/2006 12:52 PM, Romain Francois wrote:
> >> Duncan Murdoch wrote:
> >>> On 11/9/2006 5:14 AM, Romain Francois wrote:
> >>>> Hello,
> >>>>
> >>>> What about an `invert` argument in grep, to return elements that
> >>>> are *not* matching a regular expression :
> >>>>
> >>>> R> grep("pink", colors(), invert = TRUE, value = TRUE)
> >>>>
> >>>> would essentially return the same as :
> >>>>
> >>>> R> colors() [ - grep("pink", colors()) ]
> >>>>
> >>>>
> >>>> I'm attaching the files that I modified (against today's tarball)
> >>>> for that purpose.
> >>>
> >>> I think a more generally useful change would be to be able to return
> >>> a logical vector with TRUE for a match and FALSE for a non-match, so
> >>> a simple !grep(...) does the inversion. (This is motivated by the
> >>> recent R-help discussion of the fact that x[-selection] doesn't
> >>> always invert the selection when it's a vector of indices.)
> >>>
> >>> A way to do that without expanding the argument list would be to allow
> >>>
> >>> value="logical"
> >>>
> >>> as well as value=TRUE and value=FALSE.
> >>>
> >>> This would make boolean operations easy, e.g.
> >>>
> >>> colors()[grep("dark", colors(), value="logical")
> >>> & !grep("blue", colors(), value="logical")]
> >>>
> >>> to select the colors that contain "dark" but not "blue". (In this
> >>> case the RE to select that subset is rather simple because "dark"
> >>> always precedes "blue", but if that wasn't true, it would be a lot
> >>> messier.)
> >>>
> >>> Duncan Murdoch
> >> Hi,
> >>
> >> It sounds like a nice thing to have. I would still prefer to type :
> >>
> >> R> grep ( "dark", grep("blue", colors(), value = TRUE, invert=TRUE),
> >> value = TRUE )
> >
> > That's good for intersecting two searches, but not for other boolean
> > combinations.
> >
> > My main point was that inversion isn't the only boolean operation you
> > may want, but R has perfectly good powerful boolean operators, so
> > installing a limited subset of boolean algebra into grep() is probably
> > the wrong approach.
>
> Hi,
>
> Yes, good point. I agree with you that the value = "logical" is probably
> worth having to take advantage of these logical operators.
>
> .... but, what about all these functions calling grep and passing
> arguments through the ellipsis. With this invert argument, we could do :
>
> R> history(pattern = "grid\\..*\\(", invert = TRUE)
>
> BTW, why not use ... in ls ? in case someone would like to use perl
> regex to use ls, or to get back at this thread, issue commands like :
>
> R> ls("package:grid", pattern = "^grid\\.|Grob$", invert = TRUE)
> [1] "absolute.size" "applyEdit" "applyEdits"
> [4] "arcCurvature" "arrow" "childNames"
> [7] "convertHeight" "convertNative" "convertUnit"
> [10] "convertWidth" "convertX" "convertY"
> [13] "current.transform" "current.viewport" "current.vpPath"
> [16] "current.vpTree" "dataViewport" "downViewport"
> [19] "draw.details" "drawDetails" "editDetails"
> [22] "engine.display.list" "gEdit" "gEditList"
> [25] "get.gpar" "getNames" "gList"
> [28] "gpar" "gPath" "grob"
> [31] "grobHeight" "grobName" "grobWidth"
> [34] "grobX" "grobY" "gTree"
> [37] "heightDetails" "is.unit" "layout.heights"
> [40] "layoutRegion" "layout.torture" "layout.widths"
> [43] "plotViewport" "pop.viewport" "popViewport"
> [46] "postDrawDetails" "preDrawDetails" "push.viewport"
> [49] "pushViewport" "seekViewport" "setChildren"
> [52] "stringHeight" "stringWidth" "unit"
> [55] "unit.c" "unit.length" "unit.pmax"
> [58] "unit.pmin" "unit.rep" "upViewport"
> [61] "validDetails" "viewport" "viewport.layout"
> [64] "viewport.transform" "vpList" "vpPath"
> [67] "vpStack" "vpTree" "widthDetails"
> [70] "xDetails" "yDetails"
>
> Then, what about ... in apropos ?
>
> Regards,
>
> Romain
>
>
> >>
> >>
> >> What about a way to pass more than one regular expression then be
> >> able to call :
> >>
> >> R> grep( c("dark", "blue"), colors(), value = TRUE, invert = c(TRUE,
> >> FALSE) )
> >
> > Again, it covers & and !, but it misses other boolean operators.
> >
> >> I usually use that kind of shortcuts that are easy to remember.
> >>
> >> vgrep <- function(...) grep(..., value = TRUE)
> >> igrep <- function(...) grep(..., invert = TRUE)
> >> ivgrep <- vigrep <- function(...) grep(..., invert = TRUE, value = TRUE)
> >
> > If you're willing to write these, then it's easy to write igrep
> > without an invert arg to grep:
> >
> > igrep <- function(pat, x, ...)
> >     setdiff(seq_along(x), grep(pat, x, value = FALSE, ...))
> >
> > ivgrep would also be easy, except for the weird semantics of
> > value=TRUE pointed out by Brian: but it could still be written with a
> > little bit of care.
> >
> > Duncan Murdoch
> >
> >>
> >> What about things like the arguments `after` and `before` in unix
> >> grep. That could be used when grepping inside a function :
> >>
> >> R> grep("plot\\.", body(plot.default) , value= TRUE)
> >> [1] "localWindow <- function(..., col, bg, pch, cex, lty, lwd)
> >> plot.window(...)"
> >> [2] "plot.new()"
> >> [3] "plot.xy(xy, type, ...)"
> >>
> >>
> >> when this could be useful (possibly).
> >>
> >> R> # grep("plot\\.", plot.default, after = 2, value = TRUE)
> >> R> tmp <- tempfile(); sink(tmp) ; print(body(plot.default)); sink();
> >> system( paste( "grep -A2 plot\\. ", tmp) )
> >> localWindow <- function(..., col, bg, pch, cex, lty, lwd)
> >> plot.window(...)
> >> localTitle <- function(..., col, bg, pch, cex, lty, lwd) title(...)
> >> xlabel <- if (!missing(x))
> >> --
> >> plot.new()
> >> localWindow(xlim, ylim, log, asp, ...)
> >> panel.first
> >> plot.xy(xy, type, ...)
> >> panel.last
> >> if (axes) {
> >> --
> >> if (frame.plot)
> >> localBox(...)
> >> if (ann)
> >>
> >>
> >> BTW, if I call :
> >>
> >> R> grep("plot\\.", plot.default)
> >> Error in as.character(x) : cannot coerce to vector
> >>
> >> What about adding that line at the beginning of grep, or something
> >> else to be able to do as.character on a function ?
> >>
> >> if(is.function(x)) x <- body(x)
> >>
> >>
> >> Cheers,
> >>
> >> Romain
> >>>>
> >>>> Cheers,
> >>>>
> >>>> Romain
> >>
> >>
> >
> >
>
>
> --
> *mangosolutions*
> /data analysis that delivers/
>
> Tel +44 1249 467 467
> Fax +44 1249 467 468
>
> ______________________________________________
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Received on Mon Nov 13 05:16:00 2006
