Re: [Rd] extending strsplit(): supply pattern to keep, not to split by

From: Gabor Grothendieck <ggrothendieck_at_gmail.com>
Date: Tue 04 Apr 2006 - 16:01:42 GMT

The gsubfn function in the gsubfn package can do this; see the examples in ?gsubfn.
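
For comparison, here is a minimal base-R sketch of the same "give the pattern to keep" idea. It is not taken from the gsubfn documentation. It assumes an R version that provides regmatches() alongside gregexpr(), and it reuses the number.pattern and the two example strings from the quoted message below.

```r
# Pattern for a number, copied from the quoted message.
number.pattern <- "[-+]?(([0-9]+(\\.[0-9]*)?)|(\\.[0-9]+))([eE][+-]?[0-9]+)?"

# Example inputs, also from the quoted message.
x <- c("1.2, 34, 1.7e-2", "Ibuprofin 200mg")

# gregexpr() locates every match of the pattern in each string;
# regmatches() then extracts the matched substrings, one list element per input.
numbers <- regmatches(x, gregexpr(number.pattern, x))
print(numbers)
```

Under these assumptions the result should be a list of two character vectors, holding the three numbers from the first string and "200" from the second.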

On 4/4/06, Bill Dunlap <bill@insightful.com> wrote:
> strsplit() is a convenient way to get a
> list of items from a string when you
> have a regular expression for what is not
> an item. E.g.,
>
> > strsplit("1.2, 34, 1.7e-2", split="[ ,] *")
> [[1]]:
> [1] "1.2" "34" "1.7e-2"
>
> However, sometimes it is more convenient to
> give a pattern for the items you do want.
> E.g., suppose you want to pull all the numbers
> out of a string which contains a mix of numbers
> and words. Making a pattern for what a
> number is is simpler than making a pattern
> for what may come between the numbers.
>
> > number.pattern <- "[-+]?(([0-9]+(\\.[0-9]*)?)|(\\.[0-9]+))([eE][+-]?[0-9]+)?"
>
> I propose adding a keep=FALSE argument to
> strsplit() to do this. If keep is FALSE,
> then the split argument matches the stuff to
> omit from the output; if keep is TRUE then
> split matches the stuff to put into the
> output. Then we could do the following to
> get a list of all the numbers in a string
> (done in a version of strsplit() I'm working on
> for S-PLUS):
>
> > strsplit("1.2, 34, 1.7e-2", split=number.pattern,keep=TRUE)
> [[1]]:
> [1] "1.2" "34" "1.7e-2"
>
> > strsplit("Ibuprofin 200mg", split=number.pattern,keep=TRUE)
> [[1]]:
> [1] "200"
>
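
The proposed keep= behaviour can be approximated today with a small wrapper. This is only a sketch: the name strsplit_keep is hypothetical (it is not part of base R or S-PLUS), and it assumes regmatches() and gregexpr() are available. With keep=FALSE it simply defers to strsplit(); with keep=TRUE it returns the pieces that match the pattern.

```r
# Hypothetical helper illustrating the proposed keep= argument.
# Not part of base R or S-PLUS; the name and signature are assumptions.
strsplit_keep <- function(x, split, keep = FALSE, ...) {
  if (!keep) {
    # Usual behaviour: 'split' describes what to omit from the output.
    return(strsplit(x, split, ...))
  }
  # Proposed behaviour: 'split' describes what to put into the output.
  regmatches(x, gregexpr(split, x, ...))
}

# Pattern for a number, copied from the message.
number.pattern <- "[-+]?(([0-9]+(\\.[0-9]*)?)|(\\.[0-9]+))([eE][+-]?[0-9]+)?"

by_separator <- strsplit_keep("1.2, 34, 1.7e-2", split = "[ ,] *")
by_item      <- strsplit_keep("1.2, 34, 1.7e-2", split = number.pattern, keep = TRUE)
dose         <- strsplit_keep("Ibuprofin 200mg", split = number.pattern, keep = TRUE)
```

Extra arguments such as perl= or fixed= are passed through to whichever function does the work, so the two modes interpret the pattern the same way.
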
> Is this a reasonable thing to want strsplit to do?
> Is this a reasonable parameterization of it?
>
> ----------------------------------------------------------------------------
> Bill Dunlap
> Insightful Corporation
> bill at insightful dot com
> 360-428-8146
>
> "All statements in this message represent the opinions of the author and do
> not necessarily reflect Insightful Corporation policy or position."
>
> ______________________________________________
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

Received on Wed Apr 05 02:26:14 2006
