Re: [R] strsplit, keeping delimiters

From: hadley wickham <h.wickham_at_gmail.com>
Date: Sat, 14 Jun 2008 10:46:10 -0500

On Sat, Jun 14, 2008 at 10:20 AM, Martin Morgan <mtmorgan_at_fhcrc.org> wrote:
> "hadley wickham" <h.wickham@gmail.com> writes:
>

>> On Sat, Jun 14, 2008 at 12:55 AM, Gabor Grothendieck
>> <ggrothendieck_at_gmail.com> wrote:
>>> Try this:
>>>
>>>> library(gsubfn)
>>>> x <- "A: 123 B: 456 C: 678"
>>>> strapply(x, "[^ :]+[ :]|[^ :]+$")
>>> [[1]]
>>> [1] "A:"   "123 " "B:"   "456 " "C:"   "678"
>

> Also
>
>> strsplit(x, "(?<=[0-9:] )", perl=TRUE)

> [[1]]
> [1] "A: " "123 " "B: " "456 " "C: " "678"
>

> which uses Perl's zero-length look-behind to match "" preceded by a
> digit or : and then a space. This is not quite what you asked for

My real example is actually a little more complicated

x <- "AC: 123 BDEF: 456 CADSDFSDFSF: 6sdf:78"

so the look-ahead approach doesn't work (and neither does a look-behind because it has to be fixed length).

>> I'd like to get
>
>> c("A:", "123 ", "B: ", "456 ", "C: ", "678")
>

> (no space after A:) or what Gabor offered (no spaces after :) but maybe
> what you intended?

Either way is fine, since I'll be stripping off the spaces later anyway.
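
One possible way to handle the longer example without any look-around is to match the tokens instead of splitting on them. This is only a sketch: it assumes that tokens never contain a space, and that the R in use is recent enough for base regmatches() and trimws() to be available. The names tokens and stripped are made up for illustration.

```r
# Sketch: treat each run of non-space characters, plus at most one
# trailing space, as a token. Assumes tokens never contain spaces.
x <- "AC: 123 BDEF: 456 CADSDFSDFSF: 6sdf:78"

# gregexpr() locates every match of the pattern;
# regmatches() extracts the matched substrings.
tokens <- regmatches(x, gregexpr("[^ ]+ ?", x))[[1]]
print(tokens)

# Strip the leftover whitespace afterwards.
stripped <- trimws(tokens)
print(stripped)
```

Because the pattern describes the tokens and not the gaps between them, the labels before each colon can be any length.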

Hadley

-- 
http://had.co.nz/

______________________________________________
R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Received on Sat 14 Jun 2008 - 16:13:34 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Sat 14 Jun 2008 - 17:30:41 GMT.
