Re: [R] split character vector by multiple keywords simultaneously

From: Andrew Robinson <A.Robinson_at_ms.unimelb.edu.au>
Date: Thu, 05 May 2011 12:22:16 +1000

A hack would be to use gsub() to prepend a marker, e.g. XXX, to each of the keywords that you want, then strsplit() on that marker to break each line into its component strings, and finally substr() to extract the pieces that you want from those strings.
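A minimal sketch of that idea, using the example data from the original post below. The helper extract_field(), the object names, and the choice of NA for a missing keyword are illustrative assumptions rather than tested production code, and the sketch assumes the marker XXX never occurs in the data and that the keywords contain no regular-expression metacharacters.

```r
temp <- c(
  "Company name: The first company General Manager: John Doe I Managers: John Doe II, John Doe III",
  "Company name: The second company General Manager: Jane Doe I",
  "Company name: The third company Managers: Jane Doe II, Jane Doe III"
)
keywords <- c("Company name:", "General Manager:", "Managers:")

# Step 1: prepend the marker XXX to every keyword with gsub().
pattern <- paste("(", paste(keywords, collapse = "|"), ")", sep = "")
marked <- gsub(pattern, "XXX\\1", temp)

# Step 2: split each line on the marker; every piece now starts with a keyword
# (apart from an empty leading piece, which matches no keyword).
pieces_list <- strsplit(marked, "XXX", fixed = TRUE)

# Step 3: for one line's pieces, find the piece that starts with `key` and use
# substr() to keep what follows it. Hypothetical helper; returns NA if absent.
extract_field <- function(pieces, key) {
  hit <- pieces[substr(pieces, 1, nchar(key)) == key]
  if (length(hit) == 0) return(NA_character_)
  value <- substr(hit[1], nchar(key) + 1, nchar(hit[1]))
  gsub("^\\s+|\\s+$", "", value)  # strip leading/trailing whitespace
}

# One column per keyword, one row per input line.
cols <- lapply(keywords, function(key) {
  vapply(pieces_list, extract_field, character(1), key = key)
})
names(cols) <- sub(":$", "", keywords)
result <- data.frame(cols, stringsAsFactors = FALSE, check.names = FALSE)
```

Here result has the columns "Company name", "General Manager" and "Managers", with NA wherever a line lacks that keyword. If one keyword were a substring of another, the alternation in pattern would need more care.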

Cheers

Andrew

On Wed, May 04, 2011 at 04:08:40PM -0700, sunny wrote:
> Hi. I have a character vector that looks like this:
>
> > temp <- c("Company name: The first company General Manager: John Doe I Managers: John Doe II, John Doe III",
> +           "Company name: The second company General Manager: Jane Doe I",
> +           "Company name: The third company Managers: Jane Doe II, Jane Doe III")
> > temp
> [1] "Company name: The first company General Manager: John Doe I Managers: John Doe II, John Doe III"
> [2] "Company name: The second company General Manager: Jane Doe I"
> [3] "Company name: The third company Managers: Jane Doe II, Jane Doe III"
>
> I know all the keywords, i.e. "Company name:", "General Manager:",
> "Managers:" etc. I'm looking for a way to split this character vector into
> multiple character vectors, with one column for each keyword and the
> corresponding values for each, i.e.
>
>   Company name        General Manager  Managers
> 1 The first company   John Doe I       John Doe II, John Doe III
> 2 The second company  Jane Doe I
> 3 The third company                    Jane Doe II, Jane Doe III
>
> I have tried a lot to find something suitable but haven't so far. Any help
> will be greatly appreciated. I am running R-2.12.1 on x86_64 linux.
>
> Thanks.
>
> --
> View this message in context: http://r.789695.n4.nabble.com/split-character-vector-by-multiple-keywords-simultaneously-tp3497033p3497033.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help_at_r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Andrew Robinson  
Program Manager, ACERA 
Department of Mathematics and Statistics            Tel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia               (prefer email)
http://www.ms.unimelb.edu.au/~andrewpr              Fax: +61-3-8344-4599
http://www.acera.unimelb.edu.au/

Forest Analytics with R (Springer, 2011) 
http://www.ms.unimelb.edu.au/FAwR/
Introduction to Scientific Programming and Simulation using R (CRC, 2009): 
http://www.ms.unimelb.edu.au/spuRs/

Received on Thu 05 May 2011 - 06:25:13 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Sun 08 May 2011 - 12:30:05 GMT.
