Re: [R] Creating a dataframe from a vector of character strings

From: Brian Diggs <diggsb_at_ohsu.edu>
Date: Thu, 14 Apr 2011 15:55:58 -0700

On 4/14/2011 2:04 PM, Cliff Clive wrote:
> I have a vector of character strings that I would like to split in two, and
> place in columns of a dataframe.
>
> So for example, I start with this:
>
> beatles <- c("John Lennon", "Paul McCartney", "George Harrison", "Ringo Starr")
>
> and I want to end up with a data frame that looks like this:
>
>> Beatles = data.frame(firstName=c("John", "Paul", "George", "Ringo"),
> lastName=c("Lennon", "McCartney", "Harrison",
> "Starr"))
>> Beatles
>   firstName  lastName
> 1      John    Lennon
> 2      Paul McCartney
> 3    George  Harrison
> 4     Ringo     Starr
>
>
> I tried string-splitting the first vector on the spaces between first and
> last names, and it returned a list:
>
>> strsplit(beatles, " ")
> [[1]]
> [1] "John" "Lennon"
>
> [[2]]
> [1] "Paul" "McCartney"
>
> [[3]]
> [1] "George" "Harrison"
>
> [[4]]
> [1] "Ringo" "Starr"
>
>
> Is there a fast way to convert this list into a data frame? Right now all I
> can think of is using a for loop, which I would like to avoid, since the
> real application I am working on involves a much larger dataset.

Another approach, in addition to the ones you have already been given, is to use the colsplit function in the reshape package. This is the sort of thing it is designed to do.

library("reshape")
colsplit(beatles, " ", names=c("firstName", "lastName"))

Similar caveats apply, though: it assumes each string contains exactly two names separated by a single space, and it will give a warning if that is not the case.
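
If some entries might not fit that pattern, one possible workaround is to split on the first run of whitespace only, in base R. The following is just a minimal sketch under that assumption: split_names is a hypothetical helper name (it is not part of reshape or of base R), and treating everything after the first word as the last name is a choice that may not suit every dataset.

```r
# Hypothetical helper: split each string at the FIRST run of whitespace.
# Everything before it becomes firstName, everything after it lastName.
split_names <- function(x) {
  # Drop the first whitespace run and everything that follows it.
  first <- sub("[[:space:]]+.*$", "", x)
  # Drop the leading word and the whitespace that follows it.
  last <- sub("^[^[:space:]]+[[:space:]]*", "", x)
  data.frame(firstName = first, lastName = last, stringsAsFactors = FALSE)
}

beatles <- c("John Lennon", "Paul McCartney", "George Harrison", "Ringo Starr")
Beatles <- split_names(beatles)
Beatles
```

Both sub() calls are vectorized, so no for loop is needed, and a string with no space at all simply ends up with an empty lastName rather than a warning.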

-- 
Brian S. Diggs, PhD
Senior Research Associate, Department of Surgery
Oregon Health & Science University

Received on Thu 14 Apr 2011 - 23:02:03 GMT