Re: [R] How to split a factor (unique identifier) into several others?

From: Tribo Laboy <tribolaboy_at_gmail.com>
Date: Thu, 7 Feb 2008 18:33:19 +0900

Hi Dimitris,

Your code works like a charm, but I don't really understand how. If you have some time, I'd appreciate it if you could explain a bit more.

The contents of "vals" in your example is equivalent to the contents of "splitfctr" in mine.

"as.data.frame" is quite clear, but "do.call("rbind", vals)" has me puzzled.

I checked the "do.call" help, but I could not replicate the results on the command line by directly using "rbind".

If I had to do it by directly using "rbind" can you show me how to do it?
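
For the archives, a minimal sketch of what do.call("rbind", vals) amounts to, assuming "vals" is a three-element list like the one in the example. do.call hands each element of the list to the function as a separate argument, so the call is the same as writing out rbind(vals[[1]], vals[[2]], vals[[3]]) by hand. Calling rbind(vals) directly passes the whole list as a single argument, which is probably why the result could not be replicated on the command line. The names "by_docall" and "by_hand" are only illustrative.

```r
# Same kind of list as "vals" / "splitfctr" in this thread
vals <- strsplit(c("sample1_condition1_place1",
                   "sample2_condition1_place1",
                   "sample3_condition1_place1"), "_")

# do.call() passes each list element to rbind() as its own argument ...
by_docall <- do.call("rbind", vals)

# ... so it is equivalent to spelling the arguments out by hand
by_hand <- rbind(vals[[1]], vals[[2]], vals[[3]])

# Compare the two results
identical(by_docall, by_hand)

# rbind(vals) treats the whole list as ONE argument and returns a
# 1 x 3 list matrix rather than a 3 x 3 character matrix
dim(rbind(vals))
```

Writing the arguments out by hand only works when the number of list elements is known in advance, which is the practical reason to prefer do.call here.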

I really appreciate your help.

In the meantime I came up with another solution, which is much more clunky than yours, but at least I can understand how it works. I am putting it here, just as an additional thing for the archives.

Once "splitfctr" (or "vals" in Dimitris' example) is obtained, I use the "unlist" function on the list and then make new factors like this:

all_fctrs <- unlist(splitfctr)
sample_fctr <- factor(all_fctrs[seq(1, length(all_fctrs), 3)])
condition_fctr <- factor(all_fctrs[seq(2, length(all_fctrs), 3)])
place_fctr <- factor(all_fctrs[seq(3, length(all_fctrs), 3)])

Then I bundle the factors into the data frame with "cbind".
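
A sketch of this approach from start to finish, assuming every identifier splits into exactly three parts (otherwise the stride of 3 no longer lines up with the components). The names "idx" and "dat" are only illustrative. One caveat: calling cbind() on the factors alone returns a matrix of integer codes, so the sketch uses data.frame() to keep them as factors; cbind() onto an existing data frame keeps them as factors too.

```r
# Example factor column, as in the original question
fctr <- factor(c("sample1_condition1_place1",
                 "sample2_condition1_place1",
                 "sample3_condition1_place1"))
splitfctr <- strsplit(as.character(fctr), "_")

# Flatten the list: components of one identifier sit next to each other
all_fctrs <- unlist(splitfctr)

# Positions of the first component of every identifier (1, 4, 7, ...)
idx <- seq(1, length(all_fctrs), by = 3)

sample_fctr    <- factor(all_fctrs[idx])
condition_fctr <- factor(all_fctrs[idx + 1])
place_fctr     <- factor(all_fctrs[idx + 2])

# Bundle the three factors into one data frame
dat <- data.frame(sample = sample_fctr,
                  condition = condition_fctr,
                  place = place_fctr)
str(dat)
```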

Thanks for the help.

TL

On Thu, Feb 7, 2008 at 5:20 PM, Dimitris Rizopoulos <dimitris.rizopoulos_at_med.kuleuven.be> wrote:
> try the following:
>
> dat <- data.frame(x = c("sample1_condition1_place1",
> "sample2_condition1_place1", "sample3_condition1_place1",
> "sample1_condition2_place1", "sample1_condition2_place1"))
>
> vals <- strsplit(as.character(dat$x), "_")
> as.data.frame(do.call("rbind", vals))
>
>
> I hope it helps.
>
> Best,
> Dimitris
>
> ----
> Dimitris Rizopoulos
> Ph.D. Student
> Biostatistical Centre
> School of Public Health
> Catholic University of Leuven
>
> Address: Kapucijnenvoer 35, Leuven, Belgium
> Tel: +32/(0)16/336899
> Fax: +32/(0)16/337015
> Web: http://med.kuleuven.be/biostat/
> http://www.student.kuleuven.be/~m0390867/dimitris.htm
>
>
>
>
> ----- Original Message -----
> From: "Tribo Laboy" <tribolaboy_at_gmail.com>
> To: <r-help_at_r-project.org>
> Sent: Thursday, February 07, 2008 7:44 AM
> Subject: [R] How to split a factor (unique identifier) into several
> others?
>
>
> > Hello,
> >
> > I have a data frame with a factor column, which uniquely identifies
> > the observations in the data frame and it looks like this:
> >
> > sample1_condition1_place1
> > sample2_condition1_place1
> > sample3_condition1_place1
> > .
> > .
> > .
> > sample3_condition3_place3
> >
> > I want to turn it into three separate factor columns "sample",
> > "condition" and "place".
> >
> > This is what I did so far:
> >
> > # generate a factor column for the example
> > fctr<- factor(c("sample1_condition1_place1",
> > "sample2_condition1_place1", "sample3_condition1_place1"))
> > splitfctr <- strsplit(as.character(fctr),"_")
> >
> >> splitfctr
> > [[1]]
> > [1] "sample1" "condition1" "place1"
> >
> > [[2]]
> > [1] "sample2" "condition1" "place1"
> >
> > [[3]]
> > [1] "sample3" "condition1" "place1"
> >
> >
> > Now this is all fine, but how do I make three separate factors of
> > this?
> > The object "splitfctr" is a list of character vectors, each
> > character
> > vector being composed of the words after splitting the long original
> > word.
> > Now I want to form new character vectors, which contain the first
> > component of each list entry, then another vector for the second
> > component, etc.
> > I don't want to use loops, unless that's the only way to do it. I
> > guess
> > I have some difficulty with understanding how R indexing works...
> >
> > ______________________________________________
> > R-help_at_r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
>
>
> Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm
>
>



Received on Thu 07 Feb 2008 - 09:36:08 GMT
