Re: [R] How to split a factor (unique identifier) into several others?

From: Greg Snow <Greg.Snow_at_imail.org>
Date: Thu, 7 Feb 2008 12:02:38 -0700

The essence of do.call is to call the named function (rbind in this case) with the elements of the list as its arguments.

In this case, with a list without named elements, the following:

> do.call('myfunction',mylist)

is equivalent to

> myfunction( mylist[[1]], mylist[[2]], mylist[[3]], ..., mylist[[n]] )

with the ... replaced by however many additional elements there are (you can see how it can save lots of typing).
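
As a small illustration of that equivalence, here is a sketch with a made-up three-argument function. Note that myfunction and mylist are hypothetical stand-ins for this example only, not objects from the thread. The last line shows the other case: when the list does have named elements, the names are used as argument names.

```r
# Hypothetical function and list, for illustration only
myfunction <- function(a, b, c) paste(a, b, c, sep = "-")
mylist <- list("x", "y", "z")

# do.call spreads the list elements out as separate arguments ...
via_do_call <- do.call("myfunction", mylist)

# ... which is the same as typing the call out by hand
by_hand <- myfunction(mylist[[1]], mylist[[2]], mylist[[3]])

# TRUE if both forms give the same result
identical(via_do_call, by_hand)

# Named list elements are passed as named arguments (here sep = "-")
named_call <- do.call("paste", list("x", "y", sep = "-"))
```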

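So with rbind, do.call just rbinds together the elements of the list; that is, it uses each element (the pieces split from the original strings) as a row of a new object, in this case a character matrix. The as.data.frame then turns the matrix into a data frame and converts the columns to factors (in R 4.0.0 and later this needs stringsAsFactors = TRUE, since the default there is to keep them as character).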
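
To do the same thing by calling rbind directly, something like the following sketch should work. It uses the first three strings from Dimitris' example; the column names "sample", "condition" and "place" are taken from the original question, and stringsAsFactors = TRUE is spelled out as an assumption so that the columns come out as factors regardless of the R version.

```r
dat <- data.frame(x = c("sample1_condition1_place1",
                        "sample2_condition1_place1",
                        "sample3_condition1_place1"))
vals <- strsplit(as.character(dat$x), "_")

# What do.call("rbind", vals) amounts to for a three-element list:
# each list element becomes one row of a character matrix
by_hand <- rbind(vals[[1]], vals[[2]], vals[[3]])
via_do_call <- do.call("rbind", vals)

# Convert the matrix to a data frame with factor columns.
# stringsAsFactors = TRUE is explicit because newer R versions
# keep character columns by default.
res <- as.data.frame(via_do_call, stringsAsFactors = TRUE)
names(res) <- c("sample", "condition", "place")
str(res)
```

The direct call only works when you know how many elements the list has, which is why do.call is the more convenient form for a list of arbitrary length.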

Does this help the understanding?

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow_at_imail.org
(801) 408-8111
 
 


> -----Original Message-----
> From: r-help-bounces_at_r-project.org
> [mailto:r-help-bounces_at_r-project.org] On Behalf Of Tribo Laboy
> Sent: Thursday, February 07, 2008 2:33 AM
> To: Dimitris Rizopoulos
> Cc: r-help_at_r-project.org
> Subject: Re: [R] How to split a factor (unique identifier)
> into several others?
>
> Hi Dimitris,
>
>
> Your code works like a charm, but I don't really understand
> how. If you have some time, I'd appreciate it if you could explain
> it some more.
>
> The contents of "vals" in your example is equivalent to the
> contents of "splitfctr" in mine.
>
> "as.data.frame" is quite clear, but "do.call("rbind", vals)"
> has me puzzled.
>
> I checked the "do.call" help, but I could not replicate the
> results on the command line by directly using "rbind".
>
> If I had to do it by directly using "rbind" can you show me
> how to do it?
>
>
> I really appreciate your help.
>
>
> In the meantime I came up with another solution, which is
> much more clunky than yours, but at least I can understand
> how it works. I am putting it here, just as an additional
> thing for the archives.
>
> After "splitfctr" (or "vals" in Dimitris' example) is obtained,
> I use the "unlist" function on the list and then make new
> factors like this:
>
> all_fctrs <- unlist(splitfctr)
> sample_fctr <- factor(all_fctrs[seq(1, length(all_fctrs), 3)])
> condition_fctr <- factor(all_fctrs[seq(2, length(all_fctrs), 3)])
> place_fctr <- factor(all_fctrs[seq(3, length(all_fctrs), 3)])
>
> then I bundle the factors into the data frame by "cbind".
>
>
> Thanks for the help.
>
> TL
>
>
>
> On Thu, Feb 7, 2008 at 5:20 PM, Dimitris Rizopoulos
> <dimitris.rizopoulos_at_med.kuleuven.be> wrote:
> > try the following:
> >
> > dat <- data.frame(x = c("sample1_condition1_place1",
> > "sample2_condition1_place1", "sample3_condition1_place1",
> > "sample1_condition2_place1", "sample1_condition2_place1"))
> >
> > vals <- strsplit(as.character(dat$x), "_")
> > as.data.frame(do.call("rbind", vals))
> >
> >
> > I hope it helps.
> >
> > Best,
> > Dimitris
> >
> > ----
> > Dimitris Rizopoulos
> > Ph.D. Student
> > Biostatistical Centre
> > School of Public Health
> > Catholic University of Leuven
> >
> > Address: Kapucijnenvoer 35, Leuven, Belgium
> > Tel: +32/(0)16/336899
> > Fax: +32/(0)16/337015
> > Web: http://med.kuleuven.be/biostat/
> > http://www.student.kuleuven.be/~m0390867/dimitris.htm
> >
> >
> >
> >
> > ----- Original Message -----
> > From: "Tribo Laboy" <tribolaboy_at_gmail.com>
> > To: <r-help_at_r-project.org>
> > Sent: Thursday, February 07, 2008 7:44 AM
> > Subject: [R] How to split a factor (unique identifier) into several
> > others?
> >
> >
> > > Hello,
> > >
> > > I have a data frame with a factor column, which uniquely identifies
> > > the observations in the data frame, and it looks like this:
> > >
> > > sample1_condition1_place1
> > > sample2_condition1_place1
> > > sample3_condition1_place1
> > > .
> > > .
> > > .
> > > sample3_condition3_place3
> > >
> > > I want to turn it into three separate factor columns: "sample",
> > > "condition" and "place".
> > >
> > > This is what I did so far:
> > >
> > > # generate a factor column for the example
> > > fctr <- factor(c("sample1_condition1_place1",
> > >   "sample2_condition1_place1", "sample3_condition1_place1"))
> > > splitfctr <- strsplit(as.character(fctr), "_")
> > > splitfctr
> > > [[1]]
> > > [1] "sample1" "condition1" "place1"
> > >
> > > [[2]]
> > > [1] "sample2" "condition1" "place1"
> > >
> > > [[3]]
> > > [1] "sample3" "condition1" "place1"
> > >
> > >
> > > Now this is all fine, but how do I make three separate factors of
> > > this?
> > > The object "splitfctr" is a list of character vectors, each
> > > character vector being composed of the words after splitting the
> > > long original word.
> > > Now I want to form new character vectors, which contain the first
> > > component of each list entry, then another vector for the second
> > > component, etc.
> > > I don't want to use loops, unless that's the only way to do it. I
> > > guess I have some difficulty with understanding how R indexing
> > > works...
> > >
> > > ______________________________________________
> > > R-help_at_r-project.org mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide
> > > http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained,
> reproducible code.
> > >
> >
> >
> > Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm
> >
> >
>
Received on Thu 07 Feb 2008 - 19:12:07 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Fri 08 Feb 2008 - 15:30:12 GMT.

Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-help. Please read the posting guide before posting to the list.
