Re: [R] How to split a factor (unique identifier) into several others?

From: Tribo Laboy <tribolaboy_at_gmail.com>
Date: Fri, 8 Feb 2008 14:33:58 +0000

Hi Greg,

The short example you gave cleared it up. I still have some issues with getting used to R indexing. I was desperately trying to do:

> zzz <- rbind(fctrs_list[1], fctrs_list[2])

and was getting:

> zzz

     [,1]
[1,] Character,3
[2,] Character,3

instead of the result I wanted:

> zzz <- rbind(fctrs_list[[1]], fctrs_list[[2]])
> zzz

     [,1] [,2] [,3]
[1,] "Sample1" "condition1" "place1"
[2,] "Sample1" "condition1" "place2"
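
The difference between the two calls above comes down to `[` versus `[[` on a list. A minimal sketch, assuming a stand-in `fctrs_list` built here for illustration (the real object is not shown in the thread):

```r
# Hypothetical stand-in for fctrs_list: a list of character vectors
fctrs_list <- list(c("sample1", "condition1", "place1"),
                   c("sample1", "condition1", "place2"))

sub_list <- fctrs_list[1]    # single brackets keep the list wrapper (a list of length 1)
element  <- fctrs_list[[1]]  # double brackets extract the character vector itself

class(sub_list)  # "list"
class(element)   # "character"

# Binding one-element lists gives a 2 x 1 matrix whose cells are lists
bad <- rbind(fctrs_list[1], fctrs_list[2])
dim(bad)

# Binding the extracted vectors gives a 2 x 3 character matrix
zzz <- rbind(fctrs_list[[1]], fctrs_list[[2]])
dim(zzz)
```

With single brackets each argument to rbind is still a list, so each row ends up holding one list cell rather than three strings.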

Thanks for the help, both to you and to Dimitris.

Regards,
TL

On Thu, Feb 7, 2008 at 7:02 PM, Greg Snow <Greg.Snow_at_imail.org> wrote:
> The essence of do.call is to call the named function (rbind in this
> case) with the elements of the list as its arguments.
>
> In this case with a list without named elements the following:
>
> > do.call('myfunction',mylist)
>
> Is equivalent to
>
> > myfunction( mylist[[1]], mylist[[2]], mylist[[3]], ..., mylist[[n]] )
>
> With the ... replaced by however many additional elements there are (you
> can see how it can save lots of typing).
>
> So using rbind, it just rbinds together the elements of the list, or
> uses each element (the split from the original strings) as a row of a
> new object, in this case a matrix. The as.data.frame then converts the
> columns to factors.
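
The equivalence described above can be checked directly. This is a small sketch with a made-up three-element `mylist`; the variable names are illustrative and not from the thread:

```r
# Illustrative unnamed list of character vectors
mylist <- list(c("a", "b", "c"),
               c("d", "e", "f"),
               c("g", "h", "i"))

# do.call passes each list element as a separate argument to rbind
via_do_call <- do.call("rbind", mylist)

# The same call written out by hand
by_hand <- rbind(mylist[[1]], mylist[[2]], mylist[[3]])

identical(via_do_call, by_hand)

# Named list elements are passed as named arguments
joined <- do.call("paste", list("a", "b", sep = "-"))
joined
```

The last call shows why the "without named elements" caveat matters: a name on a list element is matched to the argument of the same name in the called function.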
>
> Does this help the understanding?
>
> --
> Gregory (Greg) L. Snow Ph.D.
> Statistical Data Center
> Intermountain Healthcare
> greg.snow_at_imail.org
> (801) 408-8111
>
>
>
>
>
> > -----Original Message-----
> > From: r-help-bounces_at_r-project.org
> > [mailto:r-help-bounces_at_r-project.org] On Behalf Of Tribo Laboy
> > Sent: Thursday, February 07, 2008 2:33 AM
> > To: Dimitris Rizopoulos
> > Cc: r-help_at_r-project.org
> > Subject: Re: [R] How to split a factor (unique identifier)
> > into several others?
> >
> > Hi Dimitris,
> >
> >
> > Your code works like a charm, but I don't really understand
> > how. If you have some time, I'd appreciate it if you could explain
> > some more.
> >
> > The contents of "vals" in your example is equivalent to the
> > contents of "splitfctr" in mine.
> >
> > "as.data.frame" is quite clear, but "do.call("rbind", vals)"
> > has me puzzled.
> >
> > I checked the "do.call" help, but I could not replicate the
> > results on the command line by directly using "rbind".
> >
> > If I had to do it by directly using "rbind" can you show me
> > how to do it?
> >
> >
> > I really appreciate your help.
> >
> >
> > In the meantime I came up with another solution, which is
> > much more clunky than yours, but at least I can understand
> > how it works. I am putting it here, just as an additional
> > thing for the archives.
> >
> > After "splitfctr" (or "vals" in Dimitris' example) is obtained,
> > I use the "unlist" function on the list and then make new
> > factors like this:
> >
> > all_fctrs <- unlist(splitfctr)
> > sample_fctr <- factor(all_fctrs[seq(1, length(all_fctrs), 3)])
> > condition_fctr <- factor(all_fctrs[seq(2, length(all_fctrs), 3)])
> > place_fctr <- factor(all_fctrs[seq(3, length(all_fctrs), 3)])
> >
> > then I bundle the factors into the data frame by "cbind".
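
A runnable sketch of this unlist-and-stride approach, assuming every identifier splits into exactly three parts. It bundles the result with data.frame() rather than cbind(), because cbind() on plain factors returns a matrix of integer codes instead of factor columns. The names `n_parts`, `idx` and `result` are illustrative:

```r
# Rebuild the split list from the three example identifiers
splitfctr <- strsplit(c("sample1_condition1_place1",
                        "sample2_condition1_place1",
                        "sample3_condition1_place1"), "_")

# Flatten to one character vector: sample, condition, place, sample, ...
all_fctrs <- unlist(splitfctr)

# Assumption: each identifier has exactly n_parts components
n_parts <- 3
idx <- seq(1, length(all_fctrs), by = n_parts)  # positions of the "sample" parts

# data.frame() keeps each column as a factor
result <- data.frame(sample    = factor(all_fctrs[idx]),
                     condition = factor(all_fctrs[idx + 1]),
                     place     = factor(all_fctrs[idx + 2]))
str(result)
```

If an identifier ever had a different number of parts, the stride would drift and the columns would be misaligned, so it is worth checking lengths(splitfctr) first.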
> >
> >
> > Thanks for the help.
> >
> > TL
> >
> >
> >
> > On Thu, Feb 7, 2008 at 5:20 PM, Dimitris Rizopoulos
> > <dimitris.rizopoulos_at_med.kuleuven.be> wrote:
> > > try the following:
> > >
> > > dat <- data.frame(x = c("sample1_condition1_place1",
> > > "sample2_condition1_place1", "sample3_condition1_place1",
> > > "sample1_condition2_place1", "sample1_condition2_place1"))
> > >
> > > vals <- strsplit(as.character(dat$x), "_")
> > > as.data.frame(do.call("rbind", vals))
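
A slightly extended sketch of the same two lines, with two additions that are not part of the original suggestion: explicit column names, and `stringsAsFactors = TRUE`. Since R 4.0.0 the default for that argument is FALSE, so on a current R the columns would otherwise come back as character rather than factor:

```r
dat <- data.frame(x = c("sample1_condition1_place1",
                        "sample2_condition1_place1",
                        "sample3_condition1_place1",
                        "sample1_condition2_place1",
                        "sample1_condition2_place1"))

# Split each identifier on "_" into a list of character vectors
vals <- strsplit(as.character(dat$x), "_")

# Stack the vectors as rows, then convert to a data frame of factors
out <- as.data.frame(do.call("rbind", vals), stringsAsFactors = TRUE)

# Column names are an illustrative choice; the default names are V1, V2, V3
names(out) <- c("sample", "condition", "place")
str(out)
```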
> > >
> > >
> > > I hope it helps.
> > >
> > > Best,
> > > Dimitris
> > >
> > > ----
> > > Dimitris Rizopoulos
> > > Ph.D. Student
> > > Biostatistical Centre
> > > School of Public Health
> > > Catholic University of Leuven
> > >
> > > Address: Kapucijnenvoer 35, Leuven, Belgium
> > > Tel: +32/(0)16/336899
> > > Fax: +32/(0)16/337015
> > > Web: http://med.kuleuven.be/biostat/
> > > http://www.student.kuleuven.be/~m0390867/dimitris.htm
> > >
> > >
> > >
> > >
> > > ----- Original Message -----
> > > From: "Tribo Laboy" <tribolaboy_at_gmail.com>
> > > To: <r-help_at_r-project.org>
> > > Sent: Thursday, February 07, 2008 7:44 AM
> > > Subject: [R] How to split a factor (unique identifier)
> > > into several others?
> > >
> > >
> > > > Hello,
> > > >
> > > > I have a data frame with a factor column, which uniquely identifies
> > > > the observations in the data frame and it looks like this:
> > > >
> > > > sample1_condition1_place1
> > > > sample2_condition1_place1
> > > > sample3_condition1_place1
> > > > .
> > > > .
> > > > .
> > > > sample3_condition3_place3
> > > >
> > > > I want to turn it into three separate factor columns: "sample",
> > > > "condition" and "place".
> > > >
> > > > This is what I did so far:
> > > >
> > > > # generate a factor column for the example
> > > > > fctr <- factor(c("sample1_condition1_place1",
> > > > "sample2_condition1_place1", "sample3_condition1_place1"))
> > > > > splitfctr <- strsplit(as.character(fctr), "_")
> > > > > splitfctr
> > > > [[1]]
> > > > [1] "sample1" "condition1" "place1"
> > > >
> > > > [[2]]
> > > > [1] "sample2" "condition1" "place1"
> > > >
> > > > [[3]]
> > > > [1] "sample3" "condition1" "place1"
> > > >
> > > >
> > > > Now this is all fine, but how do I make three separate factors of
> > > > this?
> > > > The object "splitfctr" is a list of character vectors, each
> > > > character vector being composed of the words after splitting the
> > > > long original word.
> > > > Now I want to form new character vectors, which contain the first
> > > > component of each list entry, then another vector for the second
> > > > component, etc.
> > > > I don't want to use loops, unless that's the only way to do it. I
> > > > guess I have some difficulty with understanding how R indexing
> > > > works...
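
One loop-free way to pull the i-th component out of every list entry is to apply the `[` function itself. This is a sketch of that alternative rather than something proposed in the thread, and the result names are illustrative:

```r
fctr <- factor(c("sample1_condition1_place1",
                 "sample2_condition1_place1",
                 "sample3_condition1_place1"))
splitfctr <- strsplit(as.character(fctr), "_")

# `[` is an ordinary function, so sapply can call it on each list entry
# with the position as an extra argument
sample_chr    <- sapply(splitfctr, `[`, 1)
condition_chr <- sapply(splitfctr, `[`, 2)
place_chr     <- sapply(splitfctr, `[`, 3)

# vapply does the same but checks that every result is a single string
sample_chk <- vapply(splitfctr, `[`, character(1), 1)

sample_fctr <- factor(sample_chr)
sample_fctr
```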
> > > >
> > > > ______________________________________________
> > > > R-help_at_r-project.org mailing list
> > > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > > PLEASE do read the posting guide
> > > > http://www.R-project.org/posting-guide.html
> > > > and provide commented, minimal, self-contained, reproducible code.
> > > >
> > >
> > >
> > > Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm
> > >
> > >
> >
> >
>
>



Received on Fri 08 Feb 2008 - 14:41:10 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Fri 08 Feb 2008 - 15:30:12 GMT.

