Re: [R] reshape

From: Gabor Grothendieck <ggrothendieck_at_gmail.com>
Date: Sun, 10 Feb 2008 18:31:48 -0500

This isn't really well defined. Suppose we have two rows that both have sp a, code a2 and a value for B. Now suppose we have another row with a, a2 but with a value for C. Does the third row go with the first one? The second one? A new row? Both the first and the second?

Here is one possibility, but without a good definition of the problem we don't know whether it's answering the problem that is intended.

In the code below we assume that all dat rows that have the same sp value and the same code value are adjacent. Within such a group, if a row's tr is less than or equal to the prior row's tr in factor level order, then that dat row must start a new output row; otherwise it does not. Thus within an sp/code group we assign each row a 1 until we reach a tr that is less than or equal to the prior row's tr, at which point we start assigning 2, and so on. This is the new column seq below. We then use seq as part of idvar in reshape. For the particular example in your post this does give the same answer.

f <- function(x) cumsum(c(1, diff(x) <= 0))
dat$seq <- ave(as.numeric(dat$tr), dat$sp, dat$code, FUN = f)
reshape(dat[-1], direction="wide", timevar="tr", idvar=c("code","sp","seq"))[-3]
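
For anyone running this on R 4.0.0 or later, where data.frame() no longer converts strings to factors by default, here is a self-contained sketch of the same approach. It assumes the intended ordering of tr is A < B < C and makes tr a factor explicitly so that as.numeric() returns level codes; the name wide is introduced here only for illustration. Note that reshape names the new columns val.A, val.B, val.C (after the value column) rather than tr.A, tr.B, tr.C.

```r
# Data from the original question. tr is made a factor explicitly
# (an assumption for newer R versions) so as.numeric() gives level codes.
sp   <- c("a", "a", "a", "a", "b", "b", "b", "c", "d", "d", "d", "d")
tr   <- factor(c("A", "B", "B", "C", "A", "B", "C", "A", "A", "B", "C", "C"))
code <- c("a1", "a2", "a2", "a3", "a3", "a3", "a4", "a4", "a4", "a5",
          "a5", "a6")
dat  <- data.frame(id = 1:12, sp = sp, tr = tr, val = 31:42, code = code)

# Within a group, start a new sequence number whenever tr does not
# increase relative to the previous row.
f <- function(x) cumsum(c(1, diff(x) <= 0))
dat$seq <- ave(as.numeric(dat$tr), dat$sp, dat$code, FUN = f)

# seq joins sp and code as an id variable, so repeated sp/code/tr
# combinations land on separate output rows; seq is dropped afterwards.
wide <- reshape(dat[-1], direction = "wide", timevar = "tr",
                idvar = c("code", "sp", "seq"))
wide <- wide[names(wide) != "seq"]
print(wide)
```

Dropping seq by name rather than by position ([-3]) is only a convenience so the sketch does not depend on the column order of the result.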

On Feb 10, 2008 4:58 PM, juli pausas <pausas_at_gmail.com> wrote:
> Dear colleagues,
> I'd like to reshape a data frame from a long format to a wide format, but
> I do not quite get what I want. Here is an example of the data I
> have (dat):
>
> sp <- c("a", "a", "a", "a", "b", "b", "b", "c", "d", "d", "d", "d")
> tr <- c("A", "B", "B", "C", "A", "B", "C", "A", "A", "B", "C", "C")
> code <- c("a1", "a2", "a2", "a3", "a3", "a3", "a4", "a4", "a4", "a5",
> "a5", "a6")
> dat <- data.frame(id=1:12, sp=sp, tr=tr, val=31:42, code=code)

>
> and below is what I'd like to obtain. That is, I'd like the tr
> variable in different columns (as a timevar) with their value (val).
>
> sp code tr.A tr.B tr.C
> a  a1   31   NA   NA
> a  a2   NA   32   NA
> a  a2   NA   33   NA   **
> a  a3   NA   NA   34
> b  a3   35   36   NA
> b  a4   NA   NA   37
> c  a4   38   NA   NA
> d  a4   39   NA   NA
> d  a5   NA   40   41
> d  a6   NA   NA   42
>
> Using reshape:
>
> reshape(dat[,2:5], direction="wide", timevar="tr", idvar=c("code","sp" ))

>
> I'm getting very close. The only difference is in the 3rd row (**),
> that is when sp and code are the same I only get one record. Is there
> a way to get all records? Any idea?

>
> Thank you very much for any help
>
> Juli Pausas
>
> --
> http://www.ceam.es/pausas
>
> ______________________________________________
> R-help_at_r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



Received on Sun 10 Feb 2008 - 23:37:17 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Mon 11 Feb 2008 - 01:30:13 GMT.