From: Gabor Grothendieck <ggrothendieck_at_gmail.com>

Date: Sat 18 Jun 2005 - 13:18:38 EST

R-help@stat.math.ethz.ch mailing list

https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html Received on Sat Jun 18 13:21:44 2005

On 6/17/05, alex diaz <celebridades@megamail.pt> wrote:

> Dear all:
>
> Here is my problem:
>
> Example data:
> dat<-data.frame(x=rep(c("a","b","c","d"),2),y=c(10:17))
>
> If I wanted to aggregate each level of column dat$x I
> could use:
> aggregate(dat$y,list(x=dat$x),sum)
>
> But I just want to aggregate two levels ("c" and "d")
> to obtain a new level "e"
> I am expecting something like:
>
>   x  y
> 1 a 10
> 2 b 11
> 3 e 25
> 4 a 14
> 5 b 15
> 6 e 33

In the example

- dat$y[3:4] are summed and

- dat$y[7:8] are summed

so we assume that what is being requested is that "d" is to
be replaced by "c" and runs of any level are to be summed.
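
To make that assumption concrete, here is a small sketch (not part
of the original reply) that uses rle() to list the runs once "d" has
been merged into "c". The names merged and runs are only illustrative.

```r
dat <- data.frame(x = rep(c("a", "b", "c", "d"), 2), y = 10:17)

# Work on a character copy so this behaves the same whether
# dat$x is a factor or a character vector.
merged <- as.character(dat$x)
merged[merged == "d"] <- "c"

# rle() reports each run of identical consecutive values.
runs <- rle(merged)
runs$values   # "a" "b" "c" "a" "b" "c"
runs$lengths  # 1 1 2 1 1 2
```

The two runs of length 2 are rows 3:4 and 7:8, i.e. the rows that
are summed in the expected output.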

To do that:

- create xx such that a, b, c and d in dat$x are replaced
with 1, 2, 3 and 3 in xx.

- in the second statement calculate a running count that goes up
by one each time xx changes. If the current observation is the same
as the last one, the Last Observation is Carried Forward (locf), so
that all entries in a run have the same number. e.g. in this case
locf is c(1, 2, 3, 3, 4, 5, 6, 6)

- Finally the 'by' splits dat by locf and applies f to collapse
each run into a single row, and rbind binds the resulting rows
together to create a data frame.

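# Codes 1, 2, 3, 4 for a, b, c, d, with "d" mapped onto 3.
# as.integer(factor(...)) gives the codes whether dat$x is a
# factor or a character vector.
xx <- ifelse(dat$x == "d", 3, as.integer(factor(dat$x)))

# Run id: goes up by one whenever xx differs from the previous row.
locf <- cumsum(c(TRUE, xx[-1] != xx[-length(xx)]))
# Collapse one run: keep its first label and sum its y values.
f <- function(x) data.frame(x=x[1,1], y=sum(x[,2]))
dat2 <- do.call("rbind", by(dat, locf, f))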
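
If the combined level should really be labelled "e", as in the
expected output, a variant along the following lines might do. This is
only a sketch: it uses tapply rather than by, and x_chr, run_id and
dat_e are names made up here, not part of the code above.

```r
dat <- data.frame(x = rep(c("a", "b", "c", "d"), 2), y = 10:17)

# Relabel both "c" and "d" as "e" on a character copy of the column.
x_chr <- as.character(dat$x)
x_chr[x_chr %in% c("c", "d")] <- "e"

# Run id: goes up by one whenever the label differs from the previous row.
run_id <- cumsum(c(TRUE, x_chr[-1] != x_chr[-length(x_chr)]))

# One row per run: the label of its first row and the sum of its y values.
dat_e <- data.frame(
  x = x_chr[!duplicated(run_id)],
  y = as.vector(tapply(dat$y, run_id, sum))
)
print(dat_e)
#   x  y
# 1 a 10
# 2 b 11
# 3 e 25
# 4 a 14
# 5 b 15
# 6 e 33
```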
