[R] cumulative sum within levels of a dataframe

From: Levi Waldron <leviwaldron_at_gmail.com>
Date: Fri, 27 Jun 2008 16:52:41 -0400


This one should be easy, but it's giving me a hard time, mostly because tapply returns its results as a list. I want to calculate the cumulative sum of a variable in a dataframe, with the accumulation restarting within each level of a factor. For a very simple example, take:

> df <- data.frame(x=c(rep(1,5),rep(2,5),rep(3,5)),fac=gl(3,5,labels=letters[1:3]))
> df
   x fac
1  1   a
2  1   a
3  1   a
4  1   a
5  1   a
6  2   b
7  2   b
8  2   b
9  2   b
10 2   b
11 3   c
12 3   c
13 3   c
14 3   c
15 3   c

I'd like to add another column to the dataframe so that it looks like this, making sure that the cumulative sums stay matched to the right levels of the factor. I've included a "willdo" column, which is the cumulative sum within each level, and an "ideal" column, which is that cumulative sum minus the current value. The column headings are self-explanatory.

> answer
   x fac willdo ideal
1  1   a      1     0
2  1   a      2     1
3  1   a      3     2
4  1   a      4     3
5  1   a      5     4
6  2   b      2     0
7  2   b      4     2
8  2   b      6     4
9  2   b      8     6
10 2   b     10     8
11 3   c      3     0
12 3   c      6     3
13 3   c      9     6
14 3   c     12     9
15 3   c     15    12
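
One possible way to get both columns without going through tapply's list is base R's ave(), which applies a function within each level of a grouping factor and returns a vector in the original row order. This is only a sketch against the example data above, assuming x is numeric; the column names willdo and ideal are taken from the desired output.

```r
# Rebuild the example dataframe from the question.
df <- data.frame(x = c(rep(1, 5), rep(2, 5), rep(3, 5)),
                 fac = gl(3, 5, labels = letters[1:3]))

# ave() runs cumsum separately within each level of fac and puts each
# result back in the row it came from, so no list needs to be unpacked.
df$willdo <- ave(df$x, df$fac, FUN = cumsum)

# "ideal" is the within-level cumulative sum excluding the current value.
df$ideal <- df$willdo - df$x

df
```

Because ave() returns its results in the positions of the input rows, this should also hold up if the rows of a level are not contiguous.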


______________________________________________
R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Received on Fri 27 Jun 2008 - 20:56:42 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Fri 27 Jun 2008 - 23:31:43 GMT.
