Re: [R] cumulative sum of within levels of a dataframe

From: Gavin Simpson <gavin.simpson_at_ucl.ac.uk>
Date: Fri, 27 Jun 2008 22:09:35 +0100

On Fri, 2008-06-27 at 16:52 -0400, Levi Waldron wrote:
> This one should be easy but it's giving me a hard time mostly because tapply
> puts the results in a list. I want to calculate the cumulative sum of a
> variable in a dataframe, but with the accumulation only within each level of
> a factor. For a very simple example, take:

> df$willdo <- unlist(tapply(df$x, df$fac, cumsum))
> df$ideal <- df$willdo - df$x
> df

   x fac willdo ideal
1  1   a      1     0
2  1   a      2     1
3  1   a      3     2
4  1   a      4     3
5  1   a      5     4
6  2   b      2     0
7  2   b      4     2
8  2   b      6     4
9  2   b      8     6
10 2   b     10     8
11 3   c      3     0
12 3   c      6     3
13 3   c      9     6
14 3   c     12     9
15 3   c     15    12
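
One caveat on the unlist(tapply(...)) approach above: the flattened result comes back grouped in factor-level order, so it lines up with the rows only when the data frame is already sorted by fac, as it is in this example. If the rows might arrive in another order, ave() is one alternative that keeps the original row order. The following is a minimal sketch under that assumption; the `shuffled` data frame and the seed are hypothetical, added only to illustrate the unsorted case.

```r
# Rebuild the example data from the original question.
df <- data.frame(x = c(rep(1, 5), rep(2, 5), rep(3, 5)),
                 fac = gl(3, 5, labels = letters[1:3]))

# ave() applies FUN within each level of the grouping factor and returns
# the values in the original row order, so no unlist() step is needed.
df$willdo <- ave(df$x, df$fac, FUN = cumsum)
df$ideal  <- df$willdo - df$x

# Hypothetical unsorted case: shuffle the rows, then recompute.
# Each cumulative sum still accumulates only within its own level.
set.seed(1)
shuffled <- df[sample(nrow(df)), c("x", "fac")]
shuffled$willdo <- ave(shuffled$x, shuffled$fac, FUN = cumsum)
shuffled$ideal  <- shuffled$willdo - shuffled$x
```

On the sorted data this should give the same willdo and ideal columns as the table above; on the shuffled copy the running totals follow the shuffled row order within each level.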

HTH,

G

>
> > df <- data.frame(x=c(rep(1,5),rep(2,5),rep(3,5)),fac=gl(3,5,labels=letters[1:3]))
> > df
>    x fac
> 1  1   a
> 2  1   a
> 3  1   a
> 4  1   a
> 5  1   a
> 6  2   b
> 7  2   b
> 8  2   b
> 9  2   b
> 10 2   b
> 11 3   c
> 12 3   c
> 13 3   c
> 14 3   c
> 15 3   c
>
> I'd like to create another column in the dataframe so it looks like this,
> and make sure that the cumulative sums still match the right levels of the
> factor. I've included a "willdo" column that's just a cumulative sum, and
> an "ideal" column that's the cumulative sum minus the current value - the
> column headings are self explanatory.
>
> > answer
>    x fac willdo ideal
> 1  1   a      1     0
> 2  1   a      2     1
> 3  1   a      3     2
> 4  1   a      4     3
> 5  1   a      5     4
> 6  2   b      2     0
> 7  2   b      4     2
> 8  2   b      6     4
> 9  2   b      8     6
> 10 2   b     10     8
> 11 3   c      3     0
> 12 3   c      6     3
> 13 3   c      9     6
> 14 3   c     12     9
> 15 3   c     15    12
>
> ______________________________________________
> R-help_at_r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Dr. Gavin Simpson             [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
 Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%

Received on Fri 27 Jun 2008 - 21:12:14 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Fri 27 Jun 2008 - 21:31:33 GMT.
