From: jim holtman <jholtman_at_gmail.com>

Date: Sun, 20 Jul 2008 21:21:35 -0400

Does this do what you want:

> # following up on another idea that was presented
> # where are the breaks
> dataBreaks <- cumsum(c(0, (diff(x[, 2] + x[, 1] * max(x[, 2])) != 0)))
> # sum up column 3 and output the first two columns with the indices
> result <- lapply(split(seq(nrow(x)), dataBreaks), function(.sect){
+     c(x[.sect[1], 1:2], sum(x[.sect, 3]))
+ })
> do.call(rbind, result)
  [,1] [,2] [,3]
0    1    7    3
1    2    4    2
2    3    2    3
3    1    7   10
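
For anyone reading this in the archive, here is a self-contained sketch of the same run-wise sum, followed by the comparison against "another matrix" that Ralph asks about. It is only one possible variant, not the method posted above: it swaps lapply()/split() for rowsum(), and it assumes column 2 holds positive integers so that the combined key is unique. The names key, runId, firstRow, runSum, ref and exceeds are made up for illustration, and the reference matrix ref is hypothetical (the thread never shows it).

```r
# Example matrix from the original question (9 rows, 3 columns).
x <- matrix(c(1, 1, 1, 2, 2, 3, 3, 1, 1,
              7, 7, 7, 4, 4, 2, 2, 7, 7,
              1, 1, 1, 1, 1, 1, 2, 5, 5), 9, 3)

# Combine columns 1 and 2 into a single key, using the encoding from the
# thread (assumes column 2 holds positive integers).
key <- x[, 2] + x[, 1] * max(x[, 2])

# Run id: goes up by one every time the key changes from one row to the next.
runId <- cumsum(c(0, diff(key) != 0))

# The first row of each run supplies the two index columns.
firstRow <- !duplicated(runId)

# rowsum() adds up column 3 within each run; reorder = FALSE keeps run order.
runSum <- rowsum(x[, 3], runId, reorder = FALSE)

result <- cbind(x[firstRow, 1:2], runSum)
dimnames(result) <- NULL
result

# Hypothetical reference matrix (not from the thread), indexed by the
# values in columns 1 and 2.
ref <- matrix(5, nrow = 3, ncol = 7)

# Indexing with a two-column matrix picks one entry of ref per row of result.
exceeds <- result[, 3] > ref[result[, 1:2]]
exceeds
```

With the example data, result has the four rows from the session above, and exceeds flags only the last run (sum 10 against the made-up reference value 5).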

On Sun, Jul 20, 2008 at 7:57 PM, Ralph S. <ruffel1_at_hotmail.com> wrote:

> The first and second column are actually indices of another matrix (my example may make this not sufficiently clear). I want to compare the sum with that corresponding entry, and then record the result of that.
>
> Any idea?
>
> Best,
>
> Ralph
>
> ----------------------------------------
>> Date: Sun, 20 Jul 2008 16:50:41 -0700
>> From: h.wickham_at_gmail.com
>> To: ruffel1_at_hotmail.com
>> Subject: Re: [R] Sum efficiently from large matrix according to re-occuring levels of factor?
>> CC: r-help_at_r-project.org
>>
>> On Sun, Jul 20, 2008 at 4:47 PM, hadley wickham wrote:
>>> On Sun, Jul 20, 2008 at 4:16 PM, Ralph S. wrote:
>>>>
>>>> Hi,
>>>>
>>>> I am trying to calculate the sum for each occurrence of the level of a factor in a very large matrix. In addition, I want to save that sum together with the information of the level of the factor and the level of a second factor.
>>>>
>>>> My matrix looks like this:
>>>>
>>>> x<-matrix(c(1,1,1,2,2,3,3,1,1,7,7,7,4,4,2,2,7,7,1,1,1,1,1,1,2,5,5),9,3)
>>>>
>>>> I want to sum according to the levels in the first column and save the sum with the information of the level in the first and the second column in a new matrix.
>>>>
>>>> That is, I want output in the matrix of form:
>>>>
>>>> 1 7 3
>>>> 2 4 2
>>>> 3 2 3
>>>> 1 7 10
>>>>
>>>
>>> Why that and not:
>>>
>>> 1 7 13
>>> 2 4 2
>>> 3 2 3
>>>
>>> ?
>>
>> Here's a solution for that case:
>>
>> index <- x[, 2] + x[, 1] * max(x[, 2])
>> cbind(x[!duplicated(index), 1:2], tapply(x[, 3], index, sum))
>>
>> It takes about half a second for a million row matrix.
>>
>> Hadley
>>
>> --
>> http://had.co.nz/
>
> ______________________________________________
> R-help_at_r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?

Received on Mon 21 Jul 2008 - 01:37:32 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.

Archive generated by hypermail 2.2.0, at Mon 21 Jul 2008 - 03:32:29 GMT.
