Subject: Re: [R] Sum efficiently from large matrix according to re-occuring levels of factor?
Does this do what you want:

# following up on another idea that was presented
# where are the breaks
dataBreaks <- cumsum(c(0, (diff(x[, 2] + x[, 1] * max(x[, 2])) != 0)))
# sum up column 3 and output the first two columns with the indices
result <- lapply(split(seq(nrow(x)), dataBreaks), function(.sect){
    c(x[.sect[1], 1:2], sum(x[.sect, 3]))
})
do.call(rbind, result)

  [,1] [,2] [,3]
0    1    7    3
1    2    4    2
2    3    2    3
3    1    7   10

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?

On Sun, Jul 20, 2008 at 7:57 PM, Ralph S. wrote:

yes - thank you very much! slowly getting to the full power of R . . .

The first and second columns are actually indices into another matrix (my example may not make this sufficiently clear). I want to compare each sum with the corresponding entry of that matrix, and then record the result of the comparison.

Any idea?

Best,

Ralph
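
The comparison described above can be sketched with R's matrix indexing, where a two-column matrix of (row, column) pairs picks out one entry per pair. This is only an illustration: `res` is the summed result from the example, while `other`, `target`, `out` and the `>` test are hypothetical stand-ins for the second matrix and for whatever comparison is actually wanted.

```r
# summed result from the example: (row index, column index, sum)
res <- matrix(c(1, 2, 3, 1,
                7, 4, 2, 7,
                3, 2, 3, 10), ncol = 3)

# hypothetical second matrix, large enough for the indices in res[, 1:2]
other <- matrix(5, nrow = 3, ncol = 7)

# a two-column index matrix selects other[row, col] for each row of res
target <- other[res[, 1:2, drop = FALSE]]

# record the outcome of the (assumed) comparison next to the sums
out <- cbind(res, exceeds = res[, 3] > target)
out
```

Indexing with the pair matrix avoids an explicit loop over the rows of `res`, which should matter for a very large matrix.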

----------------------------------------
Date: Sun, 20 Jul 2008 16:50:41 -0700
From: h.wickham_at_gmail.com
To: ruffel1_at_hotmail.com
Subject: Re: [R] Sum efficiently from large matrix according to re-occuring levels of factor?

On Sun, Jul 20, 2008 at 4:47 PM, hadley wickham wrote:
On Sun, Jul 20, 2008 at 4:16 PM, Ralph S. wrote:

Hi,

I am trying to calculate the sum for each occurrence of the level of a factor in a very large matrix. In addition, I want to save that sum together with the information of the level of the factor and the level of a second factor.

My matrix looks like this:

x<-matrix(c(1,1,1,2,2,3,3,1,1,7,7,7,4,4,2,2,7,7,1,1,1,1,1,1,2,5,5),9,3)

I want to sum according to the levels in the first column and save the sum with the information of the level in the first and the second column in a new matrix.

That is, I want the output as a matrix of the form:

1 7 3
2 4 2
3 2 3
1 7 10

Why that and not:

1 7 13
2 4 2
3 2 3

?

Here's a solution for that case:

index <- x[, 2] + x[, 1] * max(x[, 2])
cbind(x[!duplicated(index), 1:2], tapply(x[, 3], index, sum))

It takes about half a second for a million-row matrix.
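
One caveat on the pairing above: `x[!duplicated(index), ]` returns rows in order of first appearance, while `tapply` returns sums sorted by index value. The two line up for the example `x`, but need not for other inputs. A minimal sketch that keeps them in step, assuming the second column holds positive integers so that the combined key is unique (the names `sums` and `res` are illustrative only):

```r
x <- matrix(c(1,1,1,2,2,3,3,1,1,7,7,7,4,4,2,2,7,7,1,1,1,1,1,1,2,5,5), 9, 3)

# combined key for each (column 1, column 2) pair
index <- x[, 2] + x[, 1] * max(x[, 2])

# rowsum() with reorder = FALSE keeps groups in order of first appearance,
# which matches the row order of x[!duplicated(index), ]
sums <- rowsum(x[, 3], index, reorder = FALSE)

res <- cbind(x[!duplicated(index), 1:2, drop = FALSE], sums)
dimnames(res) <- NULL  # drop the key labels inherited from rowsum()
res
```

For the example this gives the same three rows as the `tapply` version; the difference only shows when a larger key appears before a smaller one.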

Hadley

--
http://had.co.nz/

