From: Duncan Murdoch

Date: Sat 06 Aug 2005 - 02:55:06 EST

On 8/5/2005 12:43 PM, Uwe Ligges wrote:

> Duncan Murdoch wrote:
>
>> On 8/5/2005 12:16 PM, Martin C. Martin wrote:
>>
>>> Hi,
>>>
>>> I have a 5x731 array A, and I want to compute the sums of the columns.
>>> Currently I do:
>>>
>>> apply(A, 2, sum)
>>>
>>> But it turns out, this is slow: 70% of my CPU time is spent here, even
>>> though there are many complicated steps in my computation.
>>>
>>> Is there a faster way?
>>
>> You'd probably do better with matrix multiplication:
>>
>> rep(1, nrow(A)) %*% A
>
> No, better use colSums(), which has been optimized for this purpose:
>
> A <- matrix(seq(1, 10000000), ncol=10000)
> system.time(colSums(A))
> # ~ 0.1 sec.
> system.time(rep(1, nrow(A)) %*% A)
> # ~ 0.5 sec.

I didn't claim my solution was the best, only better. :-)

One point of interest: I think your example exaggerates the difference by using a matrix of integers. On my machine I get a ratio similar to yours with the same example:

> A <- matrix(seq(1, 10000000), ncol=10000)
> system.time(colSums(A))

[1] 0.08 0.00 0.08 NA NA

> system.time(rep(1, nrow(A)) %*% A)

[1] 0.25 0.01 0.23 NA NA

but if I make A floating point, there's much less difference:

> A <- matrix(as.numeric(seq(1, 10000000)), ncol=10000)
> system.time(colSums(A))

[1] 0.09 0.00 0.09 NA NA

> system.time(rep(1, nrow(A)) %*% A)

[1] 0.11 0.00 0.12 NA NA

Still, colSums is the winner in both cases.
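
For anyone who wants to confirm that the three approaches agree before timing them, here is a minimal sketch. The matrix contents are made up for illustration (only the 5x731 shape comes from the original question), and the names s_apply, s_matmul, s_colsums and B are placeholders, not anything from the thread.

```r
# Stand-in for the 5x731 array from the original question;
# the values themselves are arbitrary.
A <- matrix(as.numeric(1:(5 * 731)), nrow = 5)

s_apply   <- apply(A, 2, sum)                  # original approach
s_matmul  <- as.vector(rep(1, nrow(A)) %*% A)  # 1 x ncol(A) matrix, flattened
s_colsums <- colSums(A)                        # built-in column sums

# All three should give the same column sums.
stopifnot(isTRUE(all.equal(s_apply, s_matmul)),
          isTRUE(all.equal(s_apply, s_colsums)))

# seq(1, n) yields an integer vector, so a matrix built from it is
# integer, while the product from %*% comes back as double.
B <- matrix(seq(1, 20), ncol = 4)
stopifnot(is.integer(B), is.double(rep(1, nrow(B)) %*% B))
```

The last two lines only check storage types; they do not by themselves show where the extra time in the integer case goes.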

Duncan Murdoch

