From: Patrick Burns <pburns_at_pburns.seanet.com>

Date: Fri 28 Jan 2005 - 02:25:30 EST

R-devel@stat.math.ethz.ch mailing list

https://stat.ethz.ch/mailman/listinfo/r-devel

Received on Fri Jan 28 02:02:48 2005


The following is at least as much out of intellectual curiosity
as for practical reasons.

On reviewing some code written by novices to R, I came across:

Command                   R        S-PLUS
sum(x * y)                28.61      97.6
crossprod(x, y)[1,1]       6.77    2256.2
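The R side of that comparison can be reproduced with something like the following sketch. The vector length and the repetition count are my assumptions; the post does not say what data or timing harness produced the numbers above.

```r
# Sketch: timing sum(x * y) against crossprod(x, y)[1, 1] on the same data.
# Vector length (1e6) and loop count (100) are assumptions for illustration.
set.seed(1)
x <- rnorm(1e6)
y <- rnorm(1e6)

t.sum <- system.time(for (i in 1:100) s1 <- sum(x * y))["elapsed"]
t.cp  <- system.time(for (i in 1:100) s2 <- crossprod(x, y)[1, 1])["elapsed"]

# Both compute the same inner product, up to floating-point rounding:
all.equal(s1, s2)
```

Note that sum(x * y) allocates a full-length intermediate vector x * y before reducing it, while crossprod does the multiply-and-accumulate in one pass.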

Another example is when computing the sums of the columns of a matrix. For example:

set.seed(1)

jjm <- matrix(rnorm(600), 5)

Timings for this under Windows 2000 with R version 2.0.1 (on an old chip running at about 0.7 GHz), for 100,000 computations, are:

apply(jjm, 2, sum)           536.59
colSums(jjm)                  18.26
rep(1,5) %*% jjm              15.41
crossprod(rep(1,5), jjm)      13.16

(These timings seem to be stable across R versions and on at least one Linux platform.)
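For anyone wanting to reproduce the comparison, here is a sketch of a timing harness. The exact loop structure is my assumption (the post does not show its harness), and the repetition count is reduced from 100,000 so the sketch runs quickly.

```r
# Sketch of the column-sum timing comparison from the post.
# n is reduced from the post's 100,000 repetitions for a quick run.
set.seed(1)
jjm <- matrix(rnorm(600), 5)
one <- rep(1, 5)
n <- 1000

t.apply   <- system.time(for (i in 1:n) apply(jjm, 2, sum))["elapsed"]
t.colSums <- system.time(for (i in 1:n) colSums(jjm))["elapsed"]
t.matmul  <- system.time(for (i in 1:n) one %*% jjm)["elapsed"]
t.cp      <- system.time(for (i in 1:n) crossprod(one, jjm))["elapsed"]

# All four compute the same column sums; crossprod returns a 1 x 120
# matrix, so drop() is used to compare against the plain vector:
all.equal(colSums(jjm), drop(crossprod(one, jjm)))
```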

Andy Liaw showed another example of 'crossprod' being fast a couple of days ago on R-help.

Questions for those with a more global picture of the code:

- Is the speed advantage of 'crossprod' inherent, or is it because more care has been taken with its implementation than with the other functions?
- Is 'crossprod' faster than 'sum(x * y)' because 'crossprod' is going to BLAS while 'sum' can't?
- Would it make sense to (essentially) use 'crossprod' in 'colSums' and its friends at least for the special case of matrices?
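As background to the questions, the equivalence being exploited is that crossprod(x, y) computes t(x) %*% y in a single BLAS call, without materialising the transpose. A small sketch (the matrices here are my own illustration, not from the post):

```r
# crossprod(x, y) is t(x) %*% y done in one BLAS call, with no explicit
# transpose formed; the two expressions agree up to floating-point rounding.
set.seed(1)
x <- matrix(rnorm(20), 4)   # 4 x 5
y <- matrix(rnorm(12), 4)   # 4 x 3

a <- crossprod(x, y)        # 5 x 3
b <- t(x) %*% y             # same result, but allocates t(x) first

all.equal(a, b)
```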

Patrick Burns

Burns Statistics

patrick@burns-stat.com

+44 (0)20 8525 0696

http://www.burns-stat.com

(home of S Poetry and "A Guide for the Unwilling S User")


This archive was generated by hypermail 2.1.8: Fri 18 Mar 2005 - 09:02:42 EST