[Rd] the incredible lightness of crossprod

From: Patrick Burns <pburns_at_pburns.seanet.com>
Date: Fri 28 Jan 2005 - 02:25:30 EST

The following is at least as much out of intellectual curiosity as for practical reasons.

On reviewing some code written by novices to R, I came across:

crossprod(x, y)[1,1]

I thought, "That isn't a very S way of saying that, I wonder what the penalty is for using 'crossprod'." To my surprise the penalty was substantially negative. Handily the client had S-PLUS as well -- there the sign of the penalty was as I had expected, but the order of magnitude was off.

Here are the timings of 1 million computations on vectors of length 1000. This is under Windows, R version 1.9.1 and S-PLUS 6.2 (on the same machine).

Command                               R                        S-PLUS
sum(x * y)                              28.61                        97.6
crossprod(x, y)[1,1]                 6.77                     2256.2

Another example is when computing the sums of the columns of a matrix. For example:

jjm <- matrix(rnorm(600), 5)

Timings for this under Windows 2000 with R version 2.0.1 (on an old chip running at about 0.7Ghz) for 100,000 computations are:

apply(jjm, 2, sum)               536.59
colSums(jjm)                         18.26
rep(1,5) %*% jjm                 15.41
crossprod(rep(1,5), jjm)        13.16

(These timings seem to be stable across R versions and on at least one Linux platform.)

Andy Liaw showed another example of 'crossprod' being fast a couple days ago on R-help.

Questions for those with a more global picture of the code:

Patrick Burns

Burns Statistics
+44 (0)20 8525 0696
(home of S Poetry and "A Guide for the Unwilling S User")

R-devel@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel Received on Fri Jan 28 02:02:48 2005

This archive was generated by hypermail 2.1.8 : Fri 18 Mar 2005 - 09:02:42 EST