From: Prof Brian Ripley <ripley_at_stats.ox.ac.uk>

Date: Fri 28 Jan 2005 - 07:23:39 EST

Date: Fri 28 Jan 2005 - 07:23:39 EST

On Thu, 27 Jan 2005, Paul Gilbert wrote:

> A few weeks ago I noticed

*>
**>> z <- matrix(rnorm(20000),10000,2)
**>
**>> system.time(for (i in 1:1000) apply(z,2,sum))
**> [1] 13.44 0.48 14.08 0.00 0.00
**>
**>> system.time(for (i in 1:1000) rep(1,10000) %*% z)
**> [1] 6.46 0.11 6.84 0.00 0.00
*

So both run in a few milliseconds for rather large problems.

> which seemed completely contrary to all my childhood teachings. Now

*>
**>> system.time(for (i in 1:1000) crossprod(rep(1,10000), z))
**> [1] 1.90 0.12 2.24 0.00 0.00
**>
**> makes sense because it is suppose to be faster than %*% , but why is apply so
**> slow?
*

`so slow' sic: what are you going to do in the 7ms you saved?

> (And should I go back and change apply in my code everywhere or is this

*> likely to reverse again?)
*

It's not likely. apply is an R-level loop, and %*% is a C-level one. However, %*% is not supposed to be much slower than crossprod, and the devil is in the details of how the BLAS is implemented: the code is very similar.

That %*% was faster than apply has been true in all my (17 years) of S/R experience. Your childhood may predate S3, of course.

I still think one should use row/colSums for clarity with acceptable efficiency. It must be very unusual for these operations to be a dominant part of a calculation, so let's not lose proportion here.

*> Paul Gilbert
**>
**>
*

> Patrick Burns wrote:

*>
**>> The following is at least as much out of intellectual curiosity
**>> as for practical reasons.
**>> On reviewing some code written by novices to R, I came
**>> across:
**>>
**>> crossprod(x, y)[1,1]
**>>
**>> I thought, "That isn't a very S way of saying that, I wonder
**>> what the penalty is for using 'crossprod'." To my surprise the
**>> penalty was substantially negative. Handily the client had S-PLUS
**>> as well -- there the sign of the penalty was as I had expected, but
**>> the order of magnitude was off.
**>>
**>> Here are the timings of 1 million computations on vectors of
**>> length 1000. This is under Windows, R version 1.9.1 and S-PLUS
**>> 6.2 (on the same machine).
**>>
**>> Command R S-PLUS
**>> sum(x * y) 28.61 97.6
**>> crossprod(x, y)[1,1] 6.77 2256.2
**>>
**>>
**>> Another example is when computing the sums of the columns of a
**>> matrix. For example:
**>>
**>> set.seed(1)
**>> jjm <- matrix(rnorm(600), 5)
**>>
**>> Timings for this under Windows 2000 with R version 2.0.1 (on an
**>> old chip running at about 0.7Ghz) for 100,000 computations are:
**>>
**>> apply(jjm, 2, sum) 536.59
**>> colSums(jjm) 18.26
**>> rep(1,5) %*% jjm 15.41
**>> crossprod(rep(1,5), jjm) 13.16
**>>
**>> (These timings seem to be stable across R versions and on at least
**>> one Linux platform.)
**>>
**>> Andy Liaw showed another example of 'crossprod' being fast a couple
**>> days ago on R-help.
**>>
**>> Questions for those with a more global picture of the code:
**>>
**>> * Is the speed advantage of 'crossprod' inherent, or is it because
**>> more care has been taken with its implementation than the other
**>> functions?
**>>
**>> * Is 'crossprod' faster than 'sum(x * y)' because 'crossprod' is
**>> going to BLAS while 'sum' can't?
**>>
**>> * Would it make sense to (essentially) use 'crossprod' in
**>> 'colSums' and its friends at least for the special case of matrices?
**>>
**>> Patrick Burns
**>>
**>> Burns Statistics
**>> patrick@burns-stat.com
**>> +44 (0)20 8525 0696
**>> http://www.burns-stat.com
**>> (home of S Poetry and "A Guide for the Unwilling S User")
**>>
**>> ______________________________________________
**>> R-devel@stat.math.ethz.ch mailing list
**>> https://stat.ethz.ch/mailman/listinfo/r-devel
**>>
**>
**> ______________________________________________
**> R-devel@stat.math.ethz.ch mailing list
**> https://stat.ethz.ch/mailman/listinfo/r-devel
**>
**>
*

-- Brian D. Ripley, ripley@stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595 ______________________________________________ R-devel@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-develReceived on Fri Jan 28 06:38:46 2005

*
This archive was generated by hypermail 2.1.8
: Fri 18 Mar 2005 - 09:02:42 EST
*