New package: colSums

From: David Brahm (brahm@alum.mit.edu)
Date: Tue 08 Jan 2002 - 04:16:08 EST


Message-id: <15417.55256.783769.400491@gargle.gargle.HOWL>


I've uploaded a package colSums_1.0.tar.gz to CRAN /src/contrib/Devel. It
contains functions colSums, colMeans, colVars, colStdevs, rowSums, rowMeans,
rowVars, and rowStdevs. These do simple, fast arithmetic on columns/rows of a
matrix, or more generally across dimensions of an array; e.g. colSums(m)
gives the same result as apply(m, 2, sum), but faster. They should be
compatible with the corresponding S-Plus functions.
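
To illustrate why a dedicated routine can beat apply(m, 2, sum), here is a
minimal C sketch of a column-sum loop. It assumes the matrix is stored
column-major (the layout R uses), so each column is one contiguous run of
doubles. The name col_sums and its signature are hypothetical; this is not
the code shipped in the package.

```c
#include <assert.h>
#include <stddef.h>

/* Sum each column of an nrow x ncol matrix stored column-major,
 * writing ncol results into out.
 * col_sums is a hypothetical name used for illustration only. */
void col_sums(const double *m, size_t nrow, size_t ncol, double *out)
{
    assert(m != NULL && out != NULL);
    for (size_t j = 0; j < ncol; j++) {
        const double *col = m + j * nrow;  /* column j is contiguous */
        double sum = 0.0;
        for (size_t i = 0; i < nrow; i++)
            sum += col[i];
        out[j] = sum;
    }
}
```

For a 3 x 2 matrix holding 1..6 in column-major order, this fills out with
the sums of (1, 2, 3) and (4, 5, 6). A single pass over contiguous memory
avoids the per-column function-call overhead that apply incurs.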

The core code was written by Doug Bates <bates@stat.wisc.edu>, as package
"MatUtils", and posted to R-help on July 19, 2001. Many thanks to Doug, Peter
Dalgaard <p.dalgaard@biostat.ku.dk>, Thomas Lumley <tlumley@u.washington.edu>,
and Prof Brian Ripley <ripley@stats.ox.ac.uk> for their assistance; see the
R-help thread "colSums in C".

- Brian Ripley has taken up the torch to improve these functions further and
  put them in R-devel, hopefully (I think) for R-1.5.0. That's why my package
  is in /src/contrib/Devel; it should be obsolete by the next release of R! I
  expect complete argument compatibility, though, since we have both tried to
  be compatible with S-Plus.

- I detect NAs with a trick that Thomas Lumley suggested (a value is neither
  NA nor NaN if x==x). It works well for me, but it may not be portable. If
  it doesn't work for you, try modifying the #define line in colSums.c. I
  would like to hear about platforms where this fails.

- colVars is very naive; for example, I'm probably exacerbating roundoff error
  when mu >> sigma. I personally don't worry because in finance, mu (return) is
  never >> sigma (risk) :-). The S-Plus documentation for colVars claims they
  do something fancy with the "two-pass method described in Chan, Golub, and
  LeVeque (1983)" that I don't know anything about.

- I convert integers, logicals, and complex immediately to reals, which is
  probably inefficient. Brian Ripley's version seems to do a better job.

- S-Plus does not have "rowStdevs", for reasons unknown, since it is simply
  defined as rowStdevs <- function(x, ...) sqrt(rowVars(x, ...)).

- For me, colSums is about 23x faster than apply (on a 400 x 40000 matrix).
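
Two of the points above, the x==x test for missing values and the roundoff
concern in colVars, can be sketched together in C. This is a rough
illustration under stated assumptions, not the contents of colSums.c: the
macro IS_MISSING and the function col_var_two_pass are hypothetical names,
and the variance shown is a plain textbook two-pass computation, which may
differ from what S-Plus does. The self-comparison relies on IEEE 754
semantics and may be optimized away under compiler flags such as
-ffast-math, which is one way the trick could fail to be portable.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Self-comparison test: under IEEE 754 a NaN never equals itself.
 * Per the post, this is intended to catch both NA and NaN.
 * IS_MISSING is a hypothetical name, not the package's #define. */
#define IS_MISSING(x) ((x) != (x))

/* Two-pass sample variance of one column, skipping missing values.
 * Pass 1 computes the mean; pass 2 sums squared deviations from it,
 * which limits cancellation when the mean is large relative to the spread.
 * Returns NaN when fewer than two usable values remain.
 * col_var_two_pass is a hypothetical name used for illustration only. */
double col_var_two_pass(const double *col, size_t n)
{
    assert(col != NULL);

    double sum = 0.0;
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (!IS_MISSING(col[i])) {
            sum += col[i];
            count++;
        }
    }
    if (count < 2)
        return NAN;

    double mean = sum / (double)count;
    double ss = 0.0;
    for (size_t i = 0; i < n; i++) {
        if (!IS_MISSING(col[i])) {
            double d = col[i] - mean;
            ss += d * d;
        }
    }
    return ss / (double)(count - 1);
}
```

Calling col_var_two_pass on each contiguous column of a column-major matrix
would give a colVars-style result with missing values dropped; taking the
square root of each result would give the corresponding standard deviations.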

-- 
                              -- David Brahm (brahm@alum.mit.edu)