From: Tim Hesterberg <timh_at_insightful.com>

Date: Thu, 07 Sep 2006 09:47:10 -0700

R-help_at_stat.math.ethz.ch mailing list

https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Received on Thu 07 Sep 2006 - 16:51:54 GMT

toby_marks_at_americancentury.com asked:

>I am trying to divide the columns of a matrix by the first row in the
>matrix.

Dividing columns of a matrix by a vector is a pretty fundamental operation, and the query resulted in a large number of suggestions:

x / matrix(v, nrow(x), ncol(x), byrow = TRUE)
sweep(x, 2, v, "/")

x / rep(v, each = nrow(x))

x / outer(rep(1, nrow(x)), v)

x %*% diag(1/v)

t(apply(x, 1, function(x) x/v))

t(apply(x, 1, "/", v))

library(reshape); iapply(x, 1, "/", v) # R only
t(t(x)/v)

scale(x, center = FALSE, v) # not previously suggested
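
As a quick sanity check that the base R variants above agree, here is a minimal sketch. The example matrix and the choice v = x[1, ] (the first row, as in the original question) are assumptions for illustration only, and the reshape variant is left out so that no packages are needed.

```r
# Small example matrix; the values are arbitrary.
x <- matrix(c(2, 4, 6, 10, 20, 30), nrow = 3)
v <- x[1, ]                       # divide each column by its first-row entry

results <- list(
  matrix = x / matrix(v, nrow(x), ncol(x), byrow = TRUE),
  sweep  = sweep(x, 2, v, "/"),
  rep    = x / rep(v, each = nrow(x)),
  outer  = x / outer(rep(1, nrow(x)), v),
  diag   = x %*% diag(1 / v),
  apply  = t(apply(x, 1, function(row) row / v)),
  tt     = t(t(x) / v),
  scale  = scale(x, center = FALSE, scale = v)
)

# scale() attaches a "scaled:scale" attribute, so compare values only.
reference <- results$rep
agree <- vapply(
  results,
  function(r) isTRUE(all.equal(r, reference, check.attributes = FALSE)),
  logical(1)
)
print(agree)
```

Note that the diag() variant relies on v having more than one element, since diag() of a single number builds an identity matrix instead.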

It is unsatisfactory when such a fundamental operation is
done in so many different ways.

* It makes it hard to read other people's code.

* Some of these are very inefficient.
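
One rough way to compare the speed of a few of the variants is sketched below. The matrix size, the repetition count and the helper name time_it are assumptions, and the timings will vary by machine and R version, so no expected numbers are given.

```r
set.seed(1)
x <- matrix(runif(1000 * 100), nrow = 1000)
v <- x[1, ]

# Hypothetical helper: elapsed seconds for `reps` evaluations of f().
time_it <- function(f, reps = 10) {
  system.time(for (i in seq_len(reps)) f())[["elapsed"]]
}

timings <- c(
  rep_each = time_it(function() x / rep(v, each = nrow(x))),
  sweep    = time_it(function() sweep(x, 2, v, "/")),
  tt       = time_it(function() t(t(x) / v)),
  diag     = time_it(function() x %*% diag(1 / v)),
  apply    = time_it(function() t(apply(x, 1, function(row) row / v)))
)
print(timings)
```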

I propose to create standard functions and possibly operator forms for this and similar operations:

colPlus(x, v)      x %c+% v
colMinus(x, v)     x %c-% v
colTimes(x, v)     x %c*% v
colDivide(x, v)    x %c/% v
colPower(x, v)     x %c^% v
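
To make the proposal concrete, here is one way the functions and operator forms could be written in plain R. This is only a sketch: the helper colOp is a hypothetical name, and building everything on rep(v, each = nrow(x)) is an assumption, not a settled implementation.

```r
# Hypothetical helper: returns a function that applies the binary operator
# `op` between each column of x and the matching element of v.
colOp <- function(op) {
  function(x, v) op(x, rep(v, each = nrow(x)))
}

colPlus   <- colOp(`+`)
colMinus  <- colOp(`-`)
colTimes  <- colOp(`*`)
colDivide <- colOp(`/`)
colPower  <- colOp(`^`)

# Operator forms: in R, any name of the form %...% can be used infix.
`%c+%` <- colPlus
`%c-%` <- colMinus
`%c*%` <- colTimes
`%c/%` <- colDivide
`%c^%` <- colPower

x <- matrix(1:6, nrow = 2)   # columns are (1, 2), (3, 4), (5, 6)
v <- c(1, 3, 5)

diffs  <- x %c-% v           # every column becomes (0, 1)
ratios <- x %c/% v           # first row becomes all ones
print(diffs)
print(ratios)
```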

Goals are:

* more readable code

* generic functions, with methods for objects such as data frames and S-PLUS bigdata objects (this would be for both S-PLUS and R)
* efficiency -- use the fastest of the above methods, or drop to C to avoid replicating v.
* allow error checking (that length of v matches number of columns of x)
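
The goals of a generic function with a data frame method and a length check could be sketched in R as follows. The S3 layout, the method bodies and the helper checkColLength are assumptions for illustration; nothing here is an existing API, and the data frame method assumes all columns are numeric.

```r
# Hypothetical S3 generic following the proposed name.
colDivide <- function(x, v) UseMethod("colDivide")

# Hypothetical shared check: v must supply one value per column of x.
checkColLength <- function(x, v) {
  if (length(v) != ncol(x))
    stop("length(v) is ", length(v), " but x has ", ncol(x), " columns")
  invisible(TRUE)
}

colDivide.default <- function(x, v) {
  x <- as.matrix(x)
  checkColLength(x, v)
  x / rep(v, each = nrow(x))
}

colDivide.data.frame <- function(x, v) {
  checkColLength(x, v)
  x[] <- Map(`/`, x, v)      # divide column j by v[j], keep data frame shape
  x
}

df  <- data.frame(a = c(2, 4), b = c(10, 20))
out <- colDivide(df, c(2, 10))
print(out)

# A v of the wrong length is now an error instead of being recycled.
bad <- try(colDivide(as.matrix(df), 1:3), silent = TRUE)
print(inherits(bad, "try-error"))
```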

I'd like feedback (to me, I'll summarize for the list) on:

* the suggestion in general
* are names like "colPlus" OK, or do you have other suggestions?
* create both functions and operators, or just the functions?
* should there be similar operations for rows?

Note: similar operations for rows are not usually needed, because

x * v # e.g. where v = rowMeans(x)

is equivalent to (but faster than)

x * rep(v, length = length(x))

The advantage of a row function would be that

rowTimes(x, v)

could throw an error if length(v) != nrow(x), rather than silently recycling v.
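
The recycling behaviour behind this note, and the kind of check a row function could add, can be illustrated with a small sketch. The name rowTimes and its body are hypothetical, and v is taken to hold one value per row of x.

```r
x <- matrix(1:6, nrow = 2)      # 2 rows, 3 columns
v <- rowMeans(x)                # one value per row: c(3, 4)

# Recycling runs down the columns, so v[i] lines up with row i.
same <- isTRUE(all.equal(x * v, x * rep(v, length.out = length(x))))
print(same)

# The catch: a v of the wrong length is recycled as well. Here length(x)
# is a multiple of length(w), so R gives no error and no warning.
w <- c(1, 2, 3)
silently_recycled <- x * w

# Hypothetical checked version of the row operation.
rowTimes <- function(x, v) {
  if (length(v) != nrow(x))
    stop("length(v) is ", length(v), " but x has ", nrow(x), " rows")
  x * v
}

ok  <- rowTimes(x, v)
bad <- try(rowTimes(x, w), silent = TRUE)
print(inherits(bad, "try-error"))
```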

Tim Hesterberg

P.S. Of the suggestions, my preference is

x / rep(v, each = nrow(x))

It was to support this and similar +-*^ operations that I originally
added the "each" argument to rep.
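
For readers unfamiliar with "each", this small sketch shows why it fits column operations: it repeats every element of v nrow(x) times in a row, which matches the column-major storage of a matrix. The example values are arbitrary.

```r
x <- matrix(1:6, nrow = 2)              # columns are (1, 2), (3, 4), (5, 6)
v <- c(10, 20, 30)

by_each  <- rep(v, each = nrow(x))      # 10 10 20 20 30 30
by_times <- rep(v, times = nrow(x))     # 10 20 30 10 20 30

# Only the "each" form lines v[j] up with column j of x.
scaled <- x / by_each
print(by_each)
print(by_times)
print(scaled)
```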

| Tim Hesterberg           Research Scientist              |
| timh_at_insightful.com   Insightful Corp.                |
| (206)802-2319            1700 Westlake Ave. N, Suite 500 |
| (206)283-8691 (fax)      Seattle, WA 98109-3044, U.S.A.  |
|                          www.insightful.com/Hesterberg   |
========================================================
Download the S+Resample library from www.insightful.com/downloads/libraries

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.

Archive generated by hypermail 2.2.0, at Mon 27 Aug 2007 - 21:34:29 GMT.
