Re: [Rd] [R] unvectorized option for outer()

From: Liaw, Andy <andy_liaw_at_merck.com>
Date: Mon 31 Oct 2005 - 13:12:35 GMT


> From: Thomas Lumley
>
> On Sun, 30 Oct 2005, Jonathan Rougier wrote:
>
> > I'm not sure about this. Perhaps I am a dinosaur, but my feeling is
> > that if people are writing functions in R that might be subject to
> > simple operations like outer products, then they ought to be writing
> > vectorised functions!
>
> I would agree. How about an oapply() function that does
> multiway (rather
> than just two-way) outer products. Basing the name on "apply" would
> emphasize the similarity to other flexible, not particularly
> optimized
> second-order functions.
>
> -thomas

I'll toss in my $0.02: The following is my attempt at creating a "general outer" that works with more than two arguments.

gouter <- function (x, FUN, ...) {

    xgrid <- as.list(do.call("expand.grid", x))     names(xgrid) <- NULL
    array(do.call(deparse(substitute(FUN)), c(xgrid, list(...))),

        dim = sapply(x, length), dimnames = x) }

Here's an example:

> example(gouter)

gouter> gouter(list(letters[1:2], 1:3, 2:4), paste, sep = "") , , 2

  1 2 3
a "a12" "a22" "a32"
b "b12" "b22" "b32"

, , 3

  1 2 3
a "a13" "a23" "a33"
b "b13" "b23" "b33"

, , 4

  1 2 3
a "a14" "a24" "a34"
b "b14" "b24" "b34"

Andy

>
>
> > Maybe it's not possible to hold this line, and
> > maybe "outer" is not the right place to draw it, but I
> think we ought to
> > respect the "x is a vector" mindset as much as possible in the base
> > package. As Tony notes, the documentation does try to be
> clear about
> > what outer actually does, and how it can be used.
> >
> > So I am not a fan of the VECTORIZED argument, and
> definitely not a fan
> > of the VECTORIZED=FALSE default.
> >
> > Jonathan.
> >
> > Gabor Grothendieck wrote:
> >> If the default were changed to VECTORIZED=FALSE then it would
> >> still be functionally compatible with what we have now so
> all existing
> >> software would continue to run correctly yet would not cause
> >> problems for the unwary. Existing software would not have
> to be changed
> >> to add VECTORIZED=TRUE except for those, presumbly few, cases
> >> where outer performance is critical. One optimization might be to
> >> have the default be TRUE if the function is * or perhaps
> if its specified
> >> as a single character and FALSE otherwise.
> >>
> >> Having used APL I always regarded the original design of
> outer in R as
> >> permature performance optimization and this would be a
> chance to get
> >> it right.
> >>
> >> On 10/28/05, Tony Plate <tplate@acm.org> wrote:
> >>
> >>> [following on from a thread on R-help, but my post here seems more
> >>> appropriate to R-devel]
> >>>
> >>> Would a patch to make outer() work with non-vectorized
> functions be
> >>> considered? It seems to come up moderately often on the
> list, which
> >>> probably indicates that many many people get bitten by the same
> >>> incorrect expectation, despite the documentation and the
> FAQ entry. It
> >>> looks pretty simple to modify outer() appropriately: one
> extra function
> >>> argument and an if-then-else clause to call mapply(FUN,
> ...) instead of
> >>> calling FUN directly.
> >>>
> >>> Here's a function demonstrating this:
> >>>
> >>> outer2 <- function (X, Y, FUN = "*", ..., VECTORIZED=TRUE)
> >>> {
> >>> no.nx <- is.null(nx <- dimnames(X <- as.array(X)))
> >>> dX <- dim(X)
> >>> no.ny <- is.null(ny <- dimnames(Y <- as.array(Y)))
> >>> dY <- dim(Y)
> >>> if (is.character(FUN) && FUN == "*") {
> >>> robj <- as.vector(X) %*% t(as.vector(Y))
> >>> dim(robj) <- c(dX, dY)
> >>> }
> >>> else {
> >>> FUN <- match.fun(FUN)
> >>> Y <- rep(Y, rep.int(length(X), length(Y)))
> >>> if (length(X) > 0)
> >>> X <- rep(X, times = ceiling(length(Y)/length(X)))
> >>> if (VECTORIZED)
> >>> robj <- FUN(X, Y, ...)
> >>> else
> >>> robj <- mapply(FUN, X, Y, MoreArgs=list(...))
> >>> dim(robj) <- c(dX, dY)
> >>> }
> >>> if (no.nx)
> >>> nx <- vector("list", length(dX))
> >>> else if (no.ny)
> >>> ny <- vector("list", length(dY))
> >>> if (!(no.nx && no.ny))
> >>> dimnames(robj) <- c(nx, ny)
> >>> robj
> >>> }
> >>> # Some examples
> >>> f <- function(x, y, p=1) {cat("in f\n"); (x*y)^p}
> >>> outer2(1:2, 3:5, f, 2)
> >>> outer2(numeric(0), 3:5, f, 2)
> >>> outer2(1:2, numeric(0), f, 2)
> >>> outer2(1:2, 3:5, f, 2, VECTORIZED=F)
> >>> outer2(numeric(0), 3:5, f, 2, VECTORIZED=F)
> >>> outer2(1:2, numeric(0), f, 2, VECTORIZED=F)
> >>>
> >>> # Output on examples
> >>>
> >>>> f <- function(x, y, p=1) {cat("in f\n"); (x*y)^p}
> >>>> outer2(1:2, 3:5, f, 2)
> >>>
> >>> in f
> >>> [,1] [,2] [,3]
> >>> [1,] 9 16 25
> >>> [2,] 36 64 100
> >>>
> >>>> outer2(numeric(0), 3:5, f, 2)
> >>>
> >>> in f
> >>> [,1] [,2] [,3]
> >>>
> >>>> outer2(1:2, numeric(0), f, 2)
> >>>
> >>> in f
> >>>
> >>> [1,]
> >>> [2,]
> >>>
> >>>> outer2(1:2, 3:5, f, 2, VECTORIZED=F)
> >>>
> >>> in f
> >>> in f
> >>> in f
> >>> in f
> >>> in f
> >>> in f
> >>> [,1] [,2] [,3]
> >>> [1,] 9 16 25
> >>> [2,] 36 64 100
> >>>
> >>>> outer2(numeric(0), 3:5, f, 2, VECTORIZED=F)
> >>>
> >>> [,1] [,2] [,3]
> >>>
> >>>> outer2(1:2, numeric(0), f, 2, VECTORIZED=F)
> >>>
> >>> [1,]
> >>> [2,]
> >>>
> >>> If a patch to add this feature would be considered, I'd
> be happy to
> >>> submit one (including documentation). If so, and if there are any
> >>> potential traps I should bear in mind, please let me know!
> >>>
> >>> -- Tony Plate
> >>>
> >>> Rau, Roland wrote:
> >>>
> >>>> Dear all,
> >>>>
> >>>> a big thanks to Thomas Lumley, James Holtman and Tony
> Plate for their
> >>>> answers. They all pointed in the same direction => I
> need a vectorized
> >>>> function to be applied. Hence, I will try to work with a
> 'wrapper'
> >>>> function as described in the FAQ.
> >>>>
> >>>> Thanks again,
> >>>> Roland
> >>>>
> >>>>
> >>>>
> >>>>
> >>>>> -----Original Message-----
> >>>>> From: Thomas Lumley [mailto:tlumley@u.washington.edu]
> >>>>> Sent: Thursday, October 27, 2005 11:39 PM
> >>>>> To: Rau, Roland
> >>>>> Cc: r-help@stat.math.ethz.ch
> >>>>> Subject: Re: [R] outer-question
> >>>>>
> >>>>>
> >>>>> You want FAQ 7.17 Why does outer() behave strangely
> with my function?
> >>>>>
> >>>>> -thomas
> >>>>>
> >>>>> On Thu, 27 Oct 2005, Rau, Roland wrote:
> >>>>>
> >>>>>
> >>>>>
> >>>>>> Dear all,
> >>>>>>
> >>>>>> This is a rather lengthy message, but I don't know what I
> >>>>>
> >>>>> made wrong in
> >>>>>
> >>>>>
> >>>>>> my real example since the simple code works.
> >>>>>> I have two variables a, b and a function f for which I
> would like to
> >>>>>> calculate all possible combinations of the values of a and b.
> >>>>>> If f is multiplication, I would simply do:
> >>>>>>
> >>>>>> a <- 1:5
> >>>>>> b <- 1:5
> >>>>>> outer(a,b)
> >>>>>>
> >>>>>> ## A bit more complicated is this:
> >>>>>> f <- function(a,b,d) {
> >>>>>> return(a*b+(sum(d)))
> >>>>>> }
> >>>>>> additional <- runif(100)
> >>>>>> outer(X=a, Y=b, FUN=f, d=additional)
> >>>>>>
> >>>>>> ## So far so good. But now my real example. I would
> like to plot the
> >>>>>> ## log-likelihood surface for two parameters alpha and beta of
> >>>>>> ## a Gompertz distribution with given data
> >>>>>>
> >>>>>> ### I have a function to generate random-numbers from a
> >>>>>> Gompertz-Distribution
> >>>>>> ### (using the 'inversion method')
> >>>>>>
> >>>>>> random.gomp <- function(n, alpha, beta) {
> >>>>>> return( (log(1-(beta/alpha*log(1-runif(n)))))/beta)
> >>>>>> }
> >>>>>>
> >>>>>> ## Now I generate some 'lifetimes'
> >>>>>> no.people <- 1000
> >>>>>> al <- 0.1
> >>>>>> bet <- 0.1
> >>>>>> lifetimes <- random.gomp(n=no.people, alpha=al, beta=bet)
> >>>>>>
> >>>>>> ### Since I neither have censoring nor truncation in this
> >>>>>
> >>>>> simple case,
> >>>>>
> >>>>>
> >>>>>> ### the log-likelihood should be simply the sum of the
> log of the
> >>>>>> ### the densities (following the parametrization of
> >>>>>
> >>>>> Klein/Moeschberger
> >>>>>
> >>>>>
> >>>>>> ### Survival Analysis, p. 38)
> >>>>>>
> >>>>>> loggomp <- function(alphas, betas, timep) {
> >>>>>> return(sum(log(alphas) + betas*timep + (alphas/betas *
> >>>>>> (1-exp(betas*timep)))))
> >>>>>> }
> >>>>>>
> >>>>>> ### Now I thought I could obtain a matrix of the
> >>>>>
> >>>>> log-likelihood surface
> >>>>>
> >>>>>
> >>>>>> ### by specifying possible values for alpha and beta
> with the given
> >>>>>> data.
> >>>>>> ### I was able to produce this matrix with two for-loops.
> >>>>>
> >>>>> But I thought
> >>>>>
> >>>>>
> >>>>>> ### I could use also 'outer' in this case.
> >>>>>> ### This is what I tried:
> >>>>>>
> >>>>>> possible.alphas <- seq(from=0.05, to=0.15, length=30)
> >>>>>> possible.betas <- seq(from=0.05, to=0.15, length=30)
> >>>>>>
> >>>>>> outer(X=possible.alphas, Y=possible.betas, FUN=loggomp,
> >>>>>
> >>>>> timep=lifetimes)
> >>>>>
> >>>>>
> >>>>>> ### But the result is:
> >>>>>>
> >>>>>>
> >>>>>>> outer(X=possible.alphas, Y=possible.betas, FUN=loggomp,
> >>>>>>
> >>>>>> timep=lifetimes)
> >>>>>> Error in outer(X = possible.alphas, Y = possible.betas, FUN
> >>>>>
> >>>>> = loggomp,
> >>>>>
> >>>>>
> >>>>>> :
> >>>>>> dim<- : dims [product 900] do not match the length
> >>>>>
> >>>>> of object [1]
> >>>>>
> >>>>>
> >>>>>> In addition: Warning messages:
> >>>>>> ...
> >>>>>>
> >>>>>> ### Can somebody give me some hint where the problem is?
> >>>>>> ### I checked my definition of 'loggomp' but I thought this
> >>>>>
> >>>>> looks fine:
> >>>>>
> >>>>>
> >>>>>> loggomp(alphas=possible.alphas[1], betas=possible.betas[1],
> >>>>>> timep=lifetimes)
> >>>>>> loggomp(alphas=possible.alphas[4], betas=possible.betas[10],
> >>>>>> timep=lifetimes)
> >>>>>> loggomp(alphas=possible.alphas[3], betas=possible.betas[11],
> >>>>>> timep=lifetimes)
> >>>>>>
> >>>>>>
> >>>>>> ### I'd appreciate any kind of advice.
> >>>>>> ### Thanks a lot in advance.
> >>>>>> ### Roland
> >>>>>>
> >>>>>>
> >>>>>> +++++
> >>>>>> This mail has been sent through the MPI for Demographic
> >>>>>
> >>>>> Rese...{{dropped}}
> >>>>>
> >>>>>
> >>>>>> ______________________________________________
> >>>>>> R-help@stat.math.ethz.ch mailing list
> >>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
> >>>>>> PLEASE do read the posting guide!
> >>>>>
> >>>>> http://www.R-project.org/posting-guide.html
> >>>>>
> >>>>> Thomas Lumley Assoc. Professor, Biostatistics
> >>>>> tlumley@u.washington.edu University of Washington, Seattle
> >>>>>
> >>>>
> >>>>
> >>>> +++++
> >>>> This mail has been sent through the MPI for Demographic
> Research. Should you receive a mail that is apparently from
> a MPI user without this text displayed, then the address has
> most likely been faked. If you are uncertain about the
> validity of this message, please check the mail header or ask
> your system administrator for assistance.
> >>>>
> >>>>
> >>>
> >>> ______________________________________________
> >>> R-help@stat.math.ethz.ch mailing list
> >>> https://stat.ethz.ch/mailman/listinfo/r-help
> >>> PLEASE do read the posting guide!
> http://www.R-project.org/posting-guide.html
> >>>
> >>
> >>
> >>
> ______________________________________________
> >> R-devel@r-project.org mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-devel
> >
> > --
> > Jonathan Rougier Science Laboratories
> > Department of Mathematical Sciences South Road
> > University of Durham Durham DH1 3LE
> > tel: +44 (0)191 334 3111, fax: +44 (0)191 334 3051
> > http://www.maths.dur.ac.uk/stats/people/jcr/jcr.html
> >
> > ______________________________________________
> > R-devel@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
> >
>
> Thomas Lumley Assoc. Professor, Biostatistics
> tlumley@u.washington.edu University of Washington, Seattle
>
> ______________________________________________
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>
>



R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel Received on Tue Nov 01 00:17:27 2005

This archive was generated by hypermail 2.1.8 : Mon 20 Feb 2006 - 03:21:31 GMT