Re: [Rd] Proposal unary - operator for factors

From: William Dunlap <wdunlap_at_tibco.com>
Date: Wed, 03 Feb 2010 16:43:26 -0800

> -----Original Message-----
> From: Duncan Murdoch [mailto:murdoch_at_stats.uwo.ca]
> Sent: Wednesday, February 03, 2010 4:17 PM
> To: William Dunlap
> Cc: Hadley Wickham; r-devel_at_r-project.org
> Subject: Re: [Rd] Proposal unary - operator for factors
>
> On 03/02/2010 6:49 PM, William Dunlap wrote:
> >> -----Original Message-----
> >> From: h.wickham_at_gmail.com [mailto:h.wickham_at_gmail.com] On
> >> Behalf Of Hadley Wickham
> >> Sent: Wednesday, February 03, 2010 3:38 PM
> >> To: William Dunlap
> >> Cc: r-devel_at_r-project.org
> >> Subject: Re: [Rd] Proposal unary - operator for factors
> >>
> >>> It wouldn't make sense in the context of
> >>> vector[-factor]
> >> True, but that doesn't work currently so you wouldn't lose anything.
> >> However, it would make a certain class of problem that used to throw
> >> errors become silent.
> >>
> >>> Wouldn't it be better to allow order's decreasing argument
> >>> to be a vector with one element per ... argument? That
> >>> would work for numbers, factors, dates, and anything
> >>> else. Currently order silently ignores decreasing[2] and
> >>> beyond.
> >> The problem is you might want to do something like
> >> order(a, -b, c, -d)
> >
> > Currently, for numeric a you can do either
> > order(-a)
> > or
> > order(a, decreasing=TRUE)
> > For nonnumeric types like POSIXct and factors only
> > the latter works.
> >
> > Under my proposal your
> > order(a, -b, c, -d)
> > would be
> > order(a, b, c, d, decreasing=c(FALSE,TRUE,FALSE,TRUE))
> > and it would work for any orderable class without modifications
> > to any classes.
>
> Why not use
>
> order(a, -xtfrm(b), c, -xtfrm(d))
>
> ??

You could, if you can remember it. I have been annoyed that decreasing= was in order() but not as useful as it could be since it is not vectorized. The same goes for na.last, although that seems less useful to me.
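
For reference, a minimal sketch of the xtfrm() approach suggested above. The vectors a and b are made up for illustration and are not from this thread; the only assumption is that xtfrm() on a factor returns its integer codes, which can then be negated.

```r
# Hypothetical example data, not from the thread.
a <- c(1, 1, 2, 2)
b <- factor(c("lo", "hi", "lo", "hi"), levels = c("lo", "hi"))

# Unary minus is not meaningful for a factor, but xtfrm(b) returns a
# numeric vector that sorts in the same order as b, so it can be negated
# to get a descending sort on b within each value of a.
idx <- order(a, -xtfrm(b))
idx
```

Within each value of a, the rows come back with "hi" before "lo".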

Here is a version of order (based on the algorithm used in S+'s order) that vectorizes the na.last and decreasing arguments. It calls the existing order function to implement decreasing=TRUE/FALSE and na.last=TRUE/FALSE for a single argument, but order itself could be modified in this way.

new.order <- function (..., na.last = TRUE, decreasing = FALSE) {
    vectors <- list(...)
    nVectors <- length(vectors)
    stopifnot(nVectors > 0)
    # Recycle na.last and decreasing to one element per sort key.
    na.last <- rep(na.last, length = nVectors)
    decreasing <- rep(decreasing, length = nVectors)
    keys <- seq_len(length(vectors[[1]]))
    # Sort by the last key first; each later pass keeps the order of
    # ties left by the earlier passes.
    for (i in nVectors:1) {
        v <- vectors[[i]]
        if (length(v) < length(keys))
            v <- rep(v, length = length(keys))
        keys <- keys[order(v[keys], na.last = na.last[i],
                           decreasing = decreasing[i])]
    }
    keys
}
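
The loop relies on order() leaving ties in their existing relative order, so sorting by the last key first and the first key last should give the same result as one multi-key sort. A minimal two-key sketch of that idea, using made-up vectors k1 and k2 (not from this thread):

```r
# Hypothetical keys, not from the thread.
k1 <- c(2, 1, 2, 1)
k2 <- c("b", "b", "a", "a")

# Pass 1: order by the least significant key (k2).
keys <- order(k2)

# Pass 2: reorder those positions by the most significant key (k1).
# Elements tied on k1 keep the k2 ordering established in pass 1.
keys <- keys[order(k1[keys])]

keys
identical(keys, order(k1, k2))
```

Each pass can take its own decreasing and na.last value, which is what lets those arguments be vectorized per key.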

With the following dataset

data <- data.frame(
  ct = as.POSIXct(c("2009-01-01", "2010-02-03", "2010-02-28"))[c(2,2,2,3,3,1)],
  dt = as.Date(c("2009-01-01", "2010-02-03", "2010-02-28"))[c(3,2,2,2,3,1)],
  fac = factor(c("Small","Medium","Large"), levels=c("Small","Medium","Large"))[c(1,3,2,3,3,1)],
  n = c(11,12,12,11,12,12))

> data
          ct         dt    fac  n
1 2010-02-03 2010-02-28  Small 11
2 2010-02-03 2010-02-03  Large 12
3 2010-02-03 2010-02-03 Medium 12
4 2010-02-28 2010-02-03  Large 11
5 2010-02-28 2010-02-28  Large 12
6 2009-01-01 2009-01-01  Small 12

> data.frame(lapply(data,rank))
   ct  dt fac   n
1 3.0 5.5 1.5 1.5
2 3.0 3.0 5.0 4.5
3 3.0 3.0 3.0 4.5
4 5.5 3.0 5.0 1.5
5 5.5 5.5 5.0 4.5
6 1.0 1.0 1.5 4.5

we get (where my demos use rank because I could not remember the name xtfrm):

> with(data, identical(order(ct,dt), new.order(ct,dt)))
[1] TRUE
> with(data, identical(order(fac,-n),
    new.order(fac,n,decreasing=c(FALSE,TRUE))))
[1] TRUE
> with(data, identical(order(ct,-rank(dt)),
    new.order(ct,dt,decreasing=c(FALSE,TRUE))))
[1] TRUE
> with(data, identical(order(ct,-rank(fac)),
    new.order(ct,fac,decreasing=c(FALSE,TRUE))))
[1] TRUE
> with(data, identical(order(n,-rank(fac)),
    new.order(n,fac,decreasing=c(FALSE,TRUE))))
[1] TRUE

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
>
> Duncan Murdoch
>



R-devel_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Received on Thu 04 Feb 2010 - 00:45:25 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Thu 04 Feb 2010 - 04:30:20 GMT.
