Re: which() does not handle NAs in named vectors. (PR#226)


Subject: Re: which() does not handle NAs in named vectors. (PR#226)
From: Martin Maechler (maechler@stat.math.ethz.ch)
Date: Thu 15 Jul 1999 - 22:57:22 EST


Message-Id: <199907151257.OAA06166@sophie.ethz.ch>

>>>>> On Thu, 15 Jul 1999 09:14, ripley@stats.ox.ac.uk (Brian D. Ripley) said:

Thank you for the bug report.

BDR> -- It is unclear to me that the handling of NAs is desirable, and
BDR> it has problems with names:

{function which in its present form very much evolved out of user wishes...}

BDR> z <- c(T,T,NA,F,T)
BDR> names(z) <- letters[1:5]
BDR> which(z)
BDR> Error: names attribute must be the same length as the vector

fixed for release-patches [available in a day or two from CRAN src/devel/]
and hence every new release.
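
For illustration, a minimal sketch of the intended behaviour after the fix.
It assumes a current R in which which() drops the NA entries and subsets the
names with the same logical index; the data are those of the report, with
TRUE/FALSE spelled out instead of T/F.

```r
# Named logical vector containing an NA, as in the bug report.
z <- c(TRUE, TRUE, NA, FALSE, TRUE)
names(z) <- letters[1:5]

# NA is treated as if FALSE; the names follow the selected elements,
# so the indices 1, 2, 5 should come back labelled a, b, e.
w <- which(z)
print(w)
```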

BDR> (Why do the vector and its names have different subscripts? And
BDR> while you are correcting this,

BDR> Arguments:

BDR> x: a logical vector or array. `NA's are allowed an
BDR> omitted.

is now

       x: a `logical' vector or array. `NA's are allowed
          and omitted (treated as if `FALSE').

  
BDR> has a typo, and the logic can be simplified: see below.)

BDR> On Thu, 15 Jul 1999, Martin Maechler wrote:

>> >>>>> "BDR" == Prof Brian D Ripley <ripley@stats.ox.ac.uk> writes:
>>
BDR> On Wed, 14 Jul 1999, Friedrich Leisch wrote:
>> >> >>>>> On Wed, 14 Jul 1999 04:09:21, Peter B Mandeville (PBM) wrote:
>> >>
PBM> I have a vector Pes with 600 elements some of which are NA's. How
PBM> can I form a vector of the indices of the NA's.
>> >>
PBM> for(i in 1:600) if(is.na(Pes[i])) print(i)
>> >>
PBM> prints the indices of the NA's but I can't figure out how to put
PBM> the results in a vector.
>> >> try this:
>> >>
>> >> x <- (1:length(Pes))[is.na(Pes)]
>>
BDR> Tip: that sort of thing often fails for a length 0 vector. The
BDR> `approved' spell is
>>
BDR> seq(along=Pes)[is.na(Pes)]

BTW, currently seq(along = x) returns "numeric" ("double")
whereas 1:length(x) returns "integer".
I'm about to fix this...

BDR> In this case it does not matter as the subscript is of length 0,
BDR> but it has floored enough library/package writers to be worth
BDR> thinking about.
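
A small sketch of the zero-length pitfall mentioned above. The empty vector
`Pes` and the loop counters are hypothetical, chosen only to show the
difference; it is not taken from any package code.

```r
Pes <- numeric(0)                 # hypothetical empty input

idx_colon <- 1:length(Pes)        # 1:0 counts down, giving c(1, 0)
idx_seq   <- seq(along = Pes)     # empty index vector

# Subscripting by a length-0 logical hides the problem ...
sub_colon <- idx_colon[is.na(Pes)]   # length 0 either way

# ... but a loop over 1:length() runs twice on empty input,
# whereas a loop over seq(along=) does not run at all.
n_colon <- 0
for (i in idx_colon) n_colon <- n_colon + 1
n_seq <- 0
for (i in idx_seq) n_seq <- n_seq + 1
```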
>> Good teaching about seq() vs. 1:n
>>
>> However, the solution I gave
>>
>> which(is.na(Pes))
>>
>> is the one I still really recommend; it does deal with 0-length
>> objects, and it keeps names when there are some, and it has an
>> `arr.ind' argument (default FALSE) to return array indices instead
>> of vector indices when so desired.

BDR> Yes, but

BDR> -- It is not in S (so causing difficulty in porting from R to S)

Well, I know what you mean and your point is all well in the above case...
but anyway:
Our group here has been using this ("which" function) in S for quite a while,
and eventually someone will have to collect a library of things from R that
are missing in S-plus and easily implementable. Locally, in our collection of
S-plus add-ons, we already have quite a few of them.

And then, for quite a few R users, S-plus backward compatibility is not the
big issue.
And in other ways, R is so much nicer
    - math annotation in graphics
    - color, line types { plot(x,y, col="light blue", col.main = "blue") }
    - filled.contour
    - persp() with shading..

If you want to live in both worlds, I recommend using

    if(is.R()) {

       ...R specific...

    }
    else { ## S-plus ---

       ...S-plus specific...

    }

anyway, even within user-written functions,
and making sure (via .First or S_FIRST or ...) that is.R() returns FALSE in S-plus.

BDR> -- It looks a relatively expensive operation.

I don't think it is expensive (for arr.ind=FALSE !) if you want to deal
with missings (NA) at all. (Peter's example above is one of the few places
where you are absolutely sure there are no missings...)
Assume x has some NAs, e.g.
    x <- rnorm(1000); x[1000*runif(rpois(1,lam=50))] <- NA
Then
    which( x < -2 )

works how one would want;

    seq(along = x)[x < -2]

gives silly NA's (which make sense for the logical vector but not for the
                 extraction).
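
The same contrast on a small deterministic vector, as a sketch; the values
in `x` are made up so that the result does not depend on random numbers.

```r
x <- c(-2.5, 0.3, NA, -3.1, 1.2, NA)   # small hypothetical sample

# The comparison itself yields TRUE FALSE NA TRUE FALSE NA.
a <- which(x < -2)            # NA comparisons are dropped
b <- seq(along = x)[x < -2]   # each NA in the logical index yields an NA

print(a)
print(b)
```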

BDR> -- Internally which could be simplified by using seq(along=) as it
BDR> is a wrapper for this construct, but actually the separate handling
BDR> of n == 0 is unnecessary (as logic & !is.na(logic) will have length
BDR> zero.)

You are right, and that's part of the fix for `which' which is currently

which <- function(logic, arr.ind = FALSE)
{
    if(!is.logical(logic))
        stop("argument to \"which\" is not logical")
    wh <- seq(along=logic)[ll <- logic & !is.na(logic)]
    if ((m <- length(wh)) > 0) {
        dl <- dim(logic)
        if (is.null(dl) || !arr.ind) {
            names(wh) <- names(logic)[ll]
        }
        else { ##-- return a matrix length(wh) x rank
            rank <- length(dl)
            wh1 <- wh - 1
            wh <- 1 + wh1 %% dl[1]
            wh <- matrix(wh, nrow = m, ncol = rank,
                         dimnames =
                         list(dimnames(logic)[[1]][wh],
                              if(rank == 2) c("row", "col")# for matrices
                              else paste("dim", 1:rank, sep="")))
            if(rank >= 2) {
                denom <- 1
                for (i in 2:rank) {
                    denom <- denom * dl[i-1]
                    nextd1 <- wh1 %/% denom# (next dim of elements) - 1
                    wh[,i] <- 1 + nextd1 %% dl[i]
                }
            }
            storage.mode(wh) <- "integer"
        }
    }
    wh
}
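
A short sketch of what the arr.ind branch computes, worked by hand for a
2 x 3 matrix. The matrix `m` and the helper names `rows` and `cols` are
illustrative only; the comparison assumes a current R where
which(m, arr.ind = TRUE) labels the columns "row" and "col" for matrices.

```r
# Hypothetical 2 x 3 logical matrix, filled column by column, with one NA.
m <- matrix(c(TRUE, FALSE, NA, TRUE, FALSE, TRUE), nrow = 2)

wh  <- which(m)      # vector indices of the TRUE entries
wh1 <- wh - 1        # zero-based, as in the function above
dl  <- dim(m)

# Same arithmetic as the loop: remainder gives the row,
# integer division by the row count gives the column.
rows <- 1 + wh1 %% dl[1]
cols <- 1 + (wh1 %/% dl[1]) %% dl[2]

# Built-in result for comparison.
ai <- which(m, arr.ind = TRUE)
print(cbind(rows, cols))
print(ai)
```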
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-devel mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !) To: r-devel-request@stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._



This archive was generated by hypermail 2b25 : Tue 04 Jan 2000 - 14:16:06 EST