Re: [Rd] There is pmin and pmax each taking na.rm, how about psum?

From: Justin Talbot <jtalbot_at_stanford.edu>
Date: Wed, 31 Oct 2012 08:38:33 -0700


> Because that's inconsistent with pmin and pmax when two NAs are summed.
>
> x = c(1,3,NA,NA,5)
> y = c(2,NA,4,NA,1)
> colSums(rbind(x, y), na.rm = TRUE)
> [1] 3 3 4 0 6 # actual
> [1] 3 3 4 NA 6 # desired

But your desired result would be inconsistent with sum:

sum(NA,NA,na.rm=TRUE)
[1] 0

From a language definition perspective, I think having psum return 0 here is the right choice. R consistently distinguishes between operators that have a sensible identity (+:0, *:1, &:TRUE, |:FALSE), which return the identity if removing NAs results in no items, and those that kind of don't (pmin, pmax), which return NA. Let's not break that.

(I would argue that pmin and pmax should return their actual identities too: Inf and -Inf respectively, but I can understand the current behavior.)
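
For reference, the identity behavior described above can be checked in base R. This is only an illustrative sketch; the variable names and the all-NA inputs are chosen here for the example and are not from the thread.

```r
# Reductions over operators with an identity return that identity
# once na.rm = TRUE has removed every element.
sum_id  <- sum(NA_real_, NA_real_, na.rm = TRUE)    # 0
prod_id <- prod(NA_real_, NA_real_, na.rm = TRUE)   # 1
all_id  <- all(NA, NA, na.rm = TRUE)                # TRUE
any_id  <- any(NA, NA, na.rm = TRUE)                # FALSE

# pmin and pmax return NA where every argument is NA, even with na.rm = TRUE.
pmin_na <- pmin(NA_real_, NA_real_, na.rm = TRUE)   # NA
pmax_na <- pmax(NA_real_, NA_real_, na.rm = TRUE)   # NA

# The reductions min and max do return their identities, with a warning.
min_id <- suppressWarnings(min(NA_real_, na.rm = TRUE))   # Inf
max_id <- suppressWarnings(max(NA_real_, na.rm = TRUE))   # -Inf
```

So min and max already behave like the identity-returning group; only pmin and pmax fall back to NA.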

My 2 cents on psum:

R has a natural set of associative & commutative operators: +, *, &, |, pmin, pmax.

These correspond directly to the reduction functions: sum, prod, all, any, min, max

The current problem is that pmin and pmax are more powerful than +, *, &, and |. The right fix is to extend the rest of the associative & commutative operators to have the same power as pmin and pmax.
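
A small sketch of that gap in current base R, using made-up vectors x, y and z: pmin accepts any number of arguments plus na.rm, while `+` called as a function accepts at most two. Reduce gets you an n-ary elementwise sum, but without any way to drop NAs.

```r
# Illustrative inputs (not from the thread).
x <- c(1, NA, 3)
y <- c(4, 5, NA)
z <- c(0, 0, 0)

# pmin takes any number of vectors and an na.rm flag.
pmin_result <- pmin(x, y, z, na.rm = TRUE)   # 0 0 0

# `+` called with three arguments is an error; capture it instead of stopping.
plus_three <- tryCatch(`+`(x, y, z), error = function(e) "error")

# Reduce applies `+` pairwise across the list, so NAs propagate.
reduce_sum <- Reduce(`+`, list(x, y, z))     # 5 NA NA
```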

Thus, + should have the signature: `+`(..., na.rm=FALSE), which would allow you to do things like:

`+`(c(1,2),c(1,2),c(1,2),NA, na.rm=TRUE) = c(3,6)

If you don't like typing `+`, you could always alias psum to `+`.
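
Until something like that exists, the proposed semantics can be approximated at user level. The psum below is a hypothetical helper, not an existing base R function and not a change to `+` itself; it assumes at least one argument and relies on cbind to recycle shorter vectors.

```r
# Hypothetical psum: elementwise sum across vectors, returning the
# identity 0 where na.rm = TRUE removes every element.
psum <- function(..., na.rm = FALSE) {
  m <- cbind(...)              # one column per argument, recycled to a common length
  rowSums(m, na.rm = na.rm)    # row-wise sum; an all-NA row gives 0 when na.rm = TRUE
}

# The example from the proposal above.
r1 <- psum(c(1, 2), c(1, 2), c(1, 2), NA, na.rm = TRUE)   # 3 6

# The vectors from the quoted message.
x <- c(1, 3, NA, NA, 5)
y <- c(2, NA, 4, NA, 1)
r2 <- psum(x, y, na.rm = TRUE)   # 3 3 4 0 6
r3 <- psum(x, y)                 # 3 NA NA NA 6
```

This reproduces the colSums(rbind(x, y), na.rm = TRUE) result from the quoted message, including the 0 in the all-NA position.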

Additionally, R currently has two simple reduction functions that don't have corresponding operators: range and length. Having a prange operator and a plength operator would nicely round out the language.

Justin



R-devel_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Received on Thu 01 Nov 2012 - 13:14:38 GMT
