Re: [Rd] Native implementation of rowMedians()

From: Robert Gentleman <rgentlem_at_fhcrc.org>
Date: Mon, 14 May 2007 07:17:19 -0700

We did think about this a lot, and decided it was better to have something like rowQ, which really returns the requested order statistics and lets the user manipulate them on return to build their own version of the median, or other quantiles. I would be happy to have this in R itself, if there is sufficient interest and we can remove the one in Biobase (without the need for deprecation/defunct as long as the args are compatible). But if the decision is to return a particular estimate of a quantile, then we would probably want to keep our function around, with its current name.
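To make the distinction concrete, here is a minimal C sketch of the order-statistic approach described above. It is not the Biobase code: the names `row_order_stat` and `row_median` are hypothetical, the matrix is assumed to be a column-major array of doubles with no missing values, and each row is simply copied and sorted rather than handled by a faster selection algorithm.

```c
#include <assert.h>
#include <stdlib.h>

/* qsort callback: ascending order for doubles (assumes no NaN values). */
static int cmp_double(const void *a, const void *b)
{
    double x = *(const double *)a, y = *(const double *)b;
    return (x > y) - (x < y);
}

/*
 * Hypothetical helper: for each row of an nrow x ncol matrix stored in
 * column-major order, write the k-th smallest value (1-based) to out[i].
 * Returns 0 on success, -1 if the scratch buffer cannot be allocated.
 */
int row_order_stat(const double *x, size_t nrow, size_t ncol,
                   size_t k, double *out)
{
    assert(k >= 1 && k <= ncol);
    double *buf = malloc(ncol * sizeof *buf);
    if (buf == NULL)
        return -1;
    for (size_t i = 0; i < nrow; i++) {
        /* Gather row i; consecutive row elements are nrow apart. */
        for (size_t j = 0; j < ncol; j++)
            buf[j] = x[i + j * nrow];
        qsort(buf, ncol, sizeof *buf, cmp_double);
        out[i] = buf[k - 1];
    }
    free(buf);
    return 0;
}

/*
 * One possible median built from the order statistics: the mean of the
 * two middle values (they coincide when ncol is odd). A caller wanting a
 * different estimate would combine the order statistics differently.
 */
int row_median(const double *x, size_t nrow, size_t ncol, double *out)
{
    assert(ncol >= 1);
    size_t lo = (ncol + 1) / 2;   /* lower middle position, 1-based */
    size_t hi = ncol / 2 + 1;     /* upper middle position, 1-based */
    double *upper = malloc(nrow * sizeof *upper);
    if (upper == NULL)
        return -1;
    if (row_order_stat(x, nrow, ncol, lo, out) != 0 ||
        row_order_stat(x, nrow, ncol, hi, upper) != 0) {
        free(upper);
        return -1;
    }
    for (size_t i = 0; i < nrow; i++)
        out[i] = (out[i] + upper[i]) / 2.0;
    free(upper);
    return 0;
}
```

The point of the split is that `row_order_stat` commits to no particular quantile estimate; `row_median` is just one of several definitions a caller could layer on top.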

best wishes

   Robert

Martin Maechler wrote:

>>>>>> "BDR" == Prof Brian Ripley <ripley_at_stats.ox.ac.uk>
>>>>>>     on Mon, 14 May 2007 11:39:18 +0100 (BST) writes:

>
> BDR> On Mon, 14 May 2007, Henrik Bengtsson wrote:
> >> On 5/14/07, Prof Brian Ripley <ripley_at_stats.ox.ac.uk> wrote:
> >>>
> >>> > Hi Henrik,
> >>> >>>>>> "HenrikB" == Henrik Bengtsson <hb_at_stat.berkeley.edu>
> >>> >>>>>> on Sun, 13 May 2007 21:14:24 -0700 writes:
> >>> >
> >>> > HenrikB> Hi,
> >>> > HenrikB> I've got a version of rowMedians(x, na.rm=FALSE) for matrices that
> >>> > HenrikB> handles missing values implemented in C. It has been
>
> BDR> [...]
>
> >>> Also, the 'a version of rowMedians' made me wonder what other version
> >>> there was, and it seems there is one in Biobase which looks a more
> >>> natural home.
> >>
> >> The rowMedians() in Biobase utilizes rowQ() in ditto. I actually
> >> started off by adding support for missing values to rowQ() resulting in
> >> the method rowQuantiles(), for which there are also internal functions
> >> for both integer and double matrices. rowQuantiles() is in R.native
> >> too, but since it has much less CPU mileage I wanted to wait with that.
> >> The rowMedians() is developed from my rowQuantiles() optimized for
> >> the 50% quantile.
> >>
> >> Why do you think it is more natural to host rowMedians() in Biobase
> >> than in one of the core R packages? Biobase comes with a lot of
> >> overhead for people not in the Bio-world.
>
> BDR> Because that is where there seems to be a need for it, and having multiple
> BDR> functions of the same name in different packages is not ideal (and even
> BDR> with namespaces can cause confusion).
>
> That's correct, of course.
> However, I still think that quantiles (and statistics derived
> from them) in general and medians in particular are under-used
> by many user groups. For some useRs, speed can be an important
> reason and for that I had made a big effort to provide runmed()
> in R, and I think it would be worthwhile to provide fast rowwise
> medians and quantiles, here as well.

>
> Also, BTW, I think it will be worthwhile to provide (R<->C) API
> versions of median() and quantile() {with fewer options than the
> R functions, most probably!!},
> such that we'd hopefully see less re-invention of the wheel
> happening in every package that needs such quantiles in its C code.
>
> Biobase is in quite active maintenance, and I'd assume its
> maintainers will remove rowMedians() from there (or first
> replace it with a wrapper in order to deal with the namespace
> issue you mentioned) as soon as R has its own function
> with the same (or better) functionality.
> In order to facilitate the transition, we'd have to make sure
> that such a 'stats' function does behave " >= " to the Biobase
> one.
>
> Martin
>
> ______________________________________________
> R-devel_at_r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>
-- 
Robert Gentleman, PhD
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M2-B876
PO Box 19024
Seattle, Washington 98109-1024
206-667-7700
rgentlem_at_fhcrc.org

Received on Mon 14 May 2007 - 14:21:54 GMT
