Re: [Rd] O(N log N) Kendall Tau

From: Uwe Ligges <ligges_at_statistik.tu-dortmund.de>
Date: Sun, 13 Dec 2009 18:18:51 +0100

David Simcha wrote:
> I've noticed that the implementation of Kendall's Tau in R is O(N^2).
> The following reference describes how it can be done in O(N log N):
>
> A Computer Method for Calculating Kendall's Tau with Ungrouped Data
> William R. Knight
> Journal of the American Statistical Association, Vol. 61, No. 314, Part
> 1 (Jun., 1966), pp. 436-439
> http://www.jstor.org/pss/2282833
>
> I'm interested in contributing an implementation of this algorithm to
> R. However, I have a few questions:
>
> 1. Would it likely be accepted if well-implemented?
> 2. cov.c in the main/ directory of the R source tree seems to contain
> at least 4 different but nearly identical implementations of Kendall's
> Tau: cov_complete1, cov_complete2, cov_na1, and cov_na2. I can't tell
> why. The file is very difficult to read because of its lack of
> comments, (ab)use of macros and improper indentation. As I will
> probably need to modify this file, can some seasoned veteran please
> explain what the heck is going on in this file?

Well, it is not that hard to find that the functions you mentioned are all used (for different cases) in the main function

   do_cov
that is called by R's cov() function.

Given you provide a well tested implementation based on a published algorithm that is numerically as stable as the method already implemented in R, including all features, and gives identical results, I think it is very likely that your implementation will replace the one that is currently used in R.

Best,
Uwe Ligges

> ______________________________________________
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel



R-devel_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel Received on Sun 13 Dec 2009 - 17:21:25 GMT

This archive was generated by hypermail 2.2.0 : Mon 14 Dec 2009 - 08:31:04 GMT