Re: [Rd] Fast Kendall's tau

From: Terry Therneau <therneau_at_mayo.edu>
Date: Wed, 27 Jun 2012 07:15:47 -0500

Note that the survConcordance function in the survival package, whose concordance statistic is equivalent to Kendall's tau, is also O(n log n), and it does compute a variance. Computing the variance is about 4/5 of the work.
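
For readers wondering where an O(n log n) bound can come from, here is a minimal sketch in base R. It is not the code used by cor.fk or survConcordance; it is one standard approach (count discordant pairs with a Fenwick tree after sorting on x), and it assumes there are no ties in x or y. The function name kendall_tau_fast is made up for this illustration.

```r
# Hypothetical helper, for illustration only: Kendall's tau in O(n log n),
# assuming no ties in x or y (with ties this simple formula does not apply).
kendall_tau_fast <- function(x, y) {
  n <- length(x)
  stopifnot(length(y) == n, n >= 2, !anyDuplicated(x), !anyDuplicated(y))
  r <- as.integer(rank(y))[order(x)]  # ranks of y, arranged by increasing x
  tree <- numeric(n)                  # Fenwick tree of counts over ranks 1..n
  discordant <- 0
  for (i in seq_len(n)) {
    # Prefix sum: how many earlier observations have a smaller y rank.
    k <- r[i]
    smaller <- 0
    while (k > 0L) {
      smaller <- smaller + tree[k]
      k <- k - bitwAnd(k, -k)
    }
    # The remaining earlier observations have a larger y: discordant pairs.
    discordant <- discordant + (i - 1 - smaller)
    # Record the current rank in the tree.
    k <- r[i]
    while (k <= n) {
      tree[k] <- tree[k] + 1
      k <- k + bitwAnd(k, -k)
    }
  }
  total <- n * (n - 1) / 2
  (total - 2 * discordant) / total    # (concordant - discordant) / total
}
```

On tie-free data the result should agree with cor(x, y, method = "kendall"). An interpreted loop like this will not be fast in practice; the point is only the operation count, since each observation costs O(log n) tree updates rather than a comparison against every other observation.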

Using R 2.15.0 on an older Linux box:

 > require(survival)
 > require(pcaPP)
 > tfun <- function(n) {
+     x <- 1:n + runif(n)*n
+     y <- 1:n
+     t1 <- system.time(cor.test(x,y, method="kendall"))
+     t2 <- system.time(cor.fk(x,y))
+     t3 <- system.time(survConcordance(Surv(y) ~ x))
+     rbind("cor.test"=t1, "cor.fk"=t2, "survConcordance"= t3)
+ }
 > tfun(1e2)
                 user.self sys.self elapsed user.child sys.child
cor.test            0.000        0   0.004          0         0
cor.fk              0.000        0   0.001          0         0
survConcordance     0.004        0   0.006          0         0

 > tfun(1e3)
                 user.self sys.self elapsed user.child sys.child
cor.test            0.024        0   0.026          0         0
cor.fk              0.000        0   0.000          0         0
survConcordance     0.004        0   0.004          0         0

 > tfun(1e4)
                 user.self sys.self elapsed user.child sys.child
cor.test            2.224    0.004   2.227          0         0
cor.fk              0.004    0.000   0.003          0         0
survConcordance     0.028    0.000   0.028          0         0

 > tfun(5e4)
                 user.self sys.self elapsed user.child sys.child
cor.test           55.551    0.008  55.574          0         0
cor.fk              0.016    0.000   0.018          0         0
survConcordance     0.204    0.016   0.221          0         0
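
As a rough check on the scaling, the elapsed times above can be compared with what a quadratic and an n log n cost model would predict for the step from n = 1e4 to n = 5e4. This is only arithmetic on the numbers already shown; the object names are arbitrary.

```r
# Elapsed seconds copied from the transcript above.
elapsed <- rbind(
  cor.test        = c(0.026, 2.227, 55.574),
  cor.fk          = c(0.000, 0.003, 0.018),
  survConcordance = c(0.004, 0.028, 0.221)
)
colnames(elapsed) <- c("1e3", "1e4", "5e4")

# Observed slowdown when n grows from 1e4 to 5e4 (a factor of 5).
observed <- elapsed[, "5e4"] / elapsed[, "1e4"]

# Slowdown predicted by each cost model for the same step.
predicted <- c(
  quadratic = (5e4 / 1e4)^2,
  nlogn     = (5e4 * log(5e4)) / (1e4 * log(1e4))
)

round(observed, 1)
round(predicted, 1)
```

The cor.test ratio comes out at about 25, which is what a quadratic algorithm predicts. The ratios for cor.fk and survConcordance are 6 and roughly 7.9, much nearer the n log n prediction of about 5.9, although timings of a few milliseconds are too noisy to read closely.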

I agree with Brian, especially since the Spearman and Kendall results rarely (ever?) disagree on their main message for n>50. At the very most, one might add a footnote to the help page for cor.test pointing to the faster codes.
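
A quick way to see the two tests side by side is to run both on one simulated sample using only base R. This is a sketch on made-up data (the seed, sample size and slope are arbitrary choices), not evidence for the general claim.

```r
# Simulated data: the seed, n and the slope 0.5 are arbitrary assumptions.
set.seed(42)
n <- 200
x <- rnorm(n)
y <- 0.5 * x + rnorm(n)

# Both rank-based tests on the same sample.
kendall  <- cor.test(x, y, method = "kendall")
spearman <- cor.test(x, y, method = "spearman")

# The estimates are on different scales (tau vs rho), so compare p-values.
c(kendall = kendall$p.value, spearman = spearman$p.value)
```

With an association this strong, both p-values should fall far below the usual thresholds, so the two tests tell the same story here.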

Terry T.

Brian R wrote:
>> On 12-06-25 2:48 PM, Adler, Avraham wrote:
>>> Hello.
>>>
>>> Has any further action been taken regarding implementing David
>>> Simcha's fast Kendall tau code (now found in the package pcaPP as
>>> cor.fk) into R-base? It is literally hundreds of times faster,
>>> although I am uncertain as to whether he wrote code for testing the
>>> significance of the parameter. The last mention I have seen of this
>>> was in
>>> 2010<https://stat.ethz.ch/pipermail/r-devel/2010-February/056745.html>.
>> You could check the NEWS file, but I don't remember anything being done
>> along these lines. If the code is in a CRAN package, there doesn't seem
>> to be any need to move it to base R.
> In addition, this is something very specialized, and the code in R is
> fast enough for all but the most unusual instances of that specialized
> task. example(cor.fk) shows the R implementation takes well under a
> second for 2000 cases (a far higher value than is usual).
>


R-devel_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel Received on Wed 27 Jun 2012 - 12:21:56 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Thu 28 Jun 2012 - 11:10:32 GMT.
