[Rd] Long execution time for quantile() and difftime objects (PR#14091)

From: <hong.ooi_at_anz.com>
Date: Fri, 27 Nov 2009 06:55:10 +0100 (CET)


Full_Name: Hong Ooi
Version: 2.10.0
OS: Windows XP
Submission from: (NULL) (203.110.235.1)

While trying to get summary statistics on a duration variable (the difference between a start and end date), I ran into the following issue. Using summary or quantile (which summary calls) on a difftime object takes an extremely long time if the object is even moderately large.

A reproducible example:

> x <- as.Date(1:10000, origin="1900-01-01")
> x[1:10]

 [1] "1900-01-02" "1900-01-03" "1900-01-04" "1900-01-05" "1900-01-06"
 [6] "1900-01-07" "1900-01-08" "1900-01-09" "1900-01-10" "1900-01-11"
> d <- x - as.Date("1900-01-01")
> d[1:10]

Time differences in days
 [1] 1 2 3 4 5 6 7 8 9 10
> system.time(summary(d[1:10]))

   user system elapsed
   0.01 0.00 0.01
> system.time(summary(d[1:100]))

   user system elapsed
   0.21 0.00 0.20
> system.time(summary(d[1:1000]))

   user system elapsed
   3.02 0.00 3.02
> system.time(summary(d[1:10000]))

   user system elapsed
  43.56 0.04 43.66

If I unclass d, there is no problem:

> system.time(summary(unclass(d[1:10000])))

   user system elapsed
      0 0 0
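
Until the method itself is fixed, the unclass() observation above suggests a workaround: do the quantile computation on the plain numbers and put the units back afterwards. A minimal sketch follows; quantile_difftime is a hypothetical helper name, not part of base R, and it assumes the object carries one of the standard difftime units.

```r
# Hypothetical helper (not in base R): quantiles of a difftime object,
# computed on the underlying numeric values to avoid the slow method.
quantile_difftime <- function(d, probs = seq(0, 1, 0.25), ...) {
  stopifnot(inherits(d, "difftime"))
  # as.numeric() drops the class and the "units" attribute
  q <- quantile(as.numeric(d), probs = probs, ...)
  # reattach the original units so the result is a difftime again
  as.difftime(q, units = units(d))
}

x <- as.Date(1:10000, origin = "1900-01-01")
d <- x - as.Date("1900-01-01")
q <- quantile_difftime(d)
print(q)
```

The result is again a difftime in the same units as the input, so it can be used wherever the output of quantile(d) would have been.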

Testing with Rprof() indicates that the problem lies in [.difftime, although the code for that function seems innocuous enough.
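
For anyone who wants to repeat the profiling, a sketch along these lines should work. The file name, the sampling interval, and the repeat count are arbitrary choices of mine, and on an R version where summary() on a difftime is fast the profile may come out empty, so the summaryRprof() call is guarded.

```r
# Rebuild a smaller version of the object from the example.
x <- as.Date(1:2000, origin = "1900-01-01")
d <- x - as.Date("1900-01-01")

prof_file <- tempfile(fileext = ".out")  # scratch file for the samples
Rprof(prof_file, interval = 0.01)        # start the sampling profiler
for (i in 1:20) invisible(summary(d))    # repeat the call to collect samples
Rprof(NULL)                              # stop profiling

# summaryRprof() signals an error if no samples were recorded,
# so return NULL in that case instead of stopping.
prof <- tryCatch(summaryRprof(prof_file), error = function(e) NULL)
if (!is.null(prof)) print(head(prof$by.total))
```

In the by.total table, the functions that account for most of the elapsed time appear at the top.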

> sessionInfo()

R version 2.10.0 (2009-10-26)
i386-pc-mingw32

locale:

[1] LC_COLLATE=English_Australia.1252  LC_CTYPE=English_Australia.1252   
[3] LC_MONETARY=English_Australia.1252 LC_NUMERIC=C                      
[5] LC_TIME=English_Australia.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

______________________________________________
R-devel_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Received on Fri 27 Nov 2009 - 12:35:17 GMT