Re: [Rd] POSIXlt matching bug

From: Martin Maechler <maechler_at_stat.math.ethz.ch>
Date: Fri, 02 Jul 2010 17:05:33 +0200

>>>>> "MM" == Martin Maechler <maechler_at_stat.math.ethz.ch>
>>>>>     on Fri, 2 Jul 2010 12:22:07 +0200 writes:

>>>>> "RobMcG" == McGehee, Robert <Robert.McGehee_at_geodecapital.com>
>>>>>     on Tue, 29 Jun 2010 10:46:06 -0400 writes:

    RobMcG> I came across the below mis-feature/bug using match with POSIXlt objects
    RobMcG> (from strptime) in R 2.11.1 (though this appears to be an old issue).

    RobMcG> > x <- as.POSIXlt(Sys.Date())
    RobMcG> > table <- as.POSIXlt(Sys.Date()+0:5)
    RobMcG> > length(x)
    RobMcG> [1] 1
    RobMcG> > x %in% table # I expect TRUE
    RobMcG> [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
    RobMcG> > match(x, table) # I expect 1
    RobMcG> [1] NA NA NA NA NA NA NA NA NA

    RobMcG> This behavior seemed more plausible when the length of a POSIXlt object
    RobMcG> was 9 (back in the day), however since the length was redefined, the
    RobMcG> length of x no longer matches the length of the match function output,
    RobMcG> as specified by the ?match documentation: "A vector of the same length
    RobMcG> as 'x'".
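
The nine results can be traced to how a POSIXlt object is stored: it is a list of broken-down time components, so a routine that ignores the class sees one element per component rather than one per date-time. A minimal sketch for inspecting this with base R only; the number of components depends on the R version (nine in R 2.11.1, judging by the output above, and more in later releases), so no output is shown:

```r
# A single date-time stored in broken-down (list) form.
x <- as.POSIXlt(Sys.Date())

# length() is class-aware and reports one date-time.
length(x)

# Removing the class exposes the underlying list of components
# (sec, min, hour, mday, mon, year, wday, yday, isdst, ...).
components <- unclass(x)
names(components)
length(components)
```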

    RobMcG> I would normally suggest that we add a POSIXlt method for match that
    RobMcG> converts x into POSIXct or character first. However, match does not
    RobMcG> appear to be generic. Below is a possible rewrite of match that appears
    RobMcG> to work as desired.

    RobMcG> match <- function(x, table, nomatch = NA_integer_, incomparables = NULL)
    RobMcG>     .Internal(match(if(is.factor(x)||inherits(x, "POSIXlt"))
    RobMcG>                         as.character(x) else x,
    RobMcG>                     if(is.factor(table)||inherits(table, "POSIXlt"))
    RobMcG>                         as.character(table) else table,
    RobMcG>                     nomatch, incomparables))
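
The same conversion idea can be tried without replacing base::match. The wrapper below is only a sketch: the names match_lt and as_matchable are hypothetical (they are not part of R), and POSIXlt arguments are converted to character, as in the proposal above, before delegating to base::match().

```r
# Hypothetical helper: turn POSIXlt into character, leave everything else alone.
as_matchable <- function(v) {
    if (inherits(v, "POSIXlt")) as.character(v) else v
}

# Hypothetical wrapper around base::match(); not part of R itself.
match_lt <- function(x, table, nomatch = NA_integer_, incomparables = NULL) {
    base::match(as_matchable(x), as_matchable(table),
                nomatch = nomatch, incomparables = incomparables)
}

x <- as.POSIXlt(Sys.Date())
table <- as.POSIXlt(Sys.Date() + 0:5)
match_lt(x, table)   # one result per element of x
```

Converting to character compares the printed representations; converting with as.POSIXct() instead would compare the underlying instants, which may be preferable when time zones differ.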

    RobMcG> That said, I understand some people may be very sensitive to the speed
    RobMcG> of the match function,

    MM> yes, indeed.

    MM> I'm currently investigating an alternative which takes considerably
    MM> more programming time, but in the end should lose much less speed:
    MM> to .Internal()ize the tests in C code,
    MM> so that the resulting R code would simply be

    MM> match <- function(x, table, nomatch = NA_integer_, incomparables = NULL)
    MM>     .Internal(match(x, table, nomatch, incomparables))

I have now committed such a change to R-devel, which is to become R 2.12.x. As a result, match() should actually be very slightly faster than it used to be.
The speed gain may not be measurable, though.
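
With a version of R that contains this change, the original example should give one result per element of x. A small check, assuming R 2.12.0 or later:

```r
x <- as.POSIXlt(Sys.Date())
table <- as.POSIXlt(Sys.Date() + 0:5)

# match() should now return a vector of the same length as x.
stopifnot(length(match(x, table)) == length(x))

match(x, table)
x %in% table
```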

Martin Maechler, ETH Zurich

    RobMcG> and may prefer a simple change to the ?match
    RobMcG> documentation noting this (odd) behavior for POSIXlt.

    RobMcG> Thanks, Robert

    RobMcG> R.version
    RobMcG> _                            
    RobMcG> platform       x86_64-unknown-linux-gnu     
    RobMcG> arch           x86_64                       
    RobMcG> os             linux-gnu                    
    RobMcG> system         x86_64, linux-gnu            
    RobMcG> status                                      
    RobMcG> major          2                            
    RobMcG> minor          11.1                         
    RobMcG> year           2010                         
    RobMcG> month          05                           
    RobMcG> day            31                           
    RobMcG> svn rev        52157                        
    RobMcG> language       R                            
    RobMcG> version.string R version 2.11.1 (2010-05-31)

    RobMcG> Robert McGehee, CFA
    RobMcG> Geode Capital Management, LLC
    RobMcG> One Post Office Square, 28th Floor | Boston, MA | 02109
    RobMcG> Tel: 617/392-8396 Fax:617/476-6389
    RobMcG> mailto:robert.mcgehee_at_geodecapital.com

R-devel_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Received on Fri 02 Jul 2010 - 15:09:59 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Fri 02 Jul 2010 - 17:30:11 GMT.