Re: [Rd] Surprising length() of POSIXlt vector (PR#14073)

From: William Dunlap <wdunlap_at_tibco.com>
Date: Thu, 19 Nov 2009 21:36:32 -0800

> -----Original Message-----
> From: r-devel-bounces_at_r-project.org
> [mailto:r-devel-bounces_at_r-project.org] On Behalf Of Benilton Carvalho
> Sent: Thursday, November 19, 2009 6:59 PM
> To: Steven McKinney
> Cc: 'mark_at_celos.net'; 'r-devel_at_stat.math.ethz.ch'
> Subject: Re: [Rd] Surprising length() of POSIXlt vector (PR#14073)
>
> Steve,
>
> I'm no expert on this, but my understanding is that the
> choice was to
> stick to the definition.
>
> The help file for length() [1] says:
>
> "For vectors (including lists) and factors the length is the
> number of
> elements."
>
> The help file for POSIXlt [2] (for example) says:
>
> "Class '"POSIXlt"' is a named list of vectors representing (...)"
>
> and then lists the 9 elements (sec / min / hour / mday / mon
> / year /
> wday / yday / isdst).
>
> So, by [1] length of POSIXlt objects is 9, because it "is a
> named list
> of vectors representing (...)".
>
> b

Before data.frames existed (c. 1991), the S help files probably would have described 'dim()' in a similar way for matrices, but it made sense to extend it and its help file to work on data.frames after they were invented. Aren't the real questions how much code would break, how much code would start working, and how easy or hard it would be for a user to make sense of things if length(POSIXlt.thing) reported how many dates were in POSIXlt.thing instead of how many components were in its representation?

R's rep method for POSIXlt has a length argument that represents the number of dates, as it must. Its subscript operator for POSIXlt accepts an index in the range 1:numberOfDates. That is, many of its methods act as if its length were the number of dates. However, POSIXlt is not vector-like enough to make a matrix out of or to attach names to its dates.
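The contrast between the two counts can be sketched as follows, reusing the dates from the original report. This is only an illustration: the tz = "UTC" argument and the variable names are additions made here, and unclass() is used so that the counts come from the underlying list regardless of what length() reports in a given R version.

```r
# Build the 31 dates from the original report. tz = "UTC" is an addition
# here, only to keep the example independent of the local time zone.
strings <- paste("2009-1-", 1:31, sep = "")
dates <- strptime(strings, format = "%Y-%m-%d", tz = "UTC")

# Number of dates, read from one component of the underlying list.
n_dates <- length(unclass(dates)$sec)

# Number of components in the representation
# (9 in R 2.10.0; other versions may differ).
n_components <- length(unclass(dates))

# rep() and "[" both work in units of dates, not components.
five <- rep(dates, length.out = 5)
n_rep <- length(unclass(five)$sec)
twentieth <- format(dates[20], "%Y-%m-%d")

print(c(dates = n_dates, components = n_components, repeated = n_rep))
print(twentieth)
```

Here rep() yields five dates and dates[20] picks out the twentieth date, even though the list underneath holds only a handful of components.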

I don't think a possibly out-of-date help file is the definitive answer to the question of whether or not there should be a length method for POSIXlt.

S+ has a timeDate class (represented as 2 vectors of integers and some scalar attributes) with a length method that gives the number of dates. I think the main problem with the method is that the C-level get_length function returns a different value than the SV4 method does.

S+ also has a numRows function, which is documented to return the 'number of cases' in a data object, with methods for lots of classes (vector, matrix, timeSeries, data.frame, etc.). Users can call that and know it never represents some accident of implementation, as length might. Users could then abandon length in favor of numRows except when writing low-level code that deals with the representation of things. Does R have a similar high-level function? (In that same family of functions S+ has rowIds, numCols, and colIds to supplant rownames, ncol, and colnames, respectively.)
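For illustration, a numRows-like function could be sketched in R as an S3 generic. The name nCases and its methods are hypothetical, invented for this sketch; they are not part of R or S+. Base R's NROW() is a close relative, but for objects without a dim attribute it falls back on length(), so it inherits whatever length() reports.

```r
# Hypothetical generic: the number of "cases" in a data object, independent
# of how the object is represented. The name nCases is invented here.
nCases <- function(x) UseMethod("nCases")

# Default: rows for anything with dimensions, otherwise the plain length.
nCases.default <- function(x) {
  d <- dim(x)
  if (is.null(d)) length(x) else d[1L]
}

# POSIXlt: count dates via one component rather than the list length.
nCases.POSIXlt <- function(x) length(unclass(x)$sec)

dates <- strptime(paste("2009-1-", 1:31, sep = ""),
                  format = "%Y-%m-%d", tz = "UTC")

n_dates  <- nCases(dates)                # number of dates
n_matrix <- nCases(matrix(0, 3, 2))      # rows of a matrix
n_df     <- nCases(data.frame(a = 1:4))  # rows of a data.frame
n_vec    <- nCases(1:7)                  # elements of a plain vector
```

Code written against such a function would not depend on how many components a class happens to use internally.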

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com  

> On Nov 20, 2009, at 12:19 AM, Steven McKinney wrote:
>
> >
> > I've checked the archives, and this problem crops up every
> > few months going back for years.
> >
> > What I was not able to find was an explanation of why a
> > function such as
> > length.POSIXlt <- function(x) { length(x$sec) }
> > is a Bad Idea, or what it would break. listserv threads
> > seem to end without presenting an answer. R News 2001
> > Vol 1/2 discusses that "lots of methods are needed..."
> > (page 11) but I haven't found discussion of why a length
> > method isn't feasible.
> >
> > Can anyone clarify this, or point me at the right
> > archive or documentation source that discusses why
> > objects of class POSIXlt always need to return a
> > length of 9?
> >
> > Thanks
> > Steve McKinney
> >
> >
> >> -----Original Message-----
> >> From: r-devel-bounces_at_r-project.org [mailto:r-devel-bounces_at_r-
> >> project.org] On Behalf Of Benilton Carvalho
> >> Sent: Thursday, November 19, 2009 4:29 PM
> >> To: mark_at_celos.net
> >> Cc: r-devel_at_stat.math.ethz.ch
> >> Subject: Re: [Rd] Surprising length() of POSIXlt vector (PR#14073)
> >>
> >> Check the documentation and the archives. Not a bug. b
> >>
> >> On Nov 19, 2009, at 8:30 PM, mark_at_celos.net wrote:
> >>
> >>> Arrays of POSIXlt dates always return a length of 9. This
> >>> is correct (they're really lists of vectors of seconds,
> >>> hours, and so forth), but other methods disguise them as
> >>> flat vectors, giving superficially surprising behaviour:
> >>>
> >>> strings <- paste('2009-1-', 1:31, sep='')
> >>> dates <- strptime(strings, format="%Y-%m-%d")
> >>>
> >>> print(dates)
> >>> # [1] "2009-01-01" "2009-01-02" "2009-01-03" "2009-01-04" "2009-01-05"
> >>> # [6] "2009-01-06" "2009-01-07" "2009-01-08" "2009-01-09" "2009-01-10"
> >>> # [11] "2009-01-11" "2009-01-12" "2009-01-13" "2009-01-14" "2009-01-15"
> >>> # [16] "2009-01-16" "2009-01-17" "2009-01-18" "2009-01-19" "2009-01-20"
> >>> # [21] "2009-01-21" "2009-01-22" "2009-01-23" "2009-01-24" "2009-01-25"
> >>> # [26] "2009-01-26" "2009-01-27" "2009-01-28" "2009-01-29" "2009-01-30"
> >>> # [31] "2009-01-31"
> >>>
> >>> print(length(dates))
> >>> # [1] 9
> >>>
> >>> str(dates)
> >>> # POSIXlt[1:9], format: "2009-01-01" "2009-01-02" "2009-01-03" "2009-01-04" ...
> >>>
> >>> print(dates[20])
> >>> # [1] "2009-01-20"
> >>>
> >>> print(length(dates[20]))
> >>> # [1] 9
> >>>
> >>> I've since realised that POSIXct makes date vectors easier,
> >>> but could we also have something like:
> >>>
> >>> length.POSIXlt <- function(x) { length(x$sec) }
> >>>
> >>> in datetime.R, to avoid breaking functions (like the
> >>> str.POSIXt method) which use length() in this way?
> >>>
> >>> Thanks,
> >>> Mark <><
> >>>
> >>> ------
> >>>
> >>> Version:
> >>> platform = i686-pc-linux-gnu
> >>> arch = i686
> >>> os = linux-gnu
> >>> system = i686, linux-gnu
> >>> status =
> >>> major = 2
> >>> minor = 10.0
> >>> year = 2009
> >>> month = 10
> >>> day = 26
> >>> svn rev = 50208
> >>> language = R
> >>> version.string = R version 2.10.0 (2009-10-26)
> >>>
> >>> Locale:
> >>> C
> >>>
> >>> Search Path:
> >>> .GlobalEnv, package:stats, package:graphics, package:grDevices,
> >>> package:utils, package:datasets, package:methods, Autoloads,
> >>> package:base
> >>>
> >>> ______________________________________________
> >>> R-devel_at_r-project.org mailing list
> >>> https://stat.ethz.ch/mailman/listinfo/r-devel



Received on Fri 20 Nov 2009 - 05:41:50 GMT

This archive was generated by hypermail 2.2.0 : Fri 20 Nov 2009 - 08:10:33 GMT