Re: [R] Problem with ddply in the plyr-package: surprising output of a date-column

From: Brian Diggs <diggsb_at_ohsu.edu>
Date: Mon, 25 Apr 2011 14:14:19 -0700

On 4/25/2011 1:07 PM, Hadley Wickham wrote:
>> If you need plyr for other tasks you ought to use a different
>> class for your date data (or wait until plyr can deal with
>> POSIXlt objects).
>
> How do you get POSIXlt objects into a data frame?
>
>> df<- data.frame(x = as.POSIXlt(as.Date(c("2008-01-01"))))
>> str(df)
> 'data.frame': 1 obs. of 1 variable:
> $ x: POSIXct, format: "2008-01-01"
>
>> df<- data.frame(x = I(as.POSIXlt(as.Date(c("2008-01-01")))))
>> str(df)
> 'data.frame': 1 obs. of 1 variable:
> $ x: AsIs, format: "0"
>
> Hadley

By assigning to a column after the data.frame creation step:

 > df <- data.frame(x = as.POSIXlt(as.Date(c("2008-01-01"))))
 > str(df)
'data.frame': 1 obs. of 1 variable:
  $ x: POSIXct, format: "2008-01-01"
 > dput(df)
structure(list(x = structure(1199145600, class = c("POSIXct", "POSIXt"), tzone = "UTC")), .Names = "x", row.names = c(NA, -1L ), class = "data.frame")
 > df$x <- as.POSIXlt(as.Date(c("2008-01-01")))
 > str(df)
'data.frame': 1 obs. of 1 variable:
  $ x: POSIXlt, format: "2008-01-01"
 > dput(df)
structure(list(x = structure(list(sec = 0, min = 0L, hour = 0L,
     mday = 1L, mon = 0L, year = 108L, wday = 2L, yday = 0L, isdst = 0L),
     .Names = c("sec", "min", "hour", "mday", "mon", "year", "wday", "yday", "isdst"),
     class = c("POSIXlt", "POSIXt"), tzone = "UTC")), .Names = "x",
     row.names = c(NA, -1L), class = "data.frame")

This is reminiscent of the 1d array problem: some types are coerced into other types when passed to the data.frame constructor, but are not coerced when assigned to a column.
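
A minimal sketch of the two paths side by side, using the same value as in the transcript above. The variable names (lt, df_ctor, df_assign) are only illustrative, and the classes noted in the comments are what the session above showed; other R versions may behave differently.

```r
# One POSIXlt value, stored in a data frame in two different ways.
lt <- as.POSIXlt(as.Date("2008-01-01"))

# Path 1: passed to the data.frame() constructor.
df_ctor <- data.frame(x = lt)
class(df_ctor$x)    # "POSIXct" "POSIXt" in the session above

# Path 2: assigned to a column of an existing data frame.
df_assign <- data.frame(x = as.Date("2008-01-01"))
df_assign$x <- lt
class(df_assign$x)  # "POSIXlt" "POSIXt" in the session above
```

Comparing class() on the column rather than relying on print() or str() makes the difference visible, since both columns print the same date.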

Looking at the help pages: data.frame calls as.data.frame on each argument. `[<-.data.frame` has a section on coercion which starts "The story over when replacement values are coerced is a complicated one, and one that has changed during R's development. This section is a guide only.", which makes me think the behavior is not all that well defined.

Digging more, there is an as.data.frame.POSIXlt method, although the help page for the class (DateTimeClasses in base) does not mention or document it. It is documented, though, in as.data.frame (which also has comments about coercing 1 dimensional arrays).
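
One way to check this directly, as a rough sketch: look the method up with getS3method() and call the as.data.frame generic on a POSIXlt value. The names lt, m and res are illustrative only.

```r
lt <- as.POSIXlt("2008-01-01", tz = "UTC")

# Look up the S3 method; optional = TRUE returns NULL instead of an
# error if no such method is registered.
m <- getS3method("as.data.frame", "POSIXlt", optional = TRUE)
is.function(m)

# Dispatch through the generic and inspect the class of the resulting column.
res <- as.data.frame(lt)
class(res[[1]])
```

If the method is present, the resulting column should come back as POSIXct, which would account for what str() reports after the data.frame() call.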

So, potentially, there could be differences with any class that has an as.data.frame method, because it will be treated differently if passed to data.frame versus a column assignment with `[<-.data.frame`.
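
For the date-time case, one possible way to sidestep the difference is to find any POSIXlt columns that got in through assignment and convert them to POSIXct explicitly, so that the column has the same class whichever route it took. This is only a sketch; the data frame and the column names (id, when) are made up for illustration.

```r
# A data frame where a POSIXlt column was added by assignment.
df <- data.frame(id = 1:2)
df$when <- as.POSIXlt(c("2008-01-01", "2008-01-02"), tz = "UTC")

# Flag the columns that are POSIXlt.
is_lt <- vapply(df, inherits, logical(1), what = "POSIXlt")

# Convert each flagged column to POSIXct, one at a time.
for (nm in names(df)[is_lt]) {
  df[[nm]] <- as.POSIXct(df[[nm]])
}

vapply(df, function(col) class(col)[1], character(1))
```

The same pattern could in principle be applied to other classes that have an as.data.frame method, although what the appropriate conversion is would depend on the class.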

 > methods("as.data.frame")
  [1] as.data.frame.aovproj*        as.data.frame.array
  [3] as.data.frame.AsIs            as.data.frame.character
  [5] as.data.frame.complex         as.data.frame.data.frame
  [7] as.data.frame.Date            as.data.frame.default
  [9] as.data.frame.difftime        as.data.frame.factor
[11] as.data.frame.ftable*         as.data.frame.function
[13] as.data.frame.idf*            as.data.frame.integer
[15] as.data.frame.list            as.data.frame.logical
[17] as.data.frame.logLik*         as.data.frame.matrix
[19] as.data.frame.model.matrix    as.data.frame.numeric
[21] as.data.frame.numeric_version as.data.frame.ordered
[23] as.data.frame.POSIXct         as.data.frame.POSIXlt
[25] as.data.frame.raw             as.data.frame.table
[27] as.data.frame.ts              as.data.frame.vector

So, I suppose it is working as documented. Though I wonder how long ago it was that someone (who has been using R regularly for at least a year) actually read the entire help page for data.frame and/or as.data.frame. It's one of those things you think you know and understand until you find out you don't.

-- 
Brian S. Diggs, PhD
Senior Research Associate, Department of Surgery
Oregon Health & Science University

______________________________________________
R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Received on Mon 25 Apr 2011 - 21:26:53 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Mon 25 Apr 2011 - 21:30:34 GMT.