Re: [R] Time and db precision

From: Mikkel Grum <mi2kelgrum_at_yahoo.com>
Date: Thu, 26 May 2011 08:04:24 -0700 (PDT)


Thanks Marc,

I had just come up with another, slightly more convoluted solution. Add as.is = TRUE to the query and then get the timetoken with   timetoken <- df$timestamp[df$timestamp == max(as.POSIX(df$timestamp))]

While it looks like options(digits.secs = 6) works, I worry that theoretically it just pushes the problem down to another level of decimal points. With the solution above, I apparently get the exact same value that was in the database.

Interestingly,
  timetoken <- max(as.POSIX(df$timestamp)) does not appear to give me the same result.

Demo:
> a <- "2011-05-25 22:15:11.027116000"
> b <- "2011-05-25 22:15:11.027117000"
> c <- "2011-05-25 22:15:11.027118000"
> d <- c(a, b, c)
> d
[1] "2011-05-25 22:15:11.027116000" "2011-05-25 22:15:11.027117000" "2011-05-25 22:15:11.027118000"
> d[d == max(as.POSIXct(d))]

[1] "2011-05-25 22:15:11.027118000"
> max(as.POSIXct(d))

[1] "2011-05-25 22:15:11 COT"

> From: Marc Schwartz <marc_schwartz_at_me.com>
> Subject: Re: [R] Time and db precision
> To: "Mikkel Grum" <mi2kelgrum@yahoo.com>
> Cc: "R Help" <r-help_at_r-project.org>
> Date: Thursday, May 26, 2011, 8:22 AM
>
> On May 25, 2011, at 6:25 PM, Mikkel Grum wrote:
>
> > I have a loop that regularly checks for new data to
> analyse in my database. In order to facilitate this, the
> database table has a timestamp column with the time that the
> data was inserted into the database. Something like this:
> >
> > while (....) {
> >    load(timetoken.Rdata)
> >    df <- sqlQuery(con, paste("SELECT *
> FROM tabledf WHERE timestamp > ", timetoken, sep = ""))
> >    analyse(df)
> >    timetoken <- max(df$timestamp)
> >    save(timetoken, file =
> "timetoken.Rdata")
> >    Sys.sleep(60)
> > }
> >
> > Now this used to work fairly well with R and
> PostgreSQL on Windows, but on Ubuntu, I'm getting a lot of
> data being pulled up again and again. I suspect what is
> happening is that R is rounding off to the nearest second,
> while PostgreSQL is using a much higher level of precision,
> so that if no new data has come in in the meantime, chances
> are fairly high (50% ??) that the PostgreSQL timestamp is
> higher than the version that has been rounded off by R.
> >
> > Is there any way of recording the timestamp in R
> exactly as it is in PostgreSQL? Or is there another way of
> dealing with this??
> >
> > sessionInfo()
> > R version 2.11.1 (2010-05-31)
> > x86_64-pc-linux-gnu
> >
> > locale:
> > [1] C
> >
> > attached base packages:
> > [1] stats     graphics 
> grDevices utils     datasets 
> methods   base
> >
> > other attached packages:
> > [1] RODBC_1.3-1
> >
> > All assistance greatly appreciated.
> >
> > Mikkel
>
>
> This query is better suited for R-SIG-DB:
>
>   https://stat.ethz.ch/mailman/listinfo/r-sig-db
>
> That being said:
>
> See ?POSIXct
>
> Check the actual output of paste("SELECT * FROM tabledf
> WHERE timestamp > ", timetoken, sep = "") to see what
> value 'timetoken' is actually taking as it is used in the
> query construct. As is noted in the above help file, be sure
> that options("digits.secs") is properly set, since the
> default will be to round printed output to the nearest
> second:
>
> # A clean R session on OSX
> > options("digits.secs")
> $digits.secs
> NULL
>
> # return current date/time as POSIXct
>
> > Sys.time()
> [1] "2011-05-26 08:11:37 CDT"
>
> options(digits.secs = 6)
>
> > Sys.time()
> [1] "2011-05-26 08:12:07.080329 CDT"
>
>
> options(digits.secs = 0)
>
> > paste("SELECT * FROM tabledf WHERE timestamp > ",
> Sys.time(), sep = "")
> [1] "SELECT * FROM tabledf WHERE timestamp > 2011-05-26
> 08:15:02"
>
> options(digits.secs = 6)
>
> > paste("SELECT * FROM tabledf WHERE timestamp > ",
> Sys.time(), sep = "")
> [1] "SELECT * FROM tabledf WHERE timestamp > 2011-05-26
> 08:15:12.005103"
>
>
> HTH,
>
> Marc Schwartz
>
>



R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Received on Thu 26 May 2011 - 15:15:39 GMT

This quarter's messages: by month, or sorted: [ by date ] [ by thread ] [ by subject ] [ by author ]

All messages

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Thu 26 May 2011 - 17:20:10 GMT.

Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-help. Please read the posting guide before posting to the list.

list of date sections of archive