Re: [R] Generation of missing values in a time serie...

From: Kjetil Brinchmann Halvorsen <kjetilbrinchmannhalvorsen_at_gmail.com>
Date: Wed 14 Dec 2005 - 09:41:00 EST

Gabor Grothendieck wrote:

> Yes, this is the definition of a time series and therefore of a zoo object.
> A time series is a mathematical function, i.e. it assigns a single element
> of the range to each element of the domain. This data does not describe
> a time series.

Since nobody else has mentioned it on this thread: the CRAN package pastecs has a function `regul' to regularize irregular time series.

Maybe that is what the original poster wants.
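
To make "regularize" concrete, here is a minimal base-R sketch of the general idea: interpolate the irregular observations onto an evenly spaced grid. It does not call pastecs and is not a description of how `regul' works internally. The grid step of 0.1, the grid start of 123.2 and the choice of linear interpolation are assumptions made for illustration; the times and rtt values are the ones posted earlier in this thread.

```r
# Irregular observation times and rtt values from the data quoted in this thread
times <- c(123.125683, 123.241137, 123.241137, 123.472631, 123.588613, 123.704594)
rtt   <- c(0.798205, 0.796258, 0.796258, 0.796258, 0.796258, 0.796258)

# Evenly spaced grid inside the observed range (step 0.1 is an assumption)
grid <- seq(123.2, by = 0.1, length.out = 6)

# Linear interpolation onto the grid; ties = mean averages the two
# values recorded at the duplicated time 123.241137
reg <- approx(times, rtt, xout = grid, ties = mean)

# Store the result as a regular ts object with 10 observations per time unit
rtt_ts <- ts(reg$y, start = grid[1], frequency = 10)
print(rtt_ts)
```

Note that interpolation invents values between observations, whereas as.ts on a zoo object with gaps fills them with NA, so which one is appropriate depends on what the original poster needs.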

Kjetil

> 
> Also it has no underlying regularity as the warning message states.
> To use as.ts one wants a series with an underlying regularity that has
> gaps and then as.ts will fill in the gaps with NAs.
> 
> If we don't have an underlying regularity the question is not well posed
> but it's likely we want to discretize time. The zoo command itself is
> somewhat forgiving, at least in this case, i.e. it allows one to specify
> this illegal zoo object with non-unique times for purposes of discretization;
> however, such a zoo object should not be used other than to get a legal
> zoo object out.
> 
> For example, in the following we round the times to one decimal place
> and then within each set of values at the same discretized time take the
> last one. (Alternatively, specify mean instead of tail, 1 if the average
> is preferred.) Then we convert that to a ts object:
> 

> > as.ts(aggregate(z, round(time(z), 1), tail, 1))
> Time Series:
> Start = c(123, 2)
> End = c(123, 8)
> Frequency = 10
>           time flow seq       ts     x      rtt size
> 123.1 123.1257    0 967 123.1257 13394 0.798205 1472
> 123.2 123.2411    0 969 123.2411 12680 0.796258 1472
> 123.3       NA   NA  NA       NA    NA       NA   NA
> 123.4       NA   NA  NA       NA    NA       NA   NA
> 123.5 123.4726    0 970 123.4726 12680 0.796258 1472
> 123.6 123.5886    0 971 123.5886 12680 0.796258 1472
> 123.7 123.7046    0 972 123.7046 12680 0.796258 1472
> 
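
For readers without zoo at hand, the discretize-then-keep-last step quoted above can be sketched in base R. This is an illustration of the same logic, not what aggregate or as.ts do internally. The variable names are made up for the example, and only the seq column from the posted data is used.

```r
# Observation times and sequence numbers from the data quoted in this thread
times <- c(123.125683, 123.241137, 123.241137, 123.472631, 123.588613, 123.704594)
seqno <- c(967, 968, 969, 970, 971, 972)

# Discretize time to tenths, stored as whole numbers to avoid
# floating-point mismatches when matching against the grid
bin <- round(times * 10)

# Within each bin keep the last value (use mean instead for the average)
last_by_bin <- tapply(seqno, bin, function(v) tail(v, 1))

# Full regular grid of bins; bins with no observation become NA
grid <- seq(min(bin), max(bin))
filled <- as.vector(last_by_bin)[match(grid, as.numeric(names(last_by_bin)))]

# Regular series: 10 observations per time unit, starting at 123.1
seq_ts <- ts(filled, start = min(grid) / 10, frequency = 10)
print(seq_ts)
```

As in the quoted output, the duplicated time lands in bin 123.2 and keeps 969, and the empty bins 123.3 and 123.4 come out as NA.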
> On 12/13/05, Alvaro Saurin <saurin@dcs.gla.ac.uk> wrote:

>> I think I have found the error. It appears when there are two entries
>> with the same time. Using as input file:
>>
>> --------- CUT --------
>> # Output format for PCKs:
>> # TIME FLOW P [+-] SEQ TS X RTT SIZE
>> #
>> 123.125683 0 P + 967 123.125683 13394 0.798205 1472
>> 123.241137 0 P + 968 123.241137 12680 0.796258 1472
>> 123.241137 0 P + 969 123.241137 12680 0.796258 1472
>> 123.472631 0 P + 970 123.472631 12680 0.796258 1472
>> 123.588613 0 P + 971 123.588613 12680 0.796258 1472
>> 123.704594 0 P + 972 123.704594 12680 0.796258 1472
>> --------- CUT --------
>>
>> I run the following code:
>>
>> --------- CUT --------
>> h_types <- list (0, 0, NULL, NULL, 0, 0, 0, 0, 0)
>> h_names <- list ("time", "flow", "seq", "ts", "x", "rtt", "size")
>>
>> pcks_file <- pipe ("grep ' P ' data", "r")
>> pcks <- scan (pcks_file, what = h_types, comment.char = '#',
>> fill = TRUE)
>> mat_df <- data.frame (pcks[1:2], pcks[5:9])
>> mat <- as.matrix (mat_df)
>> colnames (mat) <- h_names
>> z <- zoo (mat, mat [,"time"])
>> --------- CUT --------
>>
>> The dput of 'z' shows:
>>
>> --------- CUT --------
>> structure(c(123.125683, 123.241137, 123.241137, 123.472631, 123.588613,
>> 123.704594, 0, 0, 0, 0, 0, 0, 967, 968, 969, 970, 971, 972, 123.125683,
>> 123.241137, 123.241137, 123.472631, 123.588613, 123.704594, 13394,
>> 12680, 12680, 12680, 12680, 12680, 0.798205, 0.796258, 0.796258,
>> 0.796258, 0.796258, 0.796258, 1472, 1472, 1472, 1472, 1472, 1472
>> ), .Dim = c(6, 7), .Dimnames = list(c("1", "2", "3", "4", "5",
>> "6"), c("time", "flow", "seq", "ts", "x", "rtt", "size")), index =
>> structure(c(123.125683, 123.241137, 123.241137, 123.472631, 123.588613,
>> 123.704594), .Names = c("1", "2", "3", "4", "5", "6")), class = "zoo")
>> --------- CUT --------
>>
>> If I try a 'as.ts(z)', it fails. If I remove the duplicate entry, I
>> can convert it to a TS with no problem. Is this intentional?
>> Because then I have to filter the input matrix... But, anyway, the
>> output matrix, after filtering, doesn't seem regular:
>>
>> --------- CUT --------
>> > as.ts (z)
>> Time Series:
>> Start = 1
>> End = 5
>> Frequency = 1
>> time flow seq ts x rtt size
>> 1 123.1257 0 967 123.1257 13394 0.798205 1472
>> 2 123.2411 0 969 123.2411 12680 0.796258 1472
>> 3 123.4726 0 970 123.4726 12680 0.796258 1472
>> 4 123.5886 0 971 123.5886 12680 0.796258 1472
>> 5 123.7046 0 972 123.7046 12680 0.796258 1472
>> Warning message:
>> 'x' does not have an underlying regularity in: as.ts.zoo(z)
>> --------- CUT --------
>>
>> Weird...
>>
>>
>> On 13 Dec 2005, at 16:33, Gabor Grothendieck wrote:
>>
>>> Please provide a reproducible example. Note that dput(x) will output
>>> an R object in a way that can be copied and pasted into another
>>> session.
>>>
>>> On 12/13/05, Alvaro Saurin <saurin@dcs.gla.ac.uk> wrote:
>>>> On 13 Dec 2005, at 13:08, Gabor Grothendieck wrote:
>>>>
>>>>> Your variable mat is not a matrix; it's a data frame. Check it with:
>>>>>
>>>>> class(mat)
>>>>>
>>>>> Here is an example:
>>>>>
>>>>> x <- cbind(A = 1:4, B = 5:8)
>>>>> tt <- c(1, 3:4, 6)
>>>>>
>>>>> library(zoo)
>>>>> x.zoo <- zoo(x, tt)
>>>>> x.ts <- as.ts(x.zoo)
>>>> Fixed, but anyway it fails:
>>>>
>>>>> h_types <- list (0, 0, NULL, NULL, 0, 0, 0, 0, 0)
>>>>> h_names <- list ("time", "flow", "seq", "ts", "x", "rtt", "size")
>>>>> pcks_file <- pipe ("grep ' P ' server.dat", "r")
>>>>> pcks <- scan (pcks_file, what = h_types, comment.char = '#', fill = TRUE)
>>>>
>>>>> mat_df <- data.frame (pcks[1:2], pcks[5:9])
>>>>> mat <- as.matrix (mat_df)
>>>>> colnames (mat) <- h_names
>>>>> class (mat)
>>>> [1] "matrix"
>>>>
>>>>> z <- zoo (mat, mat [,"time"])
>>>>> z
>>>>            time     flow      seq       ts           x      rtt        size
>>>> 1.0009 1.000893 0.000000 0.000000 1.000893 1472.000000 0.000000 1472.000000
>>>> 1.5145 1.514454 0.000000 1.000000 1.514454 2944.000000 0.513142 1472.000000
>>>> 2.0151 2.015093 0.000000 2.000000 2.015093 2944.000000 0.513142 1472.000000
>>>> 2.515  2.515025 0.000000 3.000000 2.515025 4806.000000 0.504488 1472.000000
>>>> 2.822  2.821976 0.000000 4.000000 2.821976 5730.000000 0.496728 1472.000000
>>>> [...]
>>>>
>>>>> as.ts (z)
>>>> Error in if (del == 0 && to == 0) return(to) :
>>>> missing value where TRUE/FALSE needed
>>>>
>>>> Any idea? Thanks for your help.
>>>>
>>>> Alvaro
>>>>
>>>>
>>>> --
>>>> Alvaro Saurin <alvaro.saurin@gmail.com> <saurin@dcs.gla.ac.uk>
>>>>
>>>>
>>>>
>>>>
>> --
>> Alvaro Saurin <alvaro.saurin@gmail.com> <saurin@dcs.gla.ac.uk>
>>
>>
>>
>>
> 
> ______________________________________________
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
>

Received on Wed Dec 14 10:38:23 2005

This archive was generated by hypermail 2.1.8 : Wed 14 Dec 2005 - 14:38:36 EST