Re: [R] Imputing missing values in time series

From: Horace Tso <Horace.Tso_at_pgn.com>
Date: Fri, 22 Jun 2007 14:29:01 -0700

Thanks to Mark and Erik for the different versions of locf, and to Erik for the pointer to the archive, where I found another function due to Simon Fear. I haven't tested the zoo na.locf function. The timings below show how the other three perform on a vector of 1e5 values with 10000 NAs. Interestingly, Erik's while loop is the fastest.

HT.

x = 1:1e5
x[sample(1:1e5, 10000)] = NA

> system.time(z2 <- locf.iverson2(x))
   user  system elapsed
   0.07    0.00    0.06
> system.time(z1 <- locf.iverson(x))
   user  system elapsed
   0.11    0.00    0.11
> system.time(z3 <- locf.sfear(x))
   user  system elapsed
   1.13    0.00    1.12



# Due to Erik Iverson
locf.iverson2 = function(x) {
  while(any(is.na(x))) {
    x[is.na(x)] <- x[which(is.na(x))-1]
  }
  x
}

# Due to Simon Fear (Fri Nov 14 17:28:57 2003)
locf.sfear = function(x) {
  assign("stored.value", x[1], envir=.GlobalEnv)
  sapply(x, function(x) {
    if(is.na(x))
      stored.value
    else {
      assign("stored.value", x, envir=.GlobalEnv)
      x
    }})
}

# Due to Erik Iverson
locf.iverson = function(x, unkn=-1) {
  x[is.na(x)] = unkn #something that is not a possible price
  run = rle(x)
  run$values[run$values==unkn] = run$values[which(run$values==unkn)-1]
  inverse.rle(run)
}
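
For comparison, here is a minimal sketch of a loop-free variant in base R. It is not one of the functions timed above, and the name locf.index is made up for illustration. The idea is to record, for each position, the index of the most recent non-NA value and then subset x by that index. Leading NAs have nothing to carry forward, so this sketch leaves them as NA.

```r
# Hypothetical helper, not from the thread: LOCF via a running maximum of indices.
locf.index = function(x) {
  # Index of each non-NA element, 0 where the element is NA
  idx = ifelse(is.na(x), 0L, seq_along(x))
  # Running maximum gives the position of the last non-NA value seen so far
  idx = cummax(idx)
  # A 0 means no value has been seen yet (leading NA), so keep it NA
  idx[idx == 0] = NA
  x[idx]
}

test = c(1, 2, 3, NA, 4, NA, NA, 5, NA, 6, 7, NA)
locf.index(test)
# [1] 1 2 3 3 4 4 4 5 5 6 7 7
```

I have not timed this against the three functions above, so no claim about relative speed.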

>>> "Horace Tso" <Horace.Tso_at_pgn.com> 6/22/2007 12:21 PM >>>
Mark, thanks for the tips. I thought you financial folks must have run into things like these before. Just wonder why this problem wasn't asked more often on this list.

H.

>>> "Leeds, Mark (IED)" <Mark.Leeds@morganstanley.com> 6/22/2007 12:16 PM >>>
I have a function that does this type of thing, but it works off a pure vector so it would have to be modified. If you make your object a zoo object, then that object has many functions associated with it, and na.locf would do what you need, I think.
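
A minimal sketch of the zoo route, assuming the zoo package is installed (the guard below simply skips the example if it is not). The four prices are taken from the 11/4 to 11/7 rows of the original question; na.rm = FALSE is used so that any leading NA would be kept rather than dropped.

```r
# Sketch only: requires the zoo package; skipped if it is not available.
if (requireNamespace("zoo", quietly = TRUE)) {
  price = c(6.716, NA, NA, 6.262)
  # na.locf also works on a plain vector, not only on zoo objects
  filled = zoo::na.locf(price, na.rm = FALSE)
  print(filled)
}
```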

-----Original Message-----
From: r-help-bounces_at_stat.math.ethz.ch
[mailto:r-help-bounces_at_stat.math.ethz.ch] On Behalf Of Erik Iverson
Sent: Friday, June 22, 2007 3:02 PM
To: Horace Tso
Cc: r-help_at_stat.math.ethz.ch
Subject: Re: [R] Imputing missing values in time series

I think my example should work for you, but I couldn't think of a way to do this without an iterative while loop.

test <- c(1,2,3,NA,4,NA,NA,5,NA,6,7,NA)

while(any(is.na(test)))
  test[is.na(test)] <- test[which(is.na(test))-1]

  test
  [1] 1 2 3 3 4 4 4 5 5 6 7 7
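
One caveat with the loop above: it assumes the first element is not NA. If the series starts with NA, the index which(is.na(test))-1 contains 0, which R drops, so the replacement no longer lines up and the loop may error or never finish. A guarded sketch follows; the name locf.guarded is made up for illustration, and leaving leading NAs untouched is an assumption about what is wanted.

```r
# Hypothetical guarded variant of the while loop: leading NAs stay NA.
locf.guarded = function(x) {
  repeat {
    i = which(is.na(x))
    i = i[i > 1]               # position 1 has no previous value
    i = i[!is.na(x[i - 1])]    # fill only where the previous value is known
    if (length(i) == 0) break  # nothing left that can be filled
    x[i] = x[i - 1]
  }
  x
}

locf.guarded(c(NA, NA, 1, NA, NA, 2))
# [1] NA NA  1  1  1  2
```

Each pass fills at least one NA, so the loop always terminates.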

Horace Tso wrote:
> Folks,
>
> This must be a rather common problem with real life time series data
> but I don't see anything in the archive about how to deal with it. I
> have a time series of natural gas prices by flow date. Since gas is
> not traded on weekends and holidays, I have a lot of missing values,
>
> FDate Price
> 11/1/2006 6.28
> 11/2/2006 6.58
> 11/3/2006 6.586
> 11/4/2006 6.716
> 11/5/2006 NA
> 11/6/2006 NA
> 11/7/2006 6.262
> 11/8/2006 6.27
> 11/9/2006 6.696
> 11/10/2006 6.729
> 11/11/2006 6.487
> 11/12/2006 NA
> 11/13/2006 NA
> 11/14/2006 6.725
> 11/15/2006 6.844
> 11/16/2006 6.907
>
> What I would like to do is to fill the NAs with the price from the
> previous date: gas used during holidays is purchased from the week
> before. Though real simple, I wonder if there is a function to perform
> this task. Some of the imputation functions I'm aware of (eg. impute,
> transcan in Hmisc) seem to deal with completely different problems.
>
> 2.5.0/Windows XP
>
> Thanks in advance.
>
> HT
>
> ______________________________________________
> R-help_at_stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



Received on Fri 22 Jun 2007 - 21:45:52 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Mon 25 Jun 2007 - 16:32:18 GMT.

Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-help. Please read the posting guide before posting to the list.