[Rd] The C function getQ0 returns a non-positive covariance matrix and causes errors in arima()

From: Raphael Rossignol <raphael.rossignol_at_math.u-psud.fr>
Date: Wed, 20 Jul 2011 13:54:02 +0200


the function makeARIMA(), designed to construct some state space representation of an ARIMA model, uses a C function called getQ0, which can be found at the end of arima.c in R source files (library stats). getQ0 takes two arguments, phi and theta, and returns the covariance matrix of the state prediction error at time zero. The reference for getQ0 (cited by help(arima)) is: Gardner, G, Harvey, A. C. and Phillips, G. D. A. (1980) Algorithm

      AS154. An algorithm for exact maximum likelihood estimation of
      autoregressive-moving average models by means of Kalman filtering.
      _Applied Statistics_ *29*, 311-322.
where it is called subroutine STARMA (and coded in fortran 77).

My problem is that getQ0 returns incorrect covariance matrices in certain cases. Indeed, below is an example of a SARIMA(1,0,1)x(1,0,0)_12 where getQ0 gives a covariance matrix which possess negative eigenvalues ! Below, I obtain getQ0 results through makeARIMA().

> s <- 12
> phis <- 0.95
> phi1 <- 0.0001
> phi <- c(phi1,rep(0,s-2),phis,-phi1*phis)
> theta <- 0.7
> out <- makeARIMA(phi,theta,NULL)
> min(eigen(out$Pn)$value)

[1] -19.15890

There are consequences of this "bug" in the functions KalmanLike() and arima(). Indeed, arima() in its default behaviour uses first CSS method to get the initial value for an MLE search through optim. To compute the likelihood, it uses getQ0 at the initialization of the Kalman Filter. It may happen that getQ0 returns a covariance matrix which possesses negative eigenvalues and that this gives a negative gain in the Kalman filter, which in turn gives a likelihood equal to NaN. Here is a reproducible example where I forced the use of 'ML'.

> set.seed(1)
> x <- arima.sim(100,model=list(ar=phi,ma=theta))
> KalmanLike(x,mod=out,fast=FALSE)





> arima(x,order=c(1,0,1),seasonal=list(period=12,order=c(1,0,0)),include.mean=FALSE,init=c(phi1,theta,phis),method='ML') Erreur dans optim(init[mask], armafn, method = optim.method, hessian = TRUE, :
   valeur non-finie fournie par optim

If needed, I can send a more natural example in which the above behaviour is obtained. This error message ("Error in optim ... non-finite finite-difference value") was already noted in the following message, which remained without answer: https://stat.ethz.ch/pipermail/r-devel/2009-February/052009.html

I could not figure out whether there is a real bug in getQ0 or if this is due to some numerical instability. However, I tried to replace getQ0 in two ways. The first one is to compute first the covariance matrix of (X_{t-1},...,X_{t-p},Z_t,...,Z_{t-q}) and this is achieved through the method of difference equations (page 93 of Brockwell and Davis). This way was apparently suggested by a referee to Gardner et al. paper (see page 314 of their paper). Q0bis <- function(phi,theta){

   ## Computes the initial covariance matrix for the state space representation of Gardner et al.

   p <- length(phi)
   q <- length(theta)
   r <- max(p,q+1)

   ttheta <- c(1,theta,rep(0,max(p,q+1)-q-1))    A1 <- matrix(0,r,p)
   B <- (col(A1)+row(A1)<=p+1)
   C <- (col(A1)+row(A1)-1)
   A1[B] <- phi[C[B]]
   A2 <- matrix(0,r,q+1)
   C <- (col(A2)+row(A2)-1)
   B <- (C<=q+1)
   A2[B] <- ttheta[C[B]]
   A <- cbind(A1,A2)
   if (p==0) {
     S <- diag(q+1)
   else {
     ## Compute the autocovariance function of U, the AR part of X
     r2 <- max(p+q,p+1)
     tphi <- c(1,-phi)
     C1 <- matrix(0,r2,r2)
     F <- row(C1)-col(C1)+1
     E <- (1<=F)&(F<=p+1)
     C1[E] <- tphi[F[E]]
     C2 <- matrix(0,r2,r2)
     F <- col(C2)+row(C2)-1
     E <- (F<=p+1) & col(C2)>=2
     C2[E] <- tphi[F[E]]
     Gam <- (C1+C2)
     g <- matrix(0,r2,1)
     g[1] <- 1
     rU <- solve(Gam,g)
     SU <- toeplitz(rU[1:(p+q),1])
     ## End of the difference equations method
     ## Then, compute correlation matrix of X
     A2 <- matrix(0,p,p+q)
     C <- col(A2)-row(A2)+1
     B <- (1<=C)&(C<=q+1)
     A2[B] <- ttheta[C[B]]
     SX <- A2%*%SU%*%t(A2)
     ## Now, compute correlation matrix between X and Z
     C1 <- matrix(0,q,q)
     F <- row(C1)-col(C1)+1
     E <- (F>=1) & (F<=p+1)
     C1[E] <- tphi[F[E]]
     g <- matrix(0,q,1)
     if (q) g[1:q,1] <- ttheta[1:q]
     rXZ <- forwardsolve(C1,g)
     SXZ <- matrix(0, p, q+1)
     F <- col(SXZ)-row(SXZ)
     E <- F>=1
     SXZ[E] <- rXZ[F[E]]
     S <- rbind(cbind(SX,SXZ),cbind(t(SXZ),diag(q+1)))
   Q0 <- A%*%S%*%t(A)

The second way is to resolve brutally the equation of Gardner et al. in the form (12), page 314 of their paper.

Q0ter <- function(phi,theta){

   p <- length(phi)
   q <- length(theta)
   r <- max(p,q+1)
   T <- matrix(0,r,r)

   if (p) T[1:p,1] <- phi
   if (r) T[1:(r-1),2:r] <- diag(r-1)
   V <- matrix(0,r,r)
   ttheta <- c(1,theta)

   V[1:(q+1),1:(q+1)] <- ttheta%x%t(ttheta)    V <- matrix(V,ncol=1)
   S <- diag(r*r)-T%x%T
   Q0 <- solve(S,V)
   Q0 <- matrix(Q0,ncol=r)

My conclusion for now is that these two other ways give the same answers (as returned by all.equal) and sometimes different answers than getQ0. It may happen that they give a Q0 with negative eigenvalues, but they are very very small, and then, the likelihood computed by KalmanLike is a number (and not NaN). Here is a function allowing to compare the three methods
test <- function(phi,theta){

   out <- makeARIMA(phi,theta,NULL)
   Q0bis <- Q0bis(phi,theta)
   Q0ter <- Q0ter(phi,theta)
   mod <- out
   modbis <- out
   modter <- out
   modbis$Pn <- Q0bis
   modter$Pn <- Q0ter
   x <- arima.sim(100,model=list(ar=phi,ma=theta))    s <- KalmanLike(x,mod=mod,fast=FALSE)    sbis <- KalmanLike(x,modbis)
   ster <- KalmanLike(x,modter)

   test12 <- all.equal(out$Pn,Q0bis)
   test13 <- all.equal(out$Pn,Q0ter)
   test23 <- all.equal(Q0bis,Q0ter)

list(eigen=min(eigen(out$Pn)$value),eigenbis=min(eigen(Q0bis)$value),eigenter=min(eigen(Q0ter)$value),test12=test12,test13=test13,test23=test23,s=s,sbis=sbis,ster=ster) }

And the test on the values of phi and theta above: > test(phi,theta)

[1] -19.15890


[1] -9.680598e-23


[1] 1.859864e-23


[1] "Mean relative difference: 0.4255719"


[1] "Mean relative difference: 0.4255724"


[1] TRUE












Here are my questions:
1) Does someone understand this strange behaviour of Q0 ? 2) Should I report this as a bug ?

By the way, Q0bis is only twice slower than makeARIMA() (but it is not optimised), and Q0ter is 50 times slower than Q0bis.

For information:
> sessionInfo()

R version 2.10.1 (2009-12-14)


  [1] LC_CTYPE=fr_FR.utf8       LC_NUMERIC=C
  [3] LC_TIME=fr_FR.utf8        LC_COLLATE=fr_FR.utf8
  [5] LC_MONETARY=C             LC_MESSAGES=fr_FR.utf8
  [7] LC_PAPER=fr_FR.utf8       LC_NAME=C
  [9] LC_ADDRESS=C              LC_TELEPHONE=C

attached base packages:
[1] stats graphics grDevices utils datasets methods base


Raphael Rossignol

Assistant Professor of Mathematics
Univ. Paris-Sud 11

R-devel_at_r-project.org mailing list
Received on Wed 20 Jul 2011 - 16:08:57 GMT

This quarter's messages: by month, or sorted: [ by date ] [ by thread ] [ by subject ] [ by author ]

All messages

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Wed 20 Jul 2011 - 16:30:12 GMT.

Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-devel. Please read the posting guide before posting to the list.

list of date sections of archive