[R] help with ARIMA and predict

From: Brian Scholl <brianscholl1973_at_yahoo.com>
Date: Sat 09 Jul 2005 - 09:06:56 EST


I'm trying to run the following out-of-sample regression with autoregressive terms and additional x variables:

y(t+1)=const+B(L)*y(t)+C(1)*x_1(t)...+C(K)*x_K(t)

where:
B(L) = lag polynomial for the AR terms
C(1..K) = coefficients on the K exogenous variables, each of which enters with only 1 lag

Question 1:


Suppose I use arima to fit the model:

df.y <- arima(yvec, order = c(L, 0, 0), xreg = xmat[, 1:K], n.cond = maximum.lag)

Now suppose I want to do a 1-period ahead prediction based on the results of this regression, using predict:

predict(df.y, newxreg = newx, n.ahead = 1)

I'm expecting newx to be 1 x 3. After all, I just want to predict 1 value of y, so in my mind I should need only 1 time period's observation of x (i.e. number of rows = n.ahead). I'm also expecting predict to grab the last two values of yvec to use as y(t), y(t-1) in the prediction. If I pass such a newx, I get:

Error in predict.Arima(df.y, newxreg = newx) : 'xreg' and 'newxreg' have different numbers of columns

If I try passing 2+ rows of x, predict accepts the call and I get:

Time Series:
Start = 41
End = 42
Frequency = 1
[1] -0.03165 -0.03165
(For simplicity I passed two identical rows of x.)

$se
Time Series:
Start = 41
End = 41
Frequency = 1
[1] 0.02707

So I'm puzzled as to what I'm doing wrong. When newxreg has n.ahead rows, I get an error, but when I pass a second row the call is accepted. What am I predicting in the latter case? Is R requiring another row so that it can form a prediction of y(t) to use in forecasting y(t+1) (which is not what I want to do), or have I simply goofed in some other way?
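
One possible explanation, offered only as a sketch: if newx was built by pulling a single row out of a matrix without drop = FALSE, R turns it into a plain vector, NCOL() then reports 1 column instead of K, and predict stops with the column-count error. The example below reproduces that on simulated data. The data, the sample size n = 100, the coefficient values and the names xfit, bad and fc are all made up for illustration and are not from the original problem.

```r
set.seed(1)
n <- 100                      # illustrative sample size (assumption)
K <- 3                        # number of exogenous regressors
xmat <- matrix(rnorm((n + 1) * K), ncol = K,
               dimnames = list(NULL, paste0("x", 1:K)))
beta <- c(0.3, -0.1, 0.2)     # made-up regression coefficients
# Simulated response: regression on x plus AR(2) noise.
yvec <- drop(xmat[1:n, ] %*% beta) +
  as.numeric(arima.sim(list(ar = c(0.5, -0.2)), n = n))

xfit <- xmat[1:n, ]           # regressors used for estimation
fit <- arima(yvec, order = c(2, 0, 0), xreg = xfit)

# Subsetting one row without drop = FALSE yields a plain vector,
# for which NCOL() is 1 rather than K, so predict() rejects it.
bad <- tryCatch(predict(fit, newxreg = xmat[n + 1, ], n.ahead = 1),
                error = function(e) e)

# Keeping the matrix shape gives a 1 x K newxreg and one forecast.
newx <- xmat[n + 1, , drop = FALSE]
fc <- predict(fit, newxreg = newx, n.ahead = 1)
print(fc)
```

If this is what happened, matrix(newx, nrow = 1) would be an equivalent way to restore the 1 x K shape before calling predict.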

Is there a better way to do this? I've also attempted something similar using lm, but I'm unclear how to interpret the "predicted" time series it returns. The obvious alternative is to construct the forecast using df.y$coef and a relevant data vector.
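
A hedged sketch of that manual alternative follows. It relies on the documented behaviour that arima with xreg fits a linear regression with an ARMA model for the error term, so the AR coefficients apply to the regression residuals rather than to lagged y directly; this differs from the equation written at the top of this message. The simulated data and the names xfit, u, manual and auto are illustrative assumptions.

```r
set.seed(1)
n <- 100; K <- 3; p <- 2      # illustrative sizes (assumptions)
xmat <- matrix(rnorm((n + 1) * K), ncol = K,
               dimnames = list(NULL, paste0("x", 1:K)))
yvec <- drop(xmat[1:n, ] %*% c(0.3, -0.1, 0.2)) +
  as.numeric(arima.sim(list(ar = c(0.5, -0.2)), n = n))
xfit <- xmat[1:n, ]
fit <- arima(yvec, order = c(p, 0, 0), xreg = xfit)

cf <- coef(fit)
phi <- cf[paste0("ar", 1:p)]          # AR coefficients
mu <- cf["intercept"]                 # regression intercept
beta <- cf[colnames(xfit)]            # coefficients on x

# Regression residuals u(t) = y(t) - mu - x(t)' beta follow the AR(p) model.
u <- yvec - mu - drop(xfit %*% beta)
u.next <- sum(phi * u[n:(n - p + 1)])  # AR forecast of u(n + 1)

newx <- xmat[n + 1, , drop = FALSE]
manual <- unname(mu + drop(newx %*% beta) + u.next)
auto <- as.numeric(predict(fit, newxreg = newx, n.ahead = 1)$pred)
print(c(manual = manual, predict = auto))
```

For a pure AR error model the two numbers should agree up to numerical tolerance, which gives a quick check that predict is using the last p observations as expected.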

Question 2:

Suppose I want to select the autoregressive order using AIC. If I have understood correctly, the excellent MASS text comments (p. 415) that comparisons are only valid if n.cond is the same for each model. Yet when I set n.cond=maximum.lag (say, 5), I get df.y$n.cond = 0. So I'm unclear whether the AICs are comparable across models (i.e. different L's and different K's).
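
For reference, the arima help page states that n.cond is only used when fitting by conditional sum of squares, and that the reported aic is only valid for ML fits. A minimal sketch of order selection under that reading, assuming exact maximum likelihood (method = "ML") is acceptable, is shown below. The toy series y and the names max.lag, aics and best.p are assumptions for illustration; exogenous regressors could be passed through xreg in the same way.

```r
set.seed(2)
y <- as.numeric(arima.sim(list(ar = c(0.6, -0.3)), n = 200))  # toy AR(2) data
max.lag <- 5                                                  # largest order tried

# Fit AR(p) for p = 0..max.lag by exact ML, so each fit uses all observations.
aics <- sapply(0:max.lag, function(p) {
  arima(y, order = c(p, 0, 0), method = "ML")$aic
})
names(aics) <- paste0("p=", 0:max.lag)
best.p <- (0:max.lag)[which.min(aics)]
print(round(aics, 2))
print(best.p)
```

Because every fit here is based on the same set of observations, the AIC values are computed on a common footing across orders.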

Received on Sat Jul 09 09:11:18 2005
