# [R] AIC in R

From: Pierre Duchesne <duchesne_at_dms.umontreal.ca>
Date: Thu 28 Sep 2006 - 21:45:54 GMT

Dear R users,

According to Brockwell & Davis (1991, Section 9.3, p.304), the penalty term for computing the AIC is "p+q+1" in the context of a zero-mean ARMA(p,q) time series model. They arrived at this criterion (with this particular penalty term) by estimating the Kullback-Leibler discrepancy index. In practice, the user usually chooses the model whose estimated index is minimum. Consequently, it seems that the theory and the interpretation are only available in the case of a zero-mean ARMA model, at least in the time series context.

Concerning R, it seems that the penalty term is p+q+1 in a zero-mean model, and p+q+1+1 = p+q+2 for an ARMA(p,q) model with a constant term. See the following examples:

```
set.seed(1)
serieAR1 = arima.sim(100,model=list(ar= 0.5))

fit1AR1 = arima(serieAR1, order = c(0, 0, 0), include.mean = T)
fit2AR1 = arima(serieAR1, order = c(1, 0, 0), include.mean = T)
fit3AR1 = arima(serieAR1, order = c(1, 0, 0), include.mean = F)
fit4AR1 = arima(serieAR1, order = c(1, 0, 1), include.mean = T)
fit5AR1 = arima(serieAR1, order = c(1, 0, 1), include.mean = F)

-2* fit1AR1$loglik + 2*(1+1)
fit1AR1$aic

-2* fit2AR1$loglik + 2*(1+1+1)
fit2AR1$aic

-2* fit3AR1$loglik + 2*(1+1)
fit3AR1$aic

-2* fit4AR1$loglik + 2*(1+1+1+1)
fit4AR1$aic

-2* fit5AR1$loglik + 2*(1+1+1)
fit5AR1$aic
```

> set.seed(1)
> serieAR1 = arima.sim(100,model=list(ar= 0.5))
>
> fit1AR1 = arima(serieAR1, order = c(0, 0, 0), include.mean = T)
> fit2AR1 = arima(serieAR1, order = c(1, 0, 0), include.mean = T)
> fit3AR1 = arima(serieAR1, order = c(1, 0, 0), include.mean = F)
> fit4AR1 = arima(serieAR1, order = c(1, 0, 1), include.mean = T)
> fit5AR1 = arima(serieAR1, order = c(1, 0, 1), include.mean = F)
>
> -2* fit1AR1$loglik + 2*(1+1)
[1] 297.4670
> fit1AR1$aic
[1] 297.4670
>
> -2* fit2AR1$loglik + 2*(1+1+1)
[1] 270.5381
> fit2AR1$aic
[1] 270.5381
>
> -2* fit3AR1$loglik + 2*(1+1)
[1] 270.6653
> fit3AR1$aic
[1] 270.6653
>
> -2* fit4AR1$loglik + 2*(1+1+1+1)
[1] 272.3530
> fit4AR1$aic
[1] 272.3530
>
> -2* fit5AR1$loglik + 2*(1+1+1)
[1] 272.5564
> fit5AR1$aic
[1] 272.5564
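
The hand-written penalties above can also be derived from the fitted objects themselves. The following is only a sketch, assuming that the penalty equals the number of estimated coefficients (ARMA terms plus the intercept, when present) plus one for the innovations variance; `manual_aic` is a hypothetical helper written for this message, not an R function.

```r
# Hypothetical helper (not part of R): recompute the AIC from a fitted
# arima object, assuming the penalty counts every estimated coefficient
# plus one for the innovations variance sigma^2.
manual_aic <- function(fit) {
  n_par <- length(coef(fit)) + 1
  -2 * fit$loglik + 2 * n_par
}

set.seed(1)
x <- arima.sim(model = list(ar = 0.5), n = 100)

# Same five specifications as in the examples above.
orders <- list(c(0, 0, 0), c(1, 0, 0), c(1, 0, 0), c(1, 0, 1), c(1, 0, 1))
means  <- c(TRUE, TRUE, FALSE, TRUE, FALSE)

fits <- Map(function(ord, m) arima(x, order = ord, include.mean = m),
            orders, means)

# Compare the recomputed value with the one stored in the fit.
check <- data.frame(
  n_coef = sapply(fits, function(f) length(coef(f))),
  manual = sapply(fits, manual_aic),
  stored = sapply(fits, function(f) f$aic)
)
print(check)
```

If the `manual` and `stored` columns agree for every row, the intercept is being counted as one more free parameter, which is the behaviour described above.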

From the help file of extractAIC(), it seems that the criterion used is:
AIC = - 2*log L + k * edf,
where L is the likelihood and 'edf' the equivalent degrees of freedom (i.e., the number of free parameters for usual parametric models) of 'fit'.
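
For an arima fit, the parameter count that plays the role of 'edf' can be inspected through the "df" attribute of `logLik()`. The sketch below assumes that `AIC()` applied to an arima fit uses this attribute and that k = 2 gives the usual AIC; the variable names are illustrative only.

```r
set.seed(1)
x <- arima.sim(model = list(ar = 0.5), n = 100)
fit <- arima(x, order = c(1, 0, 1), include.mean = TRUE)

ll  <- logLik(fit)
edf <- attr(ll, "df")   # parameter count attached to the log-likelihood
edf                     # ar1, ma1, intercept and sigma^2 are the candidates

# Rebuild the criterion as -2*log L + k*edf with k = 2.
k <- 2
manual <- -2 * as.numeric(ll) + k * edf

all.equal(manual, AIC(fit, k = k))
all.equal(manual, fit$aic)
```

Changing `k` (for example to `log(length(x))`) gives other penalised criteria built on the same parameter count.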

My question is: is there any justification for computing the AIC as done by R when a constant term is in the model?

Best regards,
Pierre

Note: for differenced time series (d ≥ 1), the penalty term seems to be p+q+1, and there is no constant term in the fit.
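
The behaviour for differenced series can be checked with a small sketch. It assumes simulated data obtained by integrating an AR(1) series once and twice (my own choice of test series), and it requests `include.mean = TRUE` explicitly to see whether an intercept appears in the fit.

```r
set.seed(1)
e  <- arima.sim(model = list(ar = 0.5), n = 100)
x1 <- cumsum(e)            # integrated once  (d = 1)
x2 <- cumsum(cumsum(e))    # integrated twice (d = 2)

fit_d1 <- arima(x1, order = c(1, 1, 0), include.mean = TRUE)
fit_d2 <- arima(x2, order = c(1, 2, 0), include.mean = TRUE)

# Which coefficients were actually estimated?
names(coef(fit_d1))
names(coef(fit_d2))

# Penalty p + q + 1 with p = 1, q = 0.
all.equal(fit_d1$aic, -2 * fit_d1$loglik + 2 * (1 + 0 + 1))
all.equal(fit_d2$aic, -2 * fit_d2$loglik + 2 * (1 + 0 + 1))
```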

Pierre Duchesne,
Département de mathématiques et statistique, Université de Montréal,
CP 6128 Succ. Centre-Ville,