[R] AIC in R

From: Pierre Duchesne <duchesne_at_dms.umontreal.ca>
Date: Thu 28 Sep 2006 - 21:45:54 GMT


Dear R users,

According to Brockwell & Davis (1991, Section 9.3, p. 304), the penalty term in the AIC criterion is "p+q+1" for a zero-mean ARMA(p,q) time series model. They arrive at this criterion (with this particular penalty term) by estimating the Kullback-Leibler discrepancy index. In practice, the user usually chooses the model whose estimated index is smallest. Consequently, it seems that the theory and its interpretation are available only for a zero-mean ARMA model, at least in the time series context.
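For reference, the zero-mean criterion described above can be written as follows. This is only a sketch in generic notation (the hats denote maximum likelihood estimates, and the symbols are my own shorthand rather than a quotation from the book); the "+1" in the penalty is usually read as accounting for the white-noise variance:

```latex
\mathrm{AIC} = -2 \ln L(\hat{\phi}, \hat{\theta}, \hat{\sigma}^2) + 2\,(p + q + 1)
```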

Concerning R, it seems that the penalty term is p+q+1 for a zero-mean model, and p+q+1+1 = p+q+2 for an ARMA(p,q) model with a constant term. See the following examples:



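set.seed(1)
# Simulate 100 observations from a zero-mean AR(1) process with coefficient 0.5
serieAR1 = arima.sim(n = 100, model = list(ar = 0.5))
fit1AR1 = arima(serieAR1, order = c(0, 0, 0), include.mean = TRUE)   # white noise, with mean
fit2AR1 = arima(serieAR1, order = c(1, 0, 0), include.mean = TRUE)   # AR(1), with mean
fit3AR1 = arima(serieAR1, order = c(1, 0, 0), include.mean = FALSE)  # AR(1), zero mean
fit4AR1 = arima(serieAR1, order = c(1, 0, 1), include.mean = TRUE)   # ARMA(1,1), with mean
fit5AR1 = arima(serieAR1, order = c(1, 0, 1), include.mean = FALSE)  # ARMA(1,1), zero mean

# Each hand computation below is compared with the AIC stored in the fit.
# Penalty: mean + innovation variance
-2* fit1AR1$loglik + 2*(1+1)
fit1AR1$aic

# Penalty: ar + mean + innovation variance
-2* fit2AR1$loglik + 2*(1+1+1)
fit2AR1$aic

# Penalty: ar + innovation variance
-2* fit3AR1$loglik + 2*(1+1)
fit3AR1$aic

# Penalty: ar + ma + mean + innovation variance
-2* fit4AR1$loglik + 2*(1+1+1+1)
fit4AR1$aic

# Penalty: ar + ma + innovation variance
-2* fit5AR1$loglik + 2*(1+1+1)
fit5AR1$aic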

> set.seed(1)
> serieAR1 = arima.sim(100,model=list(ar= 0.5))
>
> fit1AR1 = arima(serieAR1, order = c(0, 0, 0), include.mean = T)
> fit2AR1 = arima(serieAR1, order = c(1, 0, 0), include.mean = T)
> fit3AR1 = arima(serieAR1, order = c(1, 0, 0), include.mean = F)
> fit4AR1 = arima(serieAR1, order = c(1, 0, 1), include.mean = T)
> fit5AR1 = arima(serieAR1, order = c(1, 0, 1), include.mean = F)
>
> -2* fit1AR1$loglik + 2*(1+1)
[1] 297.4670
> fit1AR1$aic
[1] 297.4670
>
> -2* fit2AR1$loglik + 2*(1+1+1)
[1] 270.5381
> fit2AR1$aic
[1] 270.5381
>
> -2* fit3AR1$loglik + 2*(1+1)
[1] 270.6653
> fit3AR1$aic
[1] 270.6653
>
> -2* fit4AR1$loglik + 2*(1+1+1+1)
[1] 272.3530
> fit4AR1$aic
[1] 272.3530
>
> -2* fit5AR1$loglik + 2*(1+1+1)
[1] 272.5564
> fit5AR1$aic
[1] 272.5564


From the help file of extractAIC(), it seems that the criterion used is:
AIC = - 2*log L + k * edf,
where L is the likelihood and 'edf' the equivalent degrees of freedom (i.e., the number of free parameters for usual parametric models) of 'fit'.
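This formula can be checked mechanically against what arima() reports. The following is a minimal sketch, assuming that for an arima() fit with no fixed parameters the 'edf' equals the number of estimated coefficients plus one for the innovation variance; `manual_aic` is a hypothetical helper name of my own, not a function provided by R:

```r
# Hypothetical helper: recompute AIC = -2*log L + k*edf for an arima() fit,
# assuming edf = number of estimated coefficients + 1 (innovation variance)
# and that no coefficient was held fixed.
manual_aic <- function(fit, k = 2) {
  edf <- length(coef(fit)) + 1
  -2 * fit$loglik + k * edf
}

set.seed(1)
x <- arima.sim(n = 100, model = list(ar = 0.5))
fit_mean   <- arima(x, order = c(1, 0, 0), include.mean = TRUE)
fit_nomean <- arima(x, order = c(1, 0, 0), include.mean = FALSE)

# If the assumption about edf holds, each pair of numbers agrees.
c(manual = manual_aic(fit_mean),   reported = fit_mean$aic)
c(manual = manual_aic(fit_nomean), reported = fit_nomean$aic)
```

With k = 2 this is the usual AIC. Under the stated assumption, the intercept is counted like any other estimated coefficient, which is where the extra "+1" for a model with a constant term comes from.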

My question is: is there any justification for computing the AIC as done by R when a constant term is in the model?

Your help will be appreciated.

Best regards,
Pierre

Note: for differenced time series (d >= 1), the penalty term seems to be p+q+1, and there is no constant term in the fit.
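The note can be checked in the same way. This is a sketch only, relying on the documented behaviour of arima() that include.mean is ignored for models with differencing; the simulated series and the object names (`y`, `fit_d1`, `penalty`) are illustrative, and the example uses a once-differenced model (d = 1):

```r
set.seed(1)
# Simulate an ARIMA(1,1,0) series and fit the same model,
# asking for a mean even though the model is differenced.
y <- arima.sim(n = 100, model = list(order = c(1, 1, 0), ar = 0.5))
fit_d1 <- arima(y, order = c(1, 1, 0), include.mean = TRUE)

names(coef(fit_d1))                        # only "ar1": no intercept is estimated
penalty <- 2 * (length(coef(fit_d1)) + 1)  # 2 * (p + q + 1) with p = 1, q = 0
c(manual = -2 * fit_d1$loglik + penalty, reported = fit_d1$aic)
```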



Pierre Duchesne,
Département de mathématiques et statistique, Université de Montréal,
CP 6128 Succ. Centre-Ville,
Montréal, Québec, Canada H3C 3J7.

R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Received on Fri Sep 29 08:00:51 2006