Re: [Rd] The constant part of the log-likelihood in StructTS

From: Mark Leeds <markleeds2_at_gmail.com>
Date: Wed, 02 May 2012 11:36:37 -0400

Hi Ravi: As far as I know (well, really from what I've read), and Bert et al. can say more, AIC does not require the models to be nested, as long as the same sample size is used for every model being compared. In some cases, say comparing an MA(2) with an AR(1), you have to be careful about the sample size actually used, but I'm pretty sure there is no nesting requirement for AIC at least.

So Jouni's worry, I think, should be the different likelihoods. Jouni: there are ways of rewriting ARIMA models as StructTS-type (structural) models, which might be easier than trying to make the likelihoods consistent across different packages and base R. StructTS is really a specific DLM as far as I understand it, so you may be better off going to the DLM package. The DLM likelihoods still will not necessarily be consistent with the arima() likelihoods, but there are ways of transforming ARIMA models so that they can be written as DLMs, so you can use DLM for those also. My point is that if you're comparing likelihoods of different models, it's best, if possible, to use ONE package/function so that you don't use different likelihoods by accident.
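
For what it's worth, here is a minimal sketch of that last point, assuming the Nile series from the datasets package and exact maximum likelihood (method = "ML") in arima(). The AR(1) and MA(2) candidates are only illustrations, not models anyone in this thread has proposed for these data. Both fits come from the same function, so the constants agree, and the nobs component shows how many observations went into each fit:

```r
data(Nile)

# Two non-nested candidates, both fitted by the same function and method,
# so both log-likelihoods carry the same constant term.
fit_ar1 <- arima(Nile, order = c(1, 0, 0), method = "ML")
fit_ma2 <- arima(Nile, order = c(0, 0, 2), method = "ML")

# Number of observations used by each fit; these should match
# before the information criteria are compared.
c(AR1 = fit_ar1$nobs, MA2 = fit_ma2$nobs)

# Information criteria on a common likelihood scale.
AIC(fit_ar1, fit_ma2)
BIC(fit_ar1, fit_ma2)
```

If the two nobs values differed (for instance after differencing only one of the models), the criteria would no longer be directly comparable.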

Mark

Also, not sure why this is on R-devel?

On Wed, May 2, 2012 at 11:19 AM, Ravi Varadhan <rvaradhan_at_jhmi.edu> wrote:

> Comparing such disparate, non-nested models can be quite problematic. I
> am not sure what AIC/BIC comparisons mean in such cases. The issue of
> different constants should be the least of your worries.
>
> Ravi
>
> -----Original Message-----
> From: r-devel-bounces_at_r-project.org [mailto:r-devel-bounces_at_r-project.org]
> On Behalf Of Jouni Helske
> Sent: Tuesday, May 01, 2012 2:17 PM
> To: r-devel_at_r-project.org
> Subject: Re: [Rd] The constant part of the log-likelihood in StructTS
>
> OK, it seems that R's AIC and BIC functions warn about different
> constants, so that's probably enough. The constants are not irrelevant,
> though: if you compute the log-likelihood of one model using StructTS, then
> fit an alternative model using other functions such as arima(), which do
> take the constant term into account, and use those log-likelihoods to
> compute, for example, BIC, you get wrong results when checking which model
> gives the lower BIC value. I hadn't thought about it before; I'll have to be
> more careful in future when checking results from different packages etc.
>
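A rough worked example of how large the mismatch can be, assuming the two forms of the constant quoted further down in this thread (n * log(2 * pi) in the reported StructTS line versus -0.5 * n * log(2 * pi) in the textbook form) and n = 100 for the Nile series. The variable names are only for illustration:

```r
n <- 100                                   # number of observations in the Nile series
const_reported <- n * log(2 * pi)          # constant as in the StructTS line quoted in this thread
const_textbook <- -0.5 * n * log(2 * pi)   # constant as in the Durbin and Koopman form
offset <- const_reported - const_textbook  # equals 1.5 * n * log(2 * pi)

offset              # about 275.68 on the log-likelihood scale
-367.5194 - offset  # reported StructTS value moved to the textbook scale, about -643.2
2 * offset          # corresponding shift in AIC or BIC
```

So the adjusted value lands in the same range as the figure of around -645 reported from the other packages, and an AIC or BIC built from the unadjusted value would look better by roughly 551 than it should.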
> Jouni
>
>
> On Tue, May 1, 2012 at 4:51 PM, Ravi Varadhan <rvaradhan_at_jhmi.edu> wrote:
>
> > This is not a problem at all. The log likelihood function is a
> > function of the model parameters and the data, but it is defined up to
> > an additive arbitrary constant, i.e. L(\theta) and L(\theta) + k are
> > completely equivalent, for any k. This does not affect model
> > comparisons or hypothesis tests.
> >
> > Ravi
> > ________________________________________
> > From: r-devel-bounces_at_r-project.org [r-devel-bounces_at_r-project.org] on
> > behalf of Jouni Helske [jounihelske_at_gmail.com]
> > Sent: Monday, April 30, 2012 7:37 AM
> > To: r-devel_at_r-project.org
> > Subject: [Rd] The constant part of the log-likelihood in StructTS
> >
> > Dear all,
> >
> > I'd like to discuss a possible bug in the function StructTS of the stats
> > package. It seems that the function returns the wrong value of the
> > log-likelihood, as the constant added to the relevant part of the
> > log-likelihood is misspecified. Here is a simple example:
> >
> > > data(Nile)
> > > fit <- StructTS(Nile, type = "level")
> > > fit$loglik
> > [1] -367.5194
> >
> > When computing the log-likelihood with other packages such as KFAS and
> > FKF, the loglikelihood value is around -645.
> >
> > For the local level model, the log-likelihood is defined by
> >
> >   -0.5*n*log(2*pi) - 0.5*sum(log(F_t) + v_t^2/F_t)
> >
> > (see, for example, Durbin and Koopman (2001), page 30). But in StructTS,
> > the likelihood is computed like this:
> >
> > loglik <- -length(y) * res$value + length(y) * log(2 * pi),
> >
> > where the first part coincides with the last part of the definition,
> > but the constant part has wrong sign and it is not multiplied by 0.5.
> > Also in case of missing observations, I think there should be
> > sum(!is.na(y)) instead of length(y) in the constant term, as the
> > likelihood is only computed for those y which are observed.
> >
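A minimal sketch of the textbook form for the local level model, written as a plain Kalman filter in R. The function name local_level_loglik and its arguments are hypothetical, and the large initial variance P1 = 1e7 is only a crude stand-in for a proper diffuse initialization, so the result should not be expected to match KFAS or FKF exactly. Missing observations are skipped, so the constant is effectively counted over sum(!is.na(y)) terms:

```r
# Gaussian log-likelihood of the local level model via the Kalman filter.
#   y_t      = mu_t + eps_t,  eps_t ~ N(0, s2_eps)
#   mu_{t+1} = mu_t + eta_t,  eta_t ~ N(0, s2_eta)
# a1 and P1 are the assumed mean and variance of the initial level.
local_level_loglik <- function(y, s2_eps, s2_eta, a1 = 0, P1 = 1e7) {
  a  <- a1
  P  <- P1
  ll <- 0
  for (i in seq_along(y)) {
    if (!is.na(y[i])) {
      v  <- y[i] - a        # one-step prediction error v_t
      Ft <- P + s2_eps      # its variance F_t
      ll <- ll - 0.5 * (log(2 * pi) + log(Ft) + v^2 / Ft)
      K  <- P / Ft          # Kalman gain
      a  <- a + K * v       # filtered level
      P  <- P * (1 - K)     # filtered variance
    }
    P <- P + s2_eta         # prediction step: level unchanged, variance grows
  }
  ll
}

data(Nile)
fit <- StructTS(Nile, type = "level")

# Assumption: fit$coef is a named vector of variances ("level", "epsilon").
local_level_loglik(as.numeric(Nile),
                   s2_eps = fit$coef[["epsilon"]],
                   s2_eta = fit$coef[["level"]])
```

Because the constant is accumulated inside the loop, one -0.5 * log(2 * pi) term is added per observed value, which is the sum(!is.na(y)) behaviour suggested above.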
> > This does not affect the estimation of the model parameters, but it could
> > have an effect on model comparison or in some other cases.
> >
> > Is there some reason for this kind of constant, or is it just a bug?
> >
> > Best regards,
> >
> > Jouni Helske
> > PhD student in Statistics
> > University of Jyväskylä
> > Finland
> >
>
> ______________________________________________
> R-devel_at_r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

R-devel_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Received on Wed 02 May 2012 - 15:43:39 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Thu 03 May 2012 - 21:50:55 GMT.
