[Rd] Arima_Like() and NaN - a (possible) problem, a patch, and RFC

From: Mauro Andreolini <mauro.andreolini_at_unimore.it>
Date: Mon, 02 Feb 2009 12:29:13 +0100


Hi,

recently I have started working with R (v. 2.7.2), and I have been using R's internal ARIMA_Like() function (from the "stats" package) to estimate some ARIMA models. In particular, I use ARIMA_Like() in a function "fn()" that I feed to the optim() method; the main goal is to find optimal ARIMA prediction models for some time series. The ARIMA_Like() function returns a three elements vector; under some conditions (that I could not yet spot), the second element of this vector is a 'NaN'. Since fn() is using this value to compute its return value, it suddenly returns 'NaN' and optim() warns me about it:

Error in optim(init[mask], armafn, method = "BFGS", hessian = TRUE, control = optim.control, :
  non-finite finite-difference value [2]

I looked into the code (file src/arima.c of the stats package) and noticed that this second element is a sum of logarithmic terms, computed through the following snippet of code:

gain = M[0];
for (j = 0; j < d; j++) gain += delta[j] * M[r + j]; if(gain < 1e4) {

    nu++;
    ssq += resid * resid / gain;
    sumlog += log(gain);
}

Here, sumlog is the second element of the resulting vector. However, the "if(gain < 1e4) {" check does not explicitly check against negative values of the gain variable. Indeed, whenever the gain variable assumes a negative value, the statement "sumlog += log(gain);" evalutes to NaN. I changed the check as follows:

if (gain > 0 && gain < 1e4) {

This avoids computation of logarithms on negative values. I recompiled and reinstalled R, and the sumlog value is no more 'NaN'. As a result, optim() never warns about the non-finite finite-difference value.

Here is my question: does this modification make any sense? Have I missed something big? To me, it looks reasonable to avoid computing log(x) when x < 0, but maybe returning 'NaN' may have its purposes. Could someone please clarify this? I searched the mailing list archives and I could not spot anything even close to this argument, which may be an indication that I am doing something really wrong, but I would like to understand why.

Best regards
Mauro Andreolini



R-devel_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel Received on Mon 02 Feb 2009 - 12:28:34 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Mon 02 Feb 2009 - 12:30:25 GMT.

Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-devel. Please read the posting guide before posting to the list.

list of date sections of archive