# Re: [R] Collinearity in nls problem

From: Spencer Graves <spencer.graves_at_pdf.com>
Date: Fri 03 Mar 2006 - 12:58:07 EST

Trying different parameterizations is often wise in nonlinear regression. However, I know of no general rule for finding a good one other than to try several and fit a paraboloid to the sum-of-squares surface in a region around the least squares solution: the best parameterization will be fairly close to parabolic. To do this, I've used "expand.grid" to get the points, then chopped off all points with sums of squares exceeding the minimum plus some number that should represent, say, a joint 99% confidence region. I also supplement this with contour or perspective plots: parameterizations whose paraboloid fits have the better R^2's usually also look more elliptical in contour plots. I've done this successfully to find a parameterization that will both speed up estimation AND provide reasonable accuracy with Wald approximate confidence intervals.
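A minimal sketch of that grid procedure, on a toy problem rather than Simon's model: the simulated single-exponential data, the helper names `rss` and `paraboloid_r2`, the grid half-width of 5 standard errors, and the F-based cutoff for the "joint 99%" trimming are all assumptions made for illustration.

```r
set.seed(1)
# Hypothetical data: single exponential decay with additive noise
tt <- seq(0, 10, length.out = 25)
y  <- 5 * exp(-0.4 * tt) + rnorm(length(tt), sd = 0.2)

fit <- nls(y ~ A * exp(-k * tt), start = list(A = 4, k = 0.3))
est <- coef(fit)
se  <- summary(fit)$coefficients[, "Std. Error"]
n <- length(y); p <- length(est)

# Residual sum of squares, always evaluated on the model scale (A, k)
rss <- function(A, k) sum((y - A * exp(-k * tt))^2)

# Assumed cutoff: approximate joint 99% region via the usual F bound
cutoff <- deviance(fit) * (1 + p / (n - p) * qf(0.99, p, n - p))

# R^2 of a full quadratic (paraboloid) fitted to the trimmed surface.
# `center` is the estimate in the candidate parameterization and
# `to_model` maps candidate coordinates back to c(A, k).
paraboloid_r2 <- function(center, half_width, to_model, n_grid = 41) {
  grid <- expand.grid(
    du = seq(-half_width[1], half_width[1], length.out = n_grid),
    dv = seq(-half_width[2], half_width[2], length.out = n_grid)
  )
  grid$ss <- mapply(function(du, dv) {
    th <- to_model(center[1] + du, center[2] + dv)
    rss(th[1], th[2])
  }, grid$du, grid$dv)
  keep <- grid[grid$ss <= cutoff, ]          # chop off points outside the region
  quad <- lm(ss ~ du + dv + I(du^2) + I(dv^2) + I(du * dv), data = keep)
  c(points = nrow(keep), r2 = summary(quad)$r.squared)
}

# Candidate 1: original scale (A, k). Candidate 2: (log A, log k),
# with half-widths from a delta-method approximation se / estimate.
r2_orig <- paraboloid_r2(est, 5 * se, function(u, v) c(u, v))
r2_log  <- paraboloid_r2(log(est), 5 * se / est, function(u, v) exp(c(u, v)))
print(rbind(original = r2_orig, log_scale = r2_log))
```

The parameterization with the R^2 closer to 1 is the one whose surface is nearer to parabolic over the trimmed region; which one wins depends on the data and the model.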

Even without that, however, we can still get good, joint confidence regions in the form of contour plots of the sums of squares surface: The validity of these confidence regions is only affected by the intrinsic curvature of the problem, and is not affected by the parameterization. Of course, if we select a strange parameterization, our confidence regions will not look very elliptical (and our univariate confidence intervals may be far from symmetric).
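As a rough illustration of such a contour plot, again on a hypothetical single-exponential fit and not Simon's changepoint model: the contour levels below use the common approximation S(theta) <= S(theta-hat) * [1 + p/(n-p) * F(conf; p, n-p)], and the data, grid limits, and variable names are assumptions.

```r
set.seed(1)
# Hypothetical data: single exponential decay with additive noise
tt <- seq(0, 10, length.out = 25)
y  <- 5 * exp(-0.4 * tt) + rnorm(length(tt), sd = 0.2)

fit <- nls(y ~ A * exp(-k * tt), start = list(A = 4, k = 0.3))
est <- coef(fit)
se  <- summary(fit)$coefficients[, "Std. Error"]
n <- length(y); p <- length(est)

# Grid of parameter values around the estimate (5 standard errors each way)
A_seq <- seq(est["A"] - 5 * se["A"], est["A"] + 5 * se["A"], length.out = 101)
k_seq <- seq(est["k"] - 5 * se["k"], est["k"] + 5 * se["k"], length.out = 101)

# Sum-of-squares surface evaluated on the grid
rss <- outer(A_seq, k_seq,
             Vectorize(function(A, k) sum((y - A * exp(-k * tt))^2)))

# Sum-of-squares levels bounding approximate joint confidence regions
conf      <- c(0.50, 0.95, 0.99)
ss_levels <- deviance(fit) * (1 + p / (n - p) * qf(conf, p, n - p))

contour(A_seq, k_seq, rss, levels = ss_levels,
        labels = paste0(100 * conf, "%"), xlab = "A", ylab = "k")
points(est["A"], est["k"], pch = 3)   # least squares solution
```

Redrawing the same contours in another parameterization only relabels the axes and reshapes the curves; the set of parameter values inside each contour stays the same.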

My favorite reference for this kind of thing is Bates and Watts (1988) Nonlinear Regression Analysis and Its Applications (Wiley).

Hope this helps.
Spencer Graves

Simon Frost wrote:

> Dear R-Help list,
>
> I have a nonlinear least squares problem, which involves a changepoint;
> at the beginning, the outcome y is constant, and after a delay, t0, y
> follows a biexponential decay. I log-transform the data, to stabilize
> the error variance. At time t < t0, my model is
>
> log(y_i)=log(exp(a0)+exp(b0))
>
> at time t >= t0, the model is
>
> log(y_i)=log(exp(a0-a1*(t_i - t0))+exp(b0-b1*(t_i - t0)))
>
> I thought that I would have identifiability issues, but this model seems
> to work fine except that the parameter t0 (the delay) is highly
> correlated with the initial decay slope a1 (which makes sense, as the
> longer the delay, the more rapid the drop has to be, conditional on the
> data).
>
> To get over this problem, I could reparameterize the problem, but it
> isn't clear to me how to do this for the above model. I also thought
> about using a penalized least square approach, to shrink t0 and a1
> towards 0. I haven't seen much on penalized least squares in a nonlinear
> least squares setting; is this a good way to go? Can I justifiably
> penalize only t0 and a1, or should I also penalize the other parameters?
>
> Thanks for any help!
> Simon

R-help@stat.math.ethz.ch mailing list