# Re: [R] suggestions for nls error: false convergence

From: Spencer Graves <spencer.graves_at_pdf.com>
Date: Mon 19 Dec 2005 - 07:16:48 EST

I generally prefer "optim" to "nlminb", because it will optionally return the hessian, and a review of the eigenvalues and vectors can help diagnose problems like this. When I tried "optim" with your "func", I got an error message:

> est. <- optim( c(277, 100,101, 10), func,

```+                hessian=TRUE)
Error in optim(c(277, 100, 101, 10), func, hessian = TRUE) :
objective function in optim evaluates to length 100 not 1

For both "optim" and "nlminb", the second argument is a "function to
```
be minimized", which must return a scalar. Your "func" returned a vector of length 100 = length(x); "nlminb" tried to minimize only the first component of that vector without reference to y!

I therefore modified your "func" as follows and tried it with "nlminb" and with three methods of "optim":

func1 <- function( par,y, x ) {

```a = par
m = par
n = par
```

tau = par
y. <- a * (1+m*exp(-x/tau)) / (1+n*exp(-x/tau)) sum((y-y.)^2)
}

est.n <- nlminb( c(277, 100,101, 10), objective=func1, control=list(eval.max=400, iter.max=1000), y=y, x=x)

est.o1 <- optim( c(277, 100,101, 10), func1,

```                hessian=TRUE, y=y, x=x)
est.o2 <- optim( c(277, 100,101, 10), func1,
method="BFGS", hessian=TRUE, y=y, x=x)
est.o3 <- optim( c(277, 100,101, 10), func1,
method="CG", hessian=TRUE, y=y, x=x)
est.o4 <- optim( c(277, 100,101, 10), func1,
method="SANN", hessian=TRUE, y=y, x=x)

These 5 optimizers found the following "minima", returning
```
"convergence" messages as follows:

nlminb: \$objective=808.7282; \$convergence=0, "relative convergence (4)"

optim(method="Nelder-Mead"(default)): \$value=9032.814 (11 times nlminb); \$convergence=1 (the iteration limit 'maxit' had been reached.)

optim(method="BFGS"): \$value=1189625 (1471 times nlminb); \$convergence=0 ("successful convergence")

optim(method="CG"): \$value=1189574 (1471 times nlminb); \$convergence=1 (the iteration limit 'maxit' had been reached.)

optim(method="SANN"): \$value=42819.06 (53 times nlminb); \$convergence=0 ("successful convergence")

Clearly, nlminb got the smallest residual sum of squares. To get the hessian there, I fed the nlminb output into optim as follows:

est.no1 <- optim(est.n\$par, func1,

```                hessian=TRUE, y=y, x=x)
```

est.no.hess.eig <- eigen(est.no1\$hessian, symmetric=TRUE) signif(est.no.hess.eig\$values, 2)
 6.9e+05 3.2e+01 1.6e-04 -1.0e-07

Since the smallest eigenvalue is negative, we might think that the hessian is indefinite. However, the smallest eigenvalue is less than 1e-12 times the largest in absolute value, which really means that the smallest eigenvalue is essentially zero. Moreover, since the third eigenvalue is less than 1e-9 times the largest, the true rank of the hessian is at most 2. If we also compare the first and second eigenvalues, we that the first is over 2000 times the second. This says that with your model and your data, you really can only estimate one parameter; everything else looks like noise, to me.

The eigenvectors provide more detail:

> round(est.no.hess.eig\$vectors, 2)

[,1] [,2] [,3] [,4]

```[1,] -0.01 1.00  0.00 0.00
[2,]  0.00 0.00  1.00 0.01
[3,]  0.00 0.00 -0.01 1.00
[4,]  1.00 0.01  0.00 0.00
>
This says that your data can estimate the fourth parameter very well,
```
as the largest eigenvalue is associated almost exclusively with that parameter. You might also be able to estimate the second parameter somewhat, as it is associated almost exclusively with the second eigenvalue. However, you have to move the second and third parameters much more than the first and fourth to see much change in the residual sum of squares.

Seber and Wild (1988) Nonlinear Regression (Wiley; reprinted in 2003, esp. ch. 7 on "Growth Models")

Bates and Watts (1988) Nonlinear Regression Analysis and Its Applications (Wiley, esp. ch. 6 on "Graphical Summaries")

```	  hope this helps.
spencer graves

```

Rajarshi Guha wrote:

```> Hi,
>   I'm trying to fit some data using a logistic function defined as
>
> y ~ a * (1+m*exp(-x/tau)) / (1+n*exp(-x/tau)
>
> My data is below:
>
> x <- 1:100
>
> y <- c(0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
> 0,0,0,0,0,1,1,1,2,2,2,2,2,3,4,4,4,5,
> 5,5,5,6,6,6,6,6,8,8,9,9,10,13,14,16,19,21,
> 24,28,33,40,42,44,50,54,69,70,93,96,110,127,127,141,157,169,
> 178,187,206,216,227,236,238,244,246,250,255,255,257,260,261,262,266,268,
> 268,270,272,272,272,273,275,275,275,276)
>
> My first attempt was to use nls as below:
>
>     d <- data.frame(x=x, y=y)
>     model <- nls(y ~ a * (1+m*exp(-x/tau)) / (1+n*exp(-x/tau)), data=d,
>     start=list(a=277,m=100,n=101,tau=10),
>     algorithm='port', trace=TRUE,
>     control=nls.control(maxiter=5000, minFactor=1/2048))
>
> Running the above code I get the following error message:
>
> Convergence failure: function evaluation limit reached without
> convergence (9).
>
> To investigate this further I used nlminb() to get a set of starting
> parameters. Thus I did:
>
> func <- function( par ) {
>     a = par
>     m = par
>     n = par
>     tau = par
>     a * (1+m*exp(-x/tau)) / (1+n*exp(-x/tau))
> }
> est <- nlminb( c(277, 100,101, 10), objective=func,
> control=list(eval.max=400, iter.max=1000))
>
> I get absolute convergence and a set of parameter values. Plugging these
> into the nls call and trying again still gives me
>
>  Convergence failure: function evaluation limit reached without
> convergence (9)
>
> I have tried a number of different starting values for the nls() call
> but I usually end up getting the following error:
>
> Convergence failure: false convergence (8)
>
> After reading the PORT library docs, I see that this error can mean
>
> 1) gradient is calculated incorrectly
> 2) stopping tolerances are too tight
> 3) gradient is discontinous near some iterate
>
> However, since nls() usually reports the above error after 30 to 40
> iterations, the PORT docs suggest that it is not problem 1. I'm not sure
> about (3) - I have other data which are somewhat similar to the above
> data, but they lead to a straightforward fit.
>
>
>
> In the end I tried a different starting value and lowered the tolerances
> a little, and I got a valid fit
>
> My questions are:
>
> 1) Why would the parameters that lead nlminb() to converge to work in
> nls() (since I'm using the PORT algorithm in both cases)?
>
> 2) Is there a strategy to obtain starting values? In my case I know that
> a should be around 277, but for the others I'm not sure.
>
> 3) Is there a quick way to check whether the gradient is discontinous at
> a point, numerically (rather than calculating the analytical
> derivatives)? I did
>
> plot(diff(y))
>
> and it certainly looks messy, but I also have other y vectors which look
> equally jagged (though the jaggedness occurs at lower x)
>
> Any suggestions would be appreciated.
>
> Thanks,
>
> -------------------------------------------------------------------
> Rajarshi Guha <rxg218_at_psu.edu> <http://jijo.cjb.net>
> GPG Fingerprint: 0CCA 8EE2 2EEB 25E2 AB04 06F7 1BB9 E634 9B87 56EE
> -------------------------------------------------------------------
> After a number of decimal places, nobody gives a damn.
>
> ______________________________________________
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help

```
```--
Spencer Graves, PhD
Senior Development Engineer
PDF Solutions, Inc.
333 West San Carlos Street Suite 700
San Jose, CA 95110, USA

spencer.graves@pdf.com
www.pdf.com <http://www.pdf.com>
Tel:  408-938-4420
Fax: 408-280-7915

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help