# [R] Understanding nonlinear optimization and Rosenbrock's banana valley function?

From: Spencer Graves <spencer.graves_at_pdf.com>
Date: Mon 05 Dec 2005 - 07:42:30 EST

ROSENBROCK'S BANANA VALLEY FUNCTION?           Beyond this, I wonder if someone help me understand the lessons one should take from Rosenbrock's banana valley function:

banana <- function(x){

100*(x[2]-x[1]^2)^2+(1-x[1])^2
}

This a quartic x[1] and a parabola in x[2] with a unique minimum at x[2]=x[1]=1. Over the range (-1, 2)x(-1,1), it looks like a long, curved, deep, narrow banana-shaped valley. It is a known hard problem in nonlinear regression, but these difficulties don't affect "nlm" or "nlminb" until the hessian is provided analytically (with R 2.2.0 under Windows XP):

nlm(banana, c(-1.2, 1)) # found the minimum in 23 iterations nlminb(c(-1.2, 1), banana)# found the min in 35 iterations

Dbanana <- function(x){

c(-400*x[1]*(x[2] - x[1]^2) - 2*(1-x[1]),      200*(x[2] - x[1]^2))
}
banana1 <- function(x){

b <- 100*(x[2]-x[1]^2)^2+(1-x[1])^2
b
}

nlm(banana1, c(-1.2, 1)) # solved the problem in 24 iterations nlminb(c(-1.2, 1), banana, Dbanana)# solution in 35 iterations

D2banana <- function(x){

```         a11 <- (2 - 400*(x[2] - x[1]^2) + 800*x[2]*x[1]^2)
a21 <- (-400*x[1])
matrix(c(a11,a21,a21,200),2,2)
```

}
banana2 <- function(x){
```   b <- 100*(x[2]-x[1]^2)^2+(1-x[1])^2
attr(b, "hessian") <- D2banana(x)
```

b
}

nlm(banana2, c(-1.2, 1))
# Found the valley but not the minimum
# in the default 100 iterations.

nlm(banana2, c(-1.2, 1), iterlim=10000)
# found the minimum to 3 significant digits in 5017 iterations.

nlminb(c(-1.2, 1), banana, Dbanana, D2banana)
# took 95 iterations to find the answer to double precision.

To understand this better, I wrote my own version of "nlm" (see below), and learned that the hessian is often indefinite, with one eigenvalue positive and the other negative. If I understand correctly, a negative eigenvalue of the hessian tends to push the next step towards increasing rather than decreasing the function. I tried a few things that accelerated the convergence slightly, but but my "nlm." still had not converged after 100 iterations.

What might be done to improve the performance of something like "nlm" without substantially increasing the overhead for other problems?

```	  Thanks.
spencer graves

#############################

```

nlm. <- function(f=fgh, p=c(-1.2, 1),

# R code version of "nlm"
# requiring analytic gradient and hessian
#
# Initial evaluation

f.i <- f(p)
f0 <- f.i+1
# Iterate

for(i in 1:iterlim){

```     if(sum(df^2)<(gradtol^2)){
return(list(minimum=f.i, estimate=p+dp,
iterations=i))
}

#
d2f <- attr(f.i, "hessian")
dp <- (-solve(d2f, df))

#   Step sufficiently small?

if(sum(dp^2)<(steptol^2)){
return(list(minimum=f.i, estimate=p+dp,
iterations=i))
}

#   Next iter

f0 <- f.i
f.i <- f(p+dp)

#   Step size control

if(f.i>=f0){
for(j in 1:iterlim){
{
if(j==1){
d2f.eig <- eigen(d2f, symmetric=T)
cat("\nstep size control; i=", i,
"; p=", round(p, 3), "; dp=", signif(dp, 2),
"; eig(hessian)=",signif(d2f.eig\$values, 4))
v.max <- (1+max(abs(d2f.eig\$values)))
evec.df <- (t(d2f.eig\$vectors) %*% df)
dp <- (-(d2f.eig\$vectors %*%
}
else{
cat(".")
dp <- dp/2
}
}
f.i <- f(p+dp)
f2 <- f(p+dp/2)
if(f2<f.i){
dp <- dp/2
f.i <- f2
}
if(f.i<f0)break # j
}
if(f.i>=f0){
cat("\n")
return(list(minimum=f0, estimate=p,
hessian=attr(f0, "hessian"), code=3,
iterations=i))
}
}
p <- p+dp
cat(i, p, f.i, "\n")
```

}
return(list(minimum=f.i, estimate=p,
```       gradient=df, hessian=d2f, code=4,
iterations=i))
```

}
```--
Spencer Graves, PhD
Senior Development Engineer
PDF Solutions, Inc.
333 West San Carlos Street Suite 700
San Jose, CA 95110, USA

spencer.graves@pdf.com
www.pdf.com <http://www.pdf.com>
Tel:  408-938-4420
Fax: 408-280-7915

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help