Re: [R] How to improve the "OPTIM" results

From: Spencer Graves <spencer.graves_at_pdf.com>
Date: Sat, 05 Apr 2008 15:17:08 -0700

Dear Katheryn:

      I'm confused by your claim that, "Even though I used the true parameter values as initial values, the results are not very good."

      When I ran it, 'optim' quit with $value = -35835, substantially less than fr2(theta0) = -0.3.

      Could you please review your question, because I believe this will answer it.

MORE GENERAL OPTIM ISSUES       It is known that 'optim' has problems. Perhaps the simplest thing to do is to call 'optim' with each of the 'methods' in sequence, using the 'optim' found by each 'method' as the starting value for the next. When I do this, I often skip 'SANN', because it typically takes so much more time than the other methods. However, if there might be multiple local minima, then SANN may be the best way to find a global minimum, though you may want to call 'optim' again with another method, starting from optimal solution returned by 'SANN'.

      Another relatively simple thing to do is to call optim with hessian = TRUE and then compute eigen(optim(..., hessian=TRUE)$hessian, symmetric=TRUE). If optim found an honest local minimum, all the eigenvalues will be positive. For your example, the eigenvalues were as follows:

19000, 11000,

     0.06, 0.008, 0.003, 
     -0.0002, -0.06,

-6000

      This says that locally, the "minimum" identified was really a saddle point, being a parabola opening up in two dimensions (with curvatures of 19000 and 11000 respectively), a parabola opening down in one dimension (with curvature -6000), and relatively flat by comparison in the other 5 dimensions. In cases like this, it should be moderately easy and fast to explore the dimension with the largest negative eigenvalue with the other dimensions fixed until local minima in each direction were found. Then 'optim' could be restarted from both those local minima, and the minimum of those two solutions could be returned.

      If all eigenvalues were positive, it might then be wise to restart with parscale = the square roots of the diagonal of the hessian; I haven't tried this, but I believe it should work. Using 'parscale' can fix convergence problems -- or create some where they did not exist initially.

      I'm considering creating a package 'optimMLE' that would automate some of this and package it with common 'methods' that would assume that sum(fn(...)) was either a log(likelihood) or the negative of a log(likelihood), etc. However, before I do, I need to make more progress on some of my other commitments, review RSiteSearch("optim", "fun") to make sure I'm not duplicating something that already exists, etc. If anyone is interested in collaborating on this, please contact me off-line.

      Hope this helps. 
      Spencer
     

kathie wrote:
> Dear R users,
>
> I used to "OPTIM" to minimize the obj. function below. Even though I used
> the true parameter values as initial values, the results are not very good.
>
> How could I improve my results? Any suggestion will be greatly appreciated.
>
> Regards,
>
> Kathryn Lord
>
> #------------------------------------------------------------------------------------------
>
> x = c(0.35938587, 0.34889725, 0.20577608, 0.15298888, -4.24741146,
> -1.32943326,
> 0.29082527, -2.38156942, -6.62584473, -0.22596237, 6.97893687,
> 0.22001081,
> -1.39729222, -5.17860124, 0.52456484, 0.46791660, 0.03065136,
> 0.32155177,
> -1.12530013, 0.02608057, -0.22323154, 3.30955460, 0.54653982,
> 0.29403011,
> 0.47769874, 2.42216260, 3.93518355, 0.23237890, 0.09647044,
> -0.48768933,
> 0.37736377, 0.43739341, -0.02416010, 4.02788119, 0.07320802,
> -0.29393054,
> 0.25184609, 0.76044448, -3.34121918, 1.16028677, -0.60352008,
> -2.86454069,
> -0.84411691, 0.24841071, -0.11764954, 5.92662106, 1.03932953,
> -6.21987657,
> -0.54763352, 0.20263192) # data
>
> theta0= c(log(2),0,c(0,-.3,.3),log(c(10,.05,.05))) # initial value(In fact,
> true parameter value)
> n = length(x)
>
> fr2 = function(theta)
> {
> a1 = theta[1]; a2 = theta[2]
> mu1 = theta[3]; mu2 = theta[4]; mu3 = theta[5]
> g1 = theta[6]; g2 = theta[7]; g3 = theta[8]
>
> w1=exp(a1)/(1+exp(a1)+exp(a2))
> w2=exp(a2)/(1+exp(a1)+exp(a2))
> w3=1-w1-w2
>
> obj =((w1^2)/(2*sqrt(exp(g1)*pi))
> + (w2^2)/(2*sqrt(exp(g2)*pi))
> + (w3^2)/(2*sqrt(exp(g2)*pi))
>
> + 2*w1*w2*dnorm((mu1-mu2)/sqrt(exp(g1)+exp(g2)))/sqrt(exp(g1)+exp(g2))
> +
> 2*w1*w3*dnorm((mu1-mu3)/sqrt(exp(g1)+exp(g3)))/sqrt(exp(g1)+exp(g3))
> +
> 2*w2*w3*dnorm((mu2-mu3)/sqrt(exp(g2)+exp(g3)))/sqrt(exp(g2)+exp(g3))
>
> - (2/n)*(sum(w1*dnorm((x-mu1)/sqrt(exp(g1)))/sqrt(exp(g1))
> + w2*dnorm((x-mu2)/sqrt(exp(g2)))/sqrt(exp(g2))
> + w3*dnorm((x-mu3)/sqrt(exp(g3)))/sqrt(exp(g3)) )))
>
> return(obj)
> }
>
> optim(theta0, fr2, method=BFGS", control = list (fnscale=10,
> parscale=c(.01,.01,.01,.01,.01,1,1,1), maxit=100000))
>
>



R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Received on Sat 05 Apr 2008 - 22:20:41 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Sun 06 Apr 2008 - 15:00:27 GMT.

Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-help. Please read the posting guide before posting to the list.

list of date sections of archive