From: Ross Boylan <ross_at_biostat.ucsf.edu>

Date: Thu, 07 Feb 2008 20:49:18 -0500

?optim says, in describing the control parameter,

  'fnscale'  An overall scaling to be applied to the value of 'fn' and
             'gr' during optimization. If negative, turns the problem
             into a maximization problem. Optimization is performed on
             'fn(par)/fnscale'.

  'parscale' A vector of scaling values for the parameters.
             Optimization is performed on 'par/parscale' and these
             should be comparable in the sense that a unit change in
             any element produces about a unit change in the scaled
             value.
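
As a concrete reading of the 'fnscale' entry, here is a minimal sketch of maximizing with a negative fnscale. The objective g and its maximum at (1, 1) are made up for illustration; only optim() and its control list come from R itself.

```r
# Hypothetical objective with a maximum of 0 at (1, 1).
g <- function(x) -sum((x - 1)^2)

# A negative fnscale makes optim() minimize g(par)/fnscale, i.e. maximize g.
fit_max <- optim(c(0, 0), g, control = list(fnscale = -1))

fit_max$par    # location of the maximum found
fit_max$value  # g at that point, reported with g's own sign
```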

1. Does the final phrase, 'produces about a unit change in the scaled value', refer to the value of the objective function? Substantively I think it must, though grammatically it is less clear.

2. "Optimization is performed on 'par/parscale'" could mean any of:
   a) If par is 3 and parscale is 10, the objective function is evaluated at 0.3. This strikes me as the literal reading of the clause; it also strikes me as extremely unlikely to be what really happens.
   b) If par is 3 and parscale is 10, the objective function is evaluated at 3. The optimizer records this as if par were 30 and subsequently works in that space, e.g. when computing deltas or making steps. So a step of d becomes a step of d/parscale for the real objective function.
   c) About the same as b), except that a step of d becomes d*parscale.

3. Does scaling affect any of the final results (log-likelihood, standard errors, ...), assuming the scaled and unscaled methods find the same untransformed point?
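
One way to settle the second question empirically is to record every point at which optim() calls the objective. The sketch below is illustrative only: the objective f, the start c(3, 0) and the helper make_traced() are hypothetical, chosen so that par = 3 and parscale = 10 match the example in that question.

```r
# Hypothetical objective: minimum at (10, 2), flatter in x[1] than in x[2].
f <- function(x) (x[1] / 10 - 1)^2 + (x[2] - 2)^2

# Hypothetical helper: wrap fn so that each point it is called with is
# appended as a row of a matrix. rbind() copies the values, so the log does
# not depend on how optim() manages the vector it passes in.
make_traced <- function(fn) {
  pts <- NULL
  list(
    fn = function(x) {
      pts <<- rbind(pts, as.vector(x), deparse.level = 0)
      fn(x)
    },
    points = function() pts
  )
}

tr <- make_traced(f)
fit <- optim(c(3, 0), tr$fn, method = "BFGS",
             control = list(parscale = c(10, 1)))

tr$points()[1, ]  # first point at which f was called: 0.3, 3 or 30 in x[1]?
fit$par           # is the result reported on the original scale?
```

Comparing the first logged point with the start value shows directly whether reading a) or one of b)/c) applies.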

I assume that scaling is transparent in the sense of question 3, i.e. it does not affect any of the reported results (unless it changes how well the optimizer works, or fnscale converts minimizing to maximizing). Even given that, suppose I think that

  f(x) - f(x1) approximately equals f(x) - f(x2), where
  x1[1] = x[1] + 10 and
  x2[2] = x[2] + 1, and x, x1, and x2 are otherwise equal.

Does this mean I should have parscale = c(10, 1) or parscale = c(1/10, 1)?
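
To compare the two candidate settings, one can simply try both. This is a sketch under assumptions: the objective f below is a made-up function for which moving x[1] by 10 changes f about as much as moving x[2] by 1, and run() is a hypothetical helper. It prints the evaluation counts and results rather than assuming which scaling is better.

```r
# Hypothetical objective: minimum at (10, 2), flatter in x[1] than in x[2].
f <- function(x) (x[1] / 10 - 1)^2 + (x[2] - 2)^2

# Hypothetical helper: same start and method, only parscale varies.
run <- function(parscale) {
  optim(c(3, 0), f, method = "BFGS", control = list(parscale = parscale))
}

fits <- list(
  unscaled  = run(c(1, 1)),
  ten_one   = run(c(10, 1)),
  tenth_one = run(c(1 / 10, 1))
)

# One row per setting: parameters, objective value, function/gradient counts.
summary_tab <- t(sapply(fits, function(fit) {
  c(par = fit$par, value = fit$value, fit$counts)
}))
summary_tab
```

The setting that needs fewer function and gradient evaluations for the same answer is presumably the one that matches the intended meaning of parscale.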

Since I'm not sure about parscale, I'm really not sure about

  'ndeps'    A vector of step sizes for the finite-difference
             approximation to the gradient, on 'par/parscale' scale.
             Defaults to '1e-3'.

So, if I don't do any other rescaling, I might say ndeps = c(1e-2, 1e-3) in the previous example (the response to x[1] is 10 times flatter than to x[2]).

I guess that if I do have parscale set, I leave the default ndeps (1e-3 for both) and get the same effect. Right?
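
Whether parscale with the default ndeps gives the same finite-difference steps as a hand-set ndeps can also be checked by logging the evaluation points. A sketch, with the usual caveats: f, first_offsets() and the start point are hypothetical, and it assumes that, with no gradient supplied, the first few calls to the objective are the start point followed by the finite-difference probes around it.

```r
# Hypothetical objective: minimum at (10, 2), flatter in x[1] than in x[2].
f <- function(x) (x[1] / 10 - 1)^2 + (x[2] - 2)^2

# Hypothetical helper: run optim() with the given control list and return the
# offsets, on the original parameter scale, of the first n points at which
# f was called, relative to the very first call.
first_offsets <- function(control, start = c(3, 0), n = 5) {
  pts <- NULL
  traced <- function(x) {
    pts <<- rbind(pts, as.vector(x), deparse.level = 0)  # rbind() copies x
    f(x)
  }
  optim(start, traced, method = "BFGS", control = control)
  sweep(pts[seq_len(n), , drop = FALSE], 2, pts[1, ])
}

# parscale set, ndeps left at its default of 1e-3 for both parameters
off_parscale <- first_offsets(list(parscale = c(10, 1)))
off_parscale

# no parscale, ndeps chosen by hand
off_ndeps <- first_offsets(list(ndeps = c(1e-2, 1e-3)))
off_ndeps
```

If the two matrices of offsets agree, the two ways of specifying the step sizes are equivalent as far as the initial gradient is concerned.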

R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Received on Fri 08 Feb 2008 - 01:54:45 GMT
