From: Sundar Dorai-Raj <sundar.dorai-raj_at_pdf.com>

Date: Thu 04 Aug 2005 - 20:24:41 EST

beta <- rnorm(p)

y <- drop(x %*% beta + rnorm(n))

start <- rep(0, p)

R-help@stat.math.ethz.ch mailing list

https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html Received on Thu Aug 04 20:41:34 2005

Date: Thu 04 Aug 2005 - 20:24:41 EST

nwew wrote:

> Dear R-helpers,

*>
**> The function optim implements algorithms that I would like to use.
**>
**>
**> I have function implemented in R, which given the parameters of which
**> minimization is to take place returns a scalar as well as the gradient.
**>
**> Unfortunately optim requires two function _fn_ and _gr_ where fn returns the
**> function value and gr the gradient. Splitting my function in two functions
**> would be easy, however I am wondering if evaluating both is not doubling the
**> the very high computational costs. Most of the computational intensive
**> operations are identical if computing the function value and gradient.
**>
**> Question: is there a way to tweek optim that only one function evaluation is
**> necessary? Are ther other implementations of the algorithm, which do assume
**> that the function to be minimized returns the function value and the gradient
**> as well?
**>
**> Thanks
**> Eryk.
**>
*

Hi, Eryk,

?optim does not *require* the "gr" argument. If one is not supplied, numerical gradients are used for which optim evaluates "fn" twice. However, in many cases analytical gradients can both improve numerical accuracy and computational speed.

Trivial example:

fn <- function(beta) {

sum((y - x %*% beta)^2)

}

gr <- function(beta) {

colSums(-2 * x * drop((y - x %*% beta))) }

set.seed(1)

n <- 10000 p <- 5 g <- factor(rep(LETTERS[1:p], each = n/p)) x <- model.matrix(~g)

beta <- rnorm(p)

y <- drop(x %*% beta + rnorm(n))

start <- rep(0, p)

system.time(f1 <- optim(start, fn, gr, hessian = TRUE)) # [1] 0.47 0.00 0.48 NA NA

system.time(f2 <- optim(start, fn, hessian = TRUE)) # [1] 0.54 0.00 0.53 NA NA

f1$par

#[1] -0.6408643 0.2128109 -0.8378791 1.5983054 0.3366216

f2$par

#[1] -0.6408643 0.2128109 -0.8378791 1.5983054 0.3366216

f1$hessian

# [,1] [,2] [,3] [,4] [,5] #[1,] 20000 4000 4000 4000 4000 #[2,] 4000 4000 0 0 0 #[3,] 4000 0 4000 0 0 #[4,] 4000 0 0 4000 0 #[5,] 4000 0 0 0 4000 f2$hessian # [,1] [,2] [,3] [,4] [,5] #[1,] 20000 4.000000e+03 4000 4.000000e+03 4000 #[2,] 4000 4.000000e+03 0 2.273737e-07 0 #[3,] 4000 0.000000e+00 4000 0.000000e+00 0 #[4,] 4000 2.273737e-07 0 4.000000e+03 0#[5,] 4000 0.000000e+00 0 0.000000e+00 4000 #

Does this answer your question? Furthermore, have you read the chapter in MASS that discusses optim?

**HTH,
**
--sundar

R-help@stat.math.ethz.ch mailing list

https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html Received on Thu Aug 04 20:41:34 2005

*
This archive was generated by hypermail 2.1.8
: Sun 23 Oct 2005 - 15:03:55 EST
*