[R] More appropriate optimization routine?

From: Dimitri Liakhovitski <dimitri.liakhovitski_at_gmail.com>
Date: Mon, 07 Mar 2011 17:10:55 -0500


Hello!

I have 2 variables - predictor "pred" and response variable "DV":

pred<-c(439635.053, 222925.718, 668434.755, 194242.330, 5786.321, 115537.344,  100835.368, 7133.206, 159058.286, 4079991.629, 3380078.060, 2661279.136, 2698324.478, 1245213.965, 1901815.503, 1517019.451, 1396857.736, 1034030.988, 432249.574, 342329.325, 1831335.792, 2209578.859, 1641709.754, 1329308.669, 1251794.367, 731368.430, 1705626.983, 673535.171, 242519.280, 57251.998,

5728.821, 2054514.244, 301954.819, 773955.355, 735497.506, 347355.976,
1678175.153, 133082.395, 591326.289, 30866.182, 27235.846, 118372.342,
71590.969, 84813.299, 366146.153, 1391725.205, 763199.746, 1216661.202,
263878.157, 930832.769, 261270.130, 589303.561, 455137.946, 954655.201, 873434.054)
(pred)

DV<-c(0.55351297,0.27616943,0.58134926,0.33887159,0.03092546,0.14928061,
0.11836759,0.01719463,0.03258188,1.81205587,2.86657699,2.49491195,
3.09727230,1.95648776,2.28106268,1.78978179,1.74003678,1.22520393,
0.54245878,0.41483039,1.08731493,2.19581289,1.60516129,1.30723431,
1.41822649,1.31530539,2.02406576,1.22211412,0.52055790,0.12975522,
0.01416903,0.61043485,0.44141748,0.64327070,0.53607039,0.32603820,
1.77261016,0.42035756,0.37853917,0.12342486,0.06607710,0.02383682,
0.08421590,0.09255332,0.23644909,1.67921092,1.26864432,1.38654574,
1.29833020,1.76873555,0.93363677,1.01857658,0.81359775,2.14758239,2.41583852)

(DV)

Both "pred" and "DV" above are time series (observed across 55 months). The relationship between them is pre-specified. In this relationship, the (predicted) "DV" at time t is a specific function of itself at time t-1, of "pred" at time t, and of 2 scalars - a and b. I have to find optimal a and b that would ensure the best fit between the observed DV and the predicted DV. Below is the function I have to optimize:

my.function <- function(param){
  a<-param[1]
  b<-param[2]
  DV_pred <- rep(0,length(pred))
  for(i in 2:length(pred)){

         DV_pred[i] <- 1 - ( (1 - DV_pred[i-1] * a) / (exp(pred[i] * b)) )   }
  DV_pred[1]<-DV[1]
  correl <- cor(DV,DV_pred)
  return(correl)
}

a has to be between 0.001 and 0.75
b has to be positive.
I apologize if it's a simple question for statisticians but I am not a mathematician/statistician by training. I didn't think optim would work here. The only thing I could think of is genetic optimization, for example in rgenoud below. However, I don't think I could use it for 2 reasons: (1) Solutions do not seem stable and depend on starting values, set.seed, and domains chosen; (2) It takes too long - I have to do the task I described above about 1,200 times and it would take me forever.
Could someone maybe provide a pointer: what would be a faster/more efficient optimization approach for such a function as mine? Thanks a lot!

library(rgenoud)
genoud.opt<-genoud(my.function, nvars=2, max=TRUE, pop.size=10000, hard.generation.limit=TRUE,

	max.generations=20, wait.generations=5, starting.values=c(0.5,1),
	Domains=matrix(c(0.001,0.75,0.00000001,2),ncol=2,byrow=T),
boundary.enforcement=2)
-- 
Dimitri Liakhovitski
Ninah Consulting
www.ninah.com

______________________________________________
R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Received on Mon 07 Mar 2011 - 22:36:11 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Tue 08 Mar 2011 - 03:30:19 GMT.

Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-help. Please read the posting guide before posting to the list.

list of date sections of archive