[R] model seleciton by leave-one-out cross-validation

From:  <klijunjie_at_gmail.com>
Date: Fri, 11 May 2007 22:49:45 +0800


Hi, all

When I am using mle.cv(wle), I find a interesting problem: I can't do leave-one-out cross-validation with mle.cv(wle). I will illustrate the problem as following:

> xx=matrix(rnorm(20*3),ncol=3)
> bb=c(1,2,0)
> yy=xx%*%bb+rnorm(20,0,0.001)+0
> summary(mle.cv(yy~xx,split=nrow(xx)-1,monte.carlo=2*nrow(xx),verbose=T),
num.max=1)[[1]]
mle.cv: dimension of the split subsample set to default value = 9  (Intercept) xx1 xx2 xx3 cv 0.000000e+00 1.000000e+00 1.000000e+00 0.000000e+00 1.292513e-06

So does anybody know how to do linear model selection by leave-one-out cross-validation? I've written one function, but it runs toooooo slow~~~

Thanks firstly

This is my super slow function:

####################
## function: rec.comb
## input: vec -- vector

## output: all possible combination from the elements of vec
####################

rec.comb=function(vec)
{
  if(length(vec)==0){list(NULL)
  }else { tmp=rec.comb(vec[-1])
    tmp2=sapply(tmp,function(x)c(vec[1],x))
    c(tmp,tmp2)
  }
}
####################
## function: CV1glm--CV1 using K fold CV
## input: y -- response vector; x -- predictors without intercept ## output: the vector whether each predictor should be selected. E.g.  (0,1,0,1) means no intecept while var1 and var3 should be selected
####################

CV1glm=function(y,x){
 n.var=ncol(x)
 n=nrow(x)
 comb=rec.comb(1: (n.var))
 n.comb=length(comb)
 pe=c()

 data=data.frame(y=y)
 glm=glm(y~1,data=data)
 pe[1]=cv.glm(data,glm)$delta[1]
 for(i in 2:n.comb){
  data=data.frame(y=y,x=x[,comb[[i]]])
  glm=glm(y~.,data=data)
  pe[i]=cv.glm(data,glm)$delta[1]
 }

 pe1=c() ####################################without intercept
 pe1[1]=Inf
 for(i in 2:n.comb){
  data=data.frame(y=y,x=x[,comb[[i]]])
  glm=glm(y~.-1,data=data)
  pe1[i]=cv.glm(data,glm)$delta[1]
 }

 var=rep(0,n.var)
 if(min(pe)<min(pe1)){
  int=1
  var[comb[[which(pe==min(pe))]]]=1
 }else{
  int=0
  var[comb[[which(pe1==min(pe1))]]]=1
 }
 c(int,var)
}

-- 
Junjie Li,                  klijunjie_at_gmail.com
Undergranduate in DEP of Tsinghua University,

	[[alternative HTML version deleted]]

______________________________________________
R-help_at_stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Received on Fri 11 May 2007 - 14:56:33 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Fri 11 May 2007 - 16:31:14 GMT.

Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-help. Please read the posting guide before posting to the list.