From: Trevor Wiens <twiens_at_interbaun.com>

Date: Thu 17 Mar 2005 - 11:59:01 EST

*}
*

pred.cc <- unlist(pred.c)

delta.cv.k <- sum(sorted$rvar != pred.cc)/n p.k <- countlist/n

delta.app <- mean(sorted$rvar != pred.all)/n

Date: Thu 17 Mar 2005 - 11:59:01 EST

http://www.bioconductor.org/workshops/NGFN03/modelsel-exercise.pdf

Now I've tried to adapt it to my purposes, but since I'm not deeply familiar with R programming, I don't know why it doesn't work. Now checking the r-help list faq it seems this is an appropriate question.

I've included my attempted function below. The error I get is:

logcv(basp.data, form, 'basp', 'recordyear') Error in order(na.last, decreasing, ...) :

Argument 1 is not a vector

My questions are, why doesn't this work, and how do I fix it.

I'm using the formula function to create the formula that I'm sending to my function. And the mdata is a data.frame. I'm assumed that if I passed the column names as strings (response variable - rvar, fold variable - fvar) this would work. Apparently however it doesn't.

Lastly, since I don't have access to MASS and there are apparently many examples of doing this kind of thing in MASS, could someone tell me if this function looks approximately correct?

Thanks

T

logcv <- function(mdata, formula, rvar, fvar) { require(Hmisc)

# sort by fold variable

sorted <- mdata[order(mdata$fvar), ]

# get fold values and count for each group
vardesc <- describe(sorted$fvar)$values

fvarlist <- as.integer(dimnames(vardesc)[[2]])
k <- length(fvarlist)

countlist <- vardesc[1,1]

for (i in 2:k)

{

countlist[i] <- vardesc[1,i]

*}
*

n <- length(sorted$fvar)

# fit to all the mdata

fit.all <- glm(formula, sorted, family=binomial)
pred.all <- ifelse( predict(fit.all, type="response") < 0.5, 0, 1)

#setup

pred.c <- list()

error.i <- vector(length=k)

for (i in 1:k)

{

fit.i <- glm(formula, subset(sorted, sorted$fvar != fvarlist[i]), family=binomial)

pred.i <- ifelse(predict(fit.i, newdata=subset(sorted, sorted$fvar == fvarlist[i]), type="response") < 0.5, 0, 1) pred.c[[i]] = pred.i pred.all.i <- ifelse(predict(fit.i, newdata=sorted, type="response") < 0.5, 0, 1)error.i[i] <- sum(sorted$rvar != pred.all.i)/n

pred.cc <- unlist(pred.c)

delta.cv.k <- sum(sorted$rvar != pred.cc)/n p.k <- countlist/n

delta.app <- mean(sorted$rvar != pred.all)/n

delta.acv.k <- delta.cv.k + delta.app - sum(p.k*error.i)

print(delta.acv.k)

*}
*

-- Trevor Wiens twiens@interbaun.com ______________________________________________ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.htmlReceived on Thu Mar 17 12:04:21 2005

*
This archive was generated by hypermail 2.1.8
: Fri 03 Mar 2006 - 03:30:50 EST
*