[R] MART(tm) vs. gbm

From: manuel.martin <manuel.martin_at_orleans.inra.fr>
Date: Sat 27 May 2006 - 00:31:04 EST

I have been using two different implementations of stochastic gradient boosting (Friedman 2002): MART(tm) with R and the gbm package. The two give fairly comparable results, except that MART with R systematically and strongly (to a degree depending on the dataset) outperforms the gbm tool in terms of goodness of fit.
For instance, the two calls below

# gbm package
gbm1 <- gbm(Y~X2+X3+X4+X5+X6,
            data=data,                   # data frame holding Y, X2..X6
            var.monotone=c(0,0,0,0,0),   # 0: no monotone restrictions
            distribution="gaussian",     # bernoulli, adaboost, gaussian,
                                         # poisson, and coxph available
            n.trees=3000,                # number of trees
            shrinkage=0.005,             # shrinkage or learning rate,
                                         # 0.001 to 0.1 usually work
            interaction.depth=6,         # 1: additive model, 2: two-way interactions, etc.
            bag.fraction=0.5,            # subsampling fraction, 0.5 is probably best
            train.fraction=0.5,          # fraction of data for training,
                                         # first train.fraction*N used for training
            n.minobsinnode=10,           # minimum total weight needed in each node
            cv.folds=5,                  # do 5-fold cross-validation
            keep.data=TRUE,              # keep a copy of the dataset with the object
            verbose=TRUE)                # print out progress

# MART with R
X <- as.matrix(cbind(data$X2, as.numeric(data$X3),
                     as.numeric(data$X4), as.numeric(data$X5), data$X6))
Y <- data$Y
mart(X, Y, c(1,2,2,2,1),     # variable-type flags
     niter=3000,             # number of trees
     tree.size=6,
     learn.rate=0.005,
     loss.cri=2)             # gaussian loss too


lead to very different goodness-of-fit values (I can provide the dataset if needed).
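One thing I am unsure about (an assumption on my part, not something I have verified): with train.fraction=0.5, gbm fits on only the first half of the rows, while the mart call above sees all of them, so it may be fairer to compare the two fits on the same rows. A sketch of what I mean, assuming the fitted objects above and a data frame `data` (the MART prediction function name may differ in your installation):

```r
# Sketch (untested): evaluate both models on the rows gbm actually trained on,
# i.e. the first train.fraction*N observations.
n.train <- floor(0.5 * nrow(data))
idx     <- seq_len(n.train)

# gbm predictions using the full 3000 trees
p.gbm  <- predict(gbm1, data[idx, ], n.trees = 3000)

# MART predictions (function name assumed from Friedman's MART-with-R interface)
p.mart <- martpred(X[idx, , drop = FALSE])

# mean squared error on the common training rows
mse <- function(y, p) mean((y - p)^2)
c(gbm = mse(Y[idx], p.gbm), mart = mse(Y[idx], p.mart))
```

If the gap shrinks on a common set of rows, the difference may just be the training-fraction settings rather than the algorithms themselves.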

Has anyone already encountered this? Is there an explanation, or am I missing something obvious in the argument settings? Thank you in advance,


R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Received on Sat May 27 00:44:17 2006

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.1.8, at Sat 27 May 2006 - 02:10:22 EST.
