# [R] MART(tm) vs. gbm

From: manuel.martin <manuel.martin_at_orleans.inra.fr>
Date: Sat 27 May 2006 - 00:31:04 EST

Hello,
I have been using two different implementations of stochastic gradient boosting (Friedman 2002): MART(tm) with R and the gbm package. The two are fairly comparable, except that MART with R systematically outperforms the gbm tool in terms of goodness of fit, often strongly (the margin depends on the dataset).
For instance, fitting the same data with the two calls below

```
# gbm package
gbm1 <- gbm(Y~X2+X3+X4+X5+X6,
            data=data,
            var.monotone=c(0,0,0,0,0),   # 0: no monotone restrictions
            # poisson, and coxph available
            n.trees=3000,                # number of trees
            shrinkage=0.005,             # shrinkage or learning rate,
                                         # 0.001 to 0.1 usually work
            interaction.depth=6,         # 1: additive model, 2: two-way interactions, etc.
            bag.fraction = 0.5,          # subsampling fraction, 0.5 is probably best
            train.fraction = 0.5,        # fraction of data for training,
                                         # first train.fraction*N used for training
            n.minobsinnode = 10,         # minimum total weight needed in each node
            cv.folds = 5,                # do 5-fold cross-validation
            keep.data=TRUE,              # keep a copy of the dataset with the object
            verbose=TRUE)                # print out progress

# MART with R
X <- as.matrix(cbind(data$X2, as.numeric(data$X3), as.numeric(data$X4),
                     as.numeric(data$X5), data$X6))
Y <- data$Y
mart(X, Y, c(1,2,2,2,1), niter=3000, tree.size=6, learn.rate=0.005,
     loss.cri=2)   # gaussian too
```

leads to very different goodness of fit (I can provide the dataset if needed).
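To make "goodness of fit" concrete, here is a minimal sketch of how the two sets of predictions could be scored on the same observations. It assumes goodness of fit means RMSE and R-squared; `fit_stats` is a hypothetical helper written for illustration (it is not part of gbm or MART), and the numbers are toy values, not the dataset discussed here.

```r
# Hypothetical helper (not from gbm or MART): summarise goodness of fit
# from observed and predicted values.
fit_stats <- function(observed, predicted) {
  stopifnot(length(observed) == length(predicted))
  resid <- observed - predicted
  rmse  <- sqrt(mean(resid^2))                                  # root mean squared error
  r2    <- 1 - sum(resid^2) / sum((observed - mean(observed))^2) # share of variance explained
  c(rmse = rmse, r2 = r2)
}

# Toy illustration with made-up numbers (NOT the data from this post).
obs    <- c(1, 2, 3, 4, 5)
pred_a <- c(1.1, 1.9, 3.2, 3.8, 5.0)   # stands in for one model's predictions
pred_b <- c(2.0, 2.0, 3.0, 4.0, 4.0)   # stands in for the other model's predictions

stats_a <- fit_stats(obs, pred_a)
stats_b <- fit_stats(obs, pred_b)
print(rbind(a = stats_a, b = stats_b))
```

In practice `obs` would be `data$Y` and the two prediction vectors would come from each fitted model, evaluated on the same rows so that the comparison is like for like.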

Has anyone already encountered this? Is there an explanation, or am I missing something obvious in the argument settings? Thank you in advance,

Manuel

R-help@stat.math.ethz.ch mailing list