Re: [R] boosting - second posting

From: Weiwei Shi <helprhelp_at_gmail.com>
Date: Wed 31 May 2006 - 01:27:09 EST

As I recall, if you use distribution = "bernoulli" you don't need as.factor() on your response variable either; gbm expects a numeric 0/1 response for that loss.
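
Something like this untested sketch is what I have in mind (the recoding line assumes simNuance has two class levels; the "yes" label here is made up):

# gbm's bernoulli loss wants a numeric 0/1 response, not a factor,
# so recode the class labels first ("yes" is a hypothetical level name)
train$simNuance <- as.numeric(train$simNuance == "yes")

library(gbm)
boost.model <- gbm(simNuance ~ .,
                   data = train,
                   distribution = "bernoulli",   # classification loss
                   n.trees = 3000,
                   shrinkage = 0.005,
                   interaction.depth = 3,
                   cv.folds = 5)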

Weiwei

On 5/30/06, Kuhn, Max <Max.Kuhn@pfizer.com> wrote:
>
> The distribution argument appears to be the problem: either "bernoulli" or
> "adaboost" is appropriate for a classification problem.
>
> Max
>
> > Perhaps by following the Posting Guide you're likely to get more helpful
> > responses. You have not shown an example that others can reproduce, nor
> > given version information for R or gbm. The output you showed does not
> > use type="response", either.
> >
> > Andy
> >
> > _____
> >
> > From: r-help-bounces at stat.math.ethz.ch on behalf of stephenc
> > Sent: Sat 5/27/2006 4:02 PM
> > To: 'R Help'
> > Subject: [R] boosting - second posting [Broadcast]
> >
> >
> >
> > Hi
> >
> > I am using boosting for a classification and prediction problem.
> >
> > For some reason it is giving me an outcome that doesn't fall between 0
> > and 1 for the predictions. I have tried type="response" but it made no
> > difference.
> >
> > Can anyone see what I am doing wrong?
> >
> > Screen output shown below:
> >
> >
> > > boost.model <- gbm(as.factor(train$simNuance) ~ .,  # formula
> > +     data=train,                # dataset
> > +                                # +1: monotone increase,
> > +                                # 0: no monotone restrictions
> > +     distribution="gaussian",   # bernoulli, adaboost, gaussian,
> > +                                # poisson, and coxph available
> > +     n.trees=3000,              # number of trees
> > +     shrinkage=0.005,           # shrinkage or learning rate,
> > +                                # 0.001 to 0.1 usually work
> > +     interaction.depth=3,       # 1: additive model, 2: two-way interactions, etc.
> > +     bag.fraction = 0.5,        # subsampling fraction, 0.5 is probably best
> > +     train.fraction = 0.5,      # fraction of data for training,
> > +                                # first train.fraction*N used for training
> > +     n.minobsinnode = 10,       # minimum total weight needed in each node
> > +     cv.folds = 5,              # do 5-fold cross-validation
> > +     keep.data=TRUE,            # keep a copy of the dataset with the object
> > +     verbose=FALSE)             # print out progress
> > >
> > > best.iter = gbm.perf(boost.model, method="cv")
> > > pred = predict.gbm(boost.model, test, best.iter)
> > > summary(pred)
> >    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
> >  0.4772  1.5140  1.6760  1.5100  1.7190  1.9420
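
For what it's worth, the 1.5 to 1.9 range above is presumably what you get when the factor response is coerced to its numeric codes (1 and 2) and fit with squared-error loss. Once the model is refit with distribution="bernoulli", predictions on the probability scale come from type="response" in predict.gbm; roughly (untested):

best.iter <- gbm.perf(boost.model, method = "cv")
pred <- predict(boost.model, newdata = test, n.trees = best.iter,
                type = "response")   # probabilities in [0, 1] for a bernoulli fit
summary(pred)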

-- 
Weiwei Shi, Ph.D

"Did you always know?"
"No, I did not. But I believed..."
---Matrix III


______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html