From: Liaw, Andy <andy_liaw_at_merck.com>

Date: Fri 22 Jul 2005 - 02:59:33 EST

R-help@stat.math.ethz.ch mailing list

https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Received on Fri Jul 22 03:03:39 2005

See the tuneRF() function in the package for an implementation of
the strategy recommended by Breiman & Cutler.
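
For readers who want to try this, here is a minimal sketch of a tuneRF() call. The iris data, the seed, and the particular values of mtryStart, ntreeTry, stepFactor and improve are assumptions chosen for illustration, not recommendations from the thread. The sketch relies on the documented default (doBest = FALSE), where tuneRF() returns a matrix with the mtry values tried in the first column and the corresponding OOB error in the second.

```r
library(randomForest)

set.seed(42)                 # OOB error estimates vary from run to run
x <- iris[, 1:4]             # four predictors
y <- iris$Species            # factor response, so this is classification

# Grow a small forest (30 trees) at each candidate mtry, starting at 2
# and moving up and down by a factor of 2 while the OOB error improves.
res <- tuneRF(x, y, mtryStart = 2, ntreeTry = 30, stepFactor = 2,
              improve = 0.01, trace = FALSE, plot = FALSE)

# Column 1 holds the mtry values tried, column 2 the matching OOB error.
best_mtry <- res[which.min(res[, 2]), 1]
print(res)
print(best_mtry)
```

With only 20-30 trees per candidate the OOB estimates are noisy, so it may be worth rerunning with a different seed before settling on a value.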

BTW, "randomForest" is the name of the R package only. See Breiman's web page for the notice on trademarks.

Andy

> From: Weiwei Shi
>
> Hi,
>
> I found the following lines from Leo's randomForest, and I am not sure
> if it can be applied here but just tried to help:
>
> mtry0 = the number of variables to split on at each node. Default is
> the square root of mdim. ATTENTION! DO NOT USE THE DEFAULT VALUES OF
> MTRY0 IF YOU WANT TO OPTIMIZE THE PERFORMANCE OF RANDOM FORESTS. TRY
> DIFFERENT VALUES-GROW 20-30 TREES, AND SELECT THE VALUE OF MTRY THAT
> GIVES THE SMALLEST OOB ERROR RATE.
>
> mdim is the number of predictors.
>
> HTH,
>
> weiwei
>
> On 7/21/05, Liaw, Andy <andy_liaw@merck.com> wrote:
> > > From: Arne.Muller@sanofi-aventis.com
> > >
> > > Hello,
> > >
> > > I'm trying to find out the optimal number of splits (mtry
> > > parameter) for a randomForest classification. The
> > > classification is binary and there are 32 explanatory
> > > variables (mostly factors with each up to 4 levels but also
> > > some numeric variables) and 575 cases.
> > >
> > > I've seen that although there are only 32 explanatory
> > > variables the best classification performance is reached when
> > > choosing mtry=80. How is it possible that more variables can
> > > be used than there are columns in the data frame?
> >
> > It's not. The code for randomForest.default() has:
> >
> > ## Make sure mtry is in reasonable range.
> > mtry <- max(1, min(p, round(mtry)))
> >
> > so it silently sets mtry to the number of predictors if it's too large.
> > As an example:
> >
> > > library(randomForest)
> > randomForest 4.5-12
> > Type rfNews() to see new features/changes/bug fixes.
> > > iris.rf = randomForest(Species ~ ., iris, mtry=10)
> > > iris.rf$mtry
> > [1] 4
> >
> > I should probably add a warning in such cases...
> >
> > Andy
> >
> > > thanks for your help
> > > + kind regards,
> > >
> > > Arne
>
> --
> Weiwei Shi, Ph.D
>
> "Did you always know?"
> "No, I did not. But I believed..."
> ---Matrix III

This archive was generated by hypermail 2.1.8 : Fri 03 Mar 2006 - 03:33:55 EST