Re: [R] RandomForest question

From: Uwe Ligges <ligges_at_statistik.uni-dortmund.de>
Date: Fri 22 Jul 2005 - 00:31:36 EST

Arne.Muller@sanofi-aventis.com wrote:

> Hello,
>
> I'm trying to find the optimal number of variables tried at each
> split (the mtry parameter) for a randomForest classification. The
> classification is binary and there are 32 explanatory variables
> (mostly factors with up to 4 levels each, but also some numeric
> variables) and 575 cases.
>
> I've seen that, although there are only 32 explanatory variables, the
> best classification performance is reached when choosing mtry=80. How
> is it possible that more variables can be used than there are columns
> in the data frame?

If some of the variables are factors, dummy variables are generated for their levels, so the later steps of the process work with more variables than the data frame has columns.
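
To make the arithmetic concrete, here is a minimal sketch in Python of how dummy coding inflates the column count. It assumes factors are expanded into dummy variables as described above. The split into 28 four-level factors and 4 numeric variables is made up for illustration (the question only says "mostly factors"), and the helper name expanded_columns is hypothetical.

```python
# Hypothetical data layout: the question only says "mostly factors with up
# to 4 levels", so the split below (28 factors, 4 numeric) is an assumption.
def expanded_columns(level_counts, n_numeric, drop_first=False):
    """Count columns after dummy coding.

    level_counts: number of levels for each factor variable.
    n_numeric: number of numeric variables (one column each).
    drop_first: if True, use k-1 indicator columns per k-level factor
                (treatment contrasts); otherwise one column per level.
    """
    per_factor = [(k - 1 if drop_first else k) for k in level_counts]
    return sum(per_factor) + n_numeric

factor_levels = [4] * 28   # assumed: 28 factors, 4 levels each
numeric_vars = 4           # assumed: 4 numeric variables

original = len(factor_levels) + numeric_vars
full = expanded_columns(factor_levels, numeric_vars)
reduced = expanded_columns(factor_levels, numeric_vars, drop_first=True)

print(original, full, reduced)  # prints: 32 116 88
```

Under these assumed counts, either coding scheme yields more than 80 columns, so a value such as mtry=80 would no longer exceed the number of variables available after expansion.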

Uwe Ligges

> thanks for your help + kind regards,
>
> Arne
>
> ______________________________________________
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the
> posting guide! http://www.R-project.org/posting-guide.html



Received on Fri Jul 22 01:28:14 2005

This archive was generated by hypermail 2.1.8 : Fri 03 Mar 2006 - 03:33:55 EST