RE: [R] rpart question


From: Liaw, Andy (andy_liaw@merck.com)
Date: Wed 05 May 2004 - 09:48:16 EST


Message-id: <3A822319EB35174CA3714066D590DCD504AF7D07@usrymx25.merck.com>

AFAIK rpart does not have a built-in facility for adjusting for bias in split
selection. One possibility is to define your own splitting criterion that
does the adjustment in some fashion. I believe the current version of rpart
allows you to define a custom splitting criterion, but I have not tried it
myself.
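
To make the bias concrete, here is a minimal simulation sketch. It is written
in plain Python rather than R so that it stands alone, and it is not rpart's
code: the function names (gini, best_improvement, win_rate) and the
parameters (cost_many, n, trials, seed) are made up for illustration. It
generates a pure-noise 0/1 response plus two pure-noise factors, one with
2 levels and one with 13, and counts how often the 13-level factor offers the
larger Gini improvement. The cost_many parameter divides that factor's
improvement by a cost before the comparison, which is how I understand the
'cost' argument of rpart to be applied.

```python
import random

def gini(p):
    """Gini impurity of a binary node whose class-1 proportion is p."""
    return 2.0 * p * (1.0 - p)

def best_improvement(x, y):
    """Largest Gini decrease over all binary splits of a categorical x.

    For a 0/1 response it is enough to sort the levels by their class-1
    rate and scan the k-1 cut points of that ordering.
    """
    n = len(y)
    counts = {}                          # level -> [n_obs, n_ones]
    for xi, yi in zip(x, y):
        c = counts.setdefault(xi, [0, 0])
        c[0] += 1
        c[1] += yi
    levels = sorted(counts.values(), key=lambda c: c[1] / c[0])
    total_ones = sum(y)
    parent = gini(total_ones / n)
    best = 0.0
    n_left = ones_left = 0
    for cnt, ones in levels[:-1]:        # keep at least one level on the right
        n_left += cnt
        ones_left += ones
        n_right = n - n_left
        child = (n_left * gini(ones_left / n_left)
                 + n_right * gini((total_ones - ones_left) / n_right)) / n
        best = max(best, parent - child)
    return best

def win_rate(cost_many=1.0, n=200, trials=500, seed=1):
    """Fraction of trials in which the 13-level noise factor is selected.

    cost_many is a hypothetical per-variable cost: the 13-level factor's
    improvement is divided by it before comparing with the 2-level factor.
    """
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        y = [rng.randint(0, 1) for _ in range(n)]
        x2 = [rng.randrange(2) for _ in range(n)]
        x13 = [rng.randrange(13) for _ in range(n)]
        if best_improvement(x13, y) / cost_many > best_improvement(x2, y):
            wins += 1
    return wins / trials

if __name__ == "__main__":
    print("unadjusted:     ", win_rate())
    print("cost = sqrt(13):", win_rate(cost_many=13 ** 0.5))
```

Neither factor carries any signal, so an unbiased selection rule would pick
each about half the time. Comparing the two printed rates gives a rough feel
for how much a given cost scaling changes the selection; it says nothing
about which scaling is appropriate for real data.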

Prof. Wei-yin Loh at UW-Madison (and his current and former students) has
worked on algorithms that compensate for bias in split selection. There is
software on his web page that you might want to check out.

HTH,
Andy

> From: lsjensen@micron.com
>
> Wondered about the best way to control for input variables that have a
> large number of levels in 'rpart' models. I understand the algorithm
> searches through all possible binary splits (2^(k-1) - 1 for k levels),
> so variables with more levels are more prone to be good splitters. I'm
> looking for ways to compensate and adjust for this complexity.
>
> For example, if two variables produce comparable splits in the data but
> one contains 2 levels and the other 13 levels, then I would like to have
> the algorithm choose the 'simpler' split.
>
> Is this best done with the 'cost' argument in the rpart options? This
> defaults to one for all variables... so would it make sense to scale
> this by nlevels in each variable or sqrt(nlevels) or
> something similar?
>
> Thanks,
> Landon
>
> ______________________________________________
> R-help@stat.math.ethz.ch mailing list
> https://www.stat.math.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide!
> http://www.R-project.org/posting-guide.html
>
>




This archive was generated by hypermail 2.1.3 : Mon 31 May 2004 - 23:05:07 EST