Re: [R] User defined split function in Rpart

From: Terry Therneau <therneau_at_mayo.edu>
Date: Thu, 14 Feb 2008 08:33:03 -0600 (CST)


  The question is about the direction vector in rpart.   

  There are (at least) two preferred ways to lay out a tree, with respect to which observations are sent left and which right.

  1. Send the smaller y values to the left. In the final tree, there will be a graphical ordering with smaller y's to the left and larger ones to the right. One has a "left bad, right good" orientation when traversing the tree. I find that medical researchers often like this.
  2. Send observations with x < cutpoint to the left. Setting all elements of the direction vector to -1 will give this behavior.

    I happen to slightly prefer option 1, which of course means that it became the default behavior in rpart. (For a categorical y with many levels, however, rpart orders on the percent of observations in category 1, which may not be particularly useful.)          
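
  The two options can be sketched in base R as below. This is only an illustration, not rpart's internal code: the helper name split_direction and its x_left argument are made up for the example, and it assumes an anova-style split on one continuous predictor, with y and wt already sorted by x (which is how a user-written split function receives them). It uses the convention that -1 sends x < cutpoint to the left and +1 sends it to the right.

```r
# Hypothetical helper (not part of rpart): returns the goodness and direction
# vectors that a user-written split function might compute for one
# continuous predictor. Assumes y and wt are already ordered by x.
split_direction <- function(y, wt, x_left = FALSE) {
    n <- length(y)
    y <- y - sum(y * wt) / sum(wt)       # center y: child means now have opposite signs
    left_sum <- cumsum(y * wt)[-n]       # weighted sum of y left of each cutpoint
    left_wt  <- cumsum(wt)[-n]
    right_wt <- sum(wt) - left_wt
    lmean <- left_sum / left_wt          # mean of centered y for x < cutpoint
    rmean <- -left_sum / right_wt        # mean of centered y for x >= cutpoint
    goodness <- (left_wt * lmean^2 + right_wt * rmean^2) / sum(wt * y^2)

    direction <- if (x_left) {
        rep(-1, n - 1)                   # option 2: x < cutpoint always goes left
    } else {
        ifelse(lmean > 0, 1, -1)         # option 1: the side with smaller y goes left
    }
    list(goodness = goodness, direction = direction)
}

y  <- c(10, 9, 8, 2, 1)                  # y decreases as x increases
wt <- rep(1, 5)
split_direction(y, wt)$direction                  # 1 1 1 1
split_direction(y, wt, x_left = TRUE)$direction   # -1 -1 -1 -1
```

  In this toy data the small y values sit at large x, so option 1 returns +1 at every cutpoint (x < cutpoint goes right, small y ends up on the left), while option 2 returns -1 everywhere.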

            Terry Therneau



R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Received on Thu 14 Feb 2008 - 14:50:59 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Thu 14 Feb 2008 - 15:30:14 GMT.
