Re: [R] comparing random forests and classification trees

From: Darin A. England <england_at_cs.umn.edu>
Date: Tue 30 Jan 2007 - 19:32:07 GMT

Amy,

I have also had this issue with randomForest: you lose the ability to explain the classifier in a simple way to non-specialists, whereas everyone can understand a single decision tree. As for comparing the accuracy of the two, I think you are correct to compare them using the actual vs. predicted tables. randomForest reports this table as the confusion matrix, and it also reports the out-of-bag (OOB) error, which I think is the error you are referring to. I would not compare the rf out-of-bag error with the rpart relative error (or with the cross-validated error, if you are doing cross-validation), because rpart scales those errors by the root node error, so the numbers are not on the same scale.
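One caveat on the tables: called without newdata, predict() on an rpart fit returns predictions for the training rows, while predict() on a randomForest fit returns the out-of-bag predictions, so the two tables are not quite like for like. A rough sketch of one way around that is to score both models on the same held-out rows. The built-in iris data, the 100/50 split and the helper err() are stand-ins chosen here for illustration, not anything from your analysis.

```r
library(rpart)
library(randomForest)

set.seed(1)                                  # reproducible split and forest
idx   <- sample(nrow(iris), 100)             # iris is only a stand-in data set
train <- iris[idx, ]
test  <- iris[-idx, ]

tree <- rpart(Species ~ ., data = train, method = "class")
rf   <- randomForest(Species ~ ., data = train)

# Actual vs. predicted on the same held-out rows for both models
tab_tree <- table(actual = test$Species,
                  predicted = predict(tree, newdata = test, type = "class"))
tab_rf   <- table(actual = test$Species,
                  predicted = predict(rf, newdata = test))

# Misclassification rate from a square confusion table (hypothetical helper)
err <- function(tab) 1 - sum(diag(tab)) / sum(tab)
errs <- c(rpart = err(tab_tree), randomForest = err(tab_rf))
errs

# For reference: the OOB error randomForest prints, taken after the last tree
oob <- rf$err.rate[rf$ntree, "OOB"]
oob
```

With a small data set a single split is noisy, so repeating this over several splits (or cross-validating both models) would give a steadier comparison.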

So, for what it's worth, I think you are correct. Also, do you know about ctree in the "party" package? If you want to retain the explanatory power of a single tree and still have an accurate classifier, I have found ctree to work quite well.
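In case it helps, here is a minimal sketch of fitting a ctree and scoring it on held-out rows. As before, the iris data and the 100/50 split are placeholders, so substitute your own data frame and response.

```r
library(party)

set.seed(1)                                  # reproducible split
idx   <- sample(nrow(iris), 100)             # iris is only a stand-in data set
train <- iris[idx, ]
test  <- iris[-idx, ]

# Conditional inference tree; still a single tree that can be printed or plotted
ct <- ctree(Species ~ ., data = train)
print(ct)                                    # plot(ct) draws the tree

# Actual vs. predicted on the held-out rows
tab_ct <- table(actual = test$Species,
                predicted = predict(ct, newdata = test))
err_ct <- 1 - sum(diag(tab_ct)) / sum(tab_ct)
err_ct
```

The resulting rate can be set beside the rpart and randomForest rates, provided all three are computed on the same held-out rows.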

HTH, Darin

On Mon, Jan 29, 2007 at 11:34:51AM +1100, Amy Koch wrote:
> Hi,
>
> I have done an analysis using 'rpart' to construct a Classification Tree. I
> want to retain the output in tree form so that it is easily
> interpretable. However, I also want to compare the 'accuracy' of the tree
> to a Random Forest to estimate how much predictive ability is lost by using
> one simple tree. My understanding is that the error automatically displayed
> by the two functions is calculated differently, so it is incorrect
> to use it as a comparison. Instead I have produced a table for both
> analyses comparing the observed and predicted response.
>
> E.g. table(data$dependent,predict(model,type="class"))
>
> I am looking for confirmation that (a) it is incorrect to compare the error
> estimates for the two techniques and (b) that comparing the
> misclassification rates is an appropriate method for comparing the two
> techniques.
>
> Thanks
>
> Amy
>
> Amelia Koch
> University of Tasmania
> School of Geography and Environmental Studies
> Private Bag 78 Hobart
> Tasmania, Australia 7001
> Ph: +61 3 6226 7454
> ajkoch@utas.edu.au
>
> ______________________________________________
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



Received on Wed Jan 31 06:47:37 2007

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.1.8, at Wed 31 Jan 2007 - 15:30:25 GMT.
