[R] comparing random forests and classification trees

From: Amy Koch <ajkoch_at_postoffice.utas.edu.au>
Date: Mon 29 Jan 2007 - 00:34:51 GMT


I have used 'rpart' to build a classification tree, and I want to keep the result in tree form because it is easy to interpret. However, I would also like to compare the 'accuracy' of the tree with that of a random forest, to estimate how much predictive ability is lost by using a single simple tree. My understanding is that the error displayed by default by the two functions is calculated differently, so comparing those figures directly would be incorrect. Instead, I have produced a table of observed against predicted response for each analysis.

E.g. table(data$dependent,predict(model,type="class"))
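For illustration, a table of this kind can be reduced to a single misclassification rate. The sketch below uses only base R; the helper name misclass_rate and the toy vectors are made up for the example and are not part of rpart or randomForest.

```r
# Hypothetical helper (not from rpart or randomForest): proportion of cases
# whose predicted class differs from the observed class.
misclass_rate <- function(observed, predicted) {
  stopifnot(length(observed) == length(predicted))
  # Compare as character so differing factor level sets do not raise an error
  mean(as.character(observed) != as.character(predicted))
}

# Made-up toy vectors, for illustration only
observed  <- factor(c("a", "a", "b", "b", "b"))
predicted <- factor(c("a", "b", "b", "b", "a"))

table(observed, predicted)          # confusion table of observed vs predicted
misclass_rate(observed, predicted)  # share of cases off the diagonal
```

Comparing element by element, rather than using the diagonal of the table, keeps the calculation valid even if one class is never predicted and the table is not square.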

I am looking for confirmation that (a) it is incorrect to compare the default error estimates reported by the two techniques, and (b) comparing the misclassification rates from these tables is an appropriate way to compare them.
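One point that bears on (b): as documented, predict() called without newdata returns fitted values on the training data for an rpart object, but out-of-bag predictions for a randomForest object, so the two tables would not be like-for-like. A minimal sketch of one way to put both models on the same footing is to score them on the same held-out rows. The iris data, the 70/30 split and the seed below are assumptions chosen only to make the example self-contained; they are not from the original analysis.

```r
library(rpart)
library(randomForest)

set.seed(1)  # arbitrary seed, only for reproducibility of the sketch

# Assumed example data and split: hold out 30% of the rows for testing
n        <- nrow(iris)
test_idx <- sample(n, size = round(0.3 * n))
train    <- iris[-test_idx, ]
test     <- iris[test_idx, ]

# Fit both models on the same training rows
tree_fit <- rpart(Species ~ ., data = train, method = "class")
rf_fit   <- randomForest(Species ~ ., data = train)

# Predict classes for the same held-out rows
tree_pred <- predict(tree_fit, newdata = test, type = "class")
rf_pred   <- predict(rf_fit, newdata = test)

# Confusion tables and misclassification rates on identical test data
table(observed = test$Species, predicted = tree_pred)
table(observed = test$Species, predicted = rf_pred)

tree_err <- mean(as.character(tree_pred) != as.character(test$Species))
rf_err   <- mean(as.character(rf_pred) != as.character(test$Species))
c(tree = tree_err, forest = rf_err)
```

A single split gives a noisy estimate on a small data set; repeating the split or using cross-validation for both models would be a natural refinement of the same idea.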



Amelia Koch
University of Tasmania
School of Geography and Environmental Studies
Private Bag 78 Hobart
Tasmania, Australia 7001
Ph: +61 3 6226 7454



R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Received on Mon Jan 29 11:39:20 2007

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.1.8, at Tue 30 Jan 2007 - 20:30:30 GMT.
