[R] RandomForest vs. bayes & svm classification performance

From: Eleni Rapsomaniki <e.rapsomaniki_at_mail.cryst.bbk.ac.uk>
Date: Tue 25 Jul 2006 - 03:59:31 EST


This is a question regarding classification performance using different methods. So far I've tried NaiveBayes (klaR package), svm (e1071 package) and randomForest (randomForest package). What has puzzled me is that randomForest seems to perform far better (32% classification error) than svm and NaiveBayes, which have similar classification errors (45% and 48% respectively). A similar difference in performance is observed with different combinations of parameters, priors and training set sizes.

Because I was expecting to see little difference in the performance of these methods, I am worried that I may have made a mistake in my randomForest call:

my.rf <- randomForest(x = train.df[, -response_index],
                      y = train.df[, response_index],
                      xtest = test.df[, -response_index],
                      ytest = test.df[, response_index],
                      importance = TRUE, proximity = FALSE,
                      keep.forest = FALSE)

(where train.df and test.df are my training and test data frames and response_index is the column number specifying the class)
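
One way to rule out a mistake in the call might be to recompute the test error by hand for each model from its predicted labels, and to compare it with the error of always guessing the most frequent class. The following is only a sketch in base R: error_rate and baseline_error are hypothetical helper names (not from any package), and the toy vectors stand in for test.df[, response_index] and for each model's predictions on test.df.

```r
# Hypothetical helpers (not part of any package).

# Fraction of test cases whose predicted label differs from the true label.
error_rate <- function(truth, predicted) {
  stopifnot(length(truth) == length(predicted))
  mean(as.character(truth) != as.character(predicted))
}

# Error obtained by always predicting the most frequent class.
baseline_error <- function(truth) {
  1 - max(table(truth)) / length(truth)
}

# Toy stand-ins for the real test labels and one model's predictions.
truth     <- factor(c("a", "a", "a", "b", "b", "c"))
predicted <- factor(c("a", "a", "b", "b", "c", "c"))

err  <- error_rate(truth, predicted)   # error of the model
base <- baseline_error(truth)          # error of the trivial classifier
```

If the same function gives about 32% for the forest and 45% to 48% for the other two on the same test rows, the difference is at least not an artefact of how each package reports its error.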

My main question is: could there be a legitimate reason why random forest would outperform the other two models (e.g. maybe one method is more reliable with Gaussian data, handles categorical data better, etc.)? Also, is there a way of evaluating the predictive ability of each parameter in the naive Bayes model, as can be done for random forests (through the importance table)?
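
To make the second question concrete, one model-agnostic possibility would be a permutation-style importance: scramble one column of the test data at a time and record how much the classification error increases. The sketch below is base R only and is an assumption about what such a measure could look like, not the method randomForest itself uses internally; perm_importance, predict_fun and the toy rule are hypothetical names introduced for illustration.

```r
# Hypothetical helper: permutation importance for any fitted classifier.
# predict_fun(model, newdata) must return the predicted class labels.
perm_importance <- function(model, predict_fun, x, y, n_rep = 10) {
  base_err <- mean(as.character(predict_fun(model, x)) != as.character(y))
  sapply(names(x), function(v) {
    errs <- replicate(n_rep, {
      x_perm <- x
      x_perm[[v]] <- sample(x_perm[[v]])  # break the link between v and y
      mean(as.character(predict_fun(model, x_perm)) != as.character(y))
    })
    mean(errs) - base_err  # mean increase in error when v is scrambled
  })
}

# Toy check with a stand-in "model": a rule that only looks at x1.
set.seed(1)
x <- data.frame(x1 = rnorm(200), x2 = rnorm(200))
y <- factor(ifelse(x$x1 > 0, "pos", "neg"))
rule <- function(model, newdata) ifelse(newdata$x1 > 0, "pos", "neg")

imp <- perm_importance(model = NULL, predict_fun = rule, x = x, y = y)
```

In the toy case the ignored column x2 gets an importance of zero and x1 a large positive value. For the real models, predict_fun would wrap each package's predict method (for klaR's NaiveBayes the labels should be in the class element of the returned list), with x and y taken from test.df.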

I would appreciate any of your comments and suggestions on these.

Many thanks
Eleni Rapsomaniki

R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Received on Tue Jul 25 04:16:22 2006
