[R] Fwd: Classification error rate increased by bagging - any ideas?

From: Anthony Staines <anthony.staines_at_gmail.com>
Date: Thu 20 Jul 2006 - 01:40:28 EST


I'm analysing some anthropometric data on fifty-odd skull bases. We know the sex of each skull, and we are trying to develop a predictor to identify the sex of unknown skulls.

rpart with cross-validation produces two models: one predicts males well and females poorly, and the other does the opposite (females well, males poorly). In both cases the error rate for the worse-predicted sex is close to 50%, and for the better-predicted sex about 15%.

Bagging tree models produces a model which classifies both males and females equally well (or equally poorly), but has an overall error rate (just over 30%) higher than either of the rpart models (about 25%).

My instinct is to go for the bagging results, as they seem more reasonable, but my colleagues really like the lower overall error rate of the rpart models. Any thoughts?
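
One way to compare the models on equal terms is to report the per-class error rates and their unweighted mean (the balanced error rate) next to the overall error rate. The sketch below is illustrative only: the counts are invented to resemble the pattern described above and are not the skull data, and `error_rates`, `single_tree` and `bagged` are hypothetical names.

```python
def error_rates(confusion):
    """Return (per-class error, overall error, balanced error).

    `confusion` maps each true label to a dict of predicted-label counts,
    e.g. confusion["M"]["F"] is the number of males classified as female.
    """
    per_class = {}
    total = 0
    wrong = 0
    for true_label, row in confusion.items():
        n = sum(row.values())
        errors = n - row.get(true_label, 0)
        per_class[true_label] = errors / n
        total += n
        wrong += errors
    overall = wrong / total
    # Balanced error: each class counts equally, whatever its size.
    balanced = sum(per_class.values()) / len(per_class)
    return per_class, overall, balanced


# Hypothetical counts for 54 skulls (36 male, 18 female); NOT the real data.
single_tree = {"M": {"M": 31, "F": 5}, "F": {"M": 9, "F": 9}}
bagged = {"M": {"M": 24, "F": 12}, "F": {"M": 5, "F": 13}}

for name, confusion in [("single tree", single_tree), ("bagged", bagged)]:
    per_class, overall, balanced = error_rates(confusion)
    rates = ", ".join(f"{k}: {v:.1%}" for k, v in per_class.items())
    print(f"{name}: per-class [{rates}], overall {overall:.1%}, balanced {balanced:.1%}")
```

With these made-up counts the single tree has the lower overall error rate, while the bagged model has the lower balanced error rate. Which figure matters depends on whether mistakes on males and females should count equally.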

Anthony Staines
--
Dr. Anthony Staines, Senior Lecturer in Epidemiology. School of Public Health and Population Sciences, UCD, Earlsfort Terrace, Dublin 2, Ireland.
Tel:- +353 1 716 7345. Fax:- +353 1 716 7407 Mobile:- +353 86 606 9713 Web:- http://phm.ucd.ie

R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Received on Thu Jul 20 01:57:53 2006

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.1.8, at Thu 20 Jul 2006 - 02:16:59 EST.
