[R] Variance explained in regression trees?

From: Alexander J. Pries <apries_at_ufl.edu>
Date: Thu 13 Oct 2005 - 01:01:45 EST

I apologize for what may be novice questions, but I am new to R and need a bit of assistance. I am using R to build regression trees that explain how various environmental predictors influence coastal dune loss resulting from hurricane activity.

My first question: how do I interpret the complexity plots that the rpart package produces? What do the x and y axes represent (e.g., "X-val Relative Error" and "cp")? My understanding is that "cp" is similar to a complexity penalty for having a tree with many branches when a simpler one would be just as robust. How can I use the values and error bars to decide what the "optimal" tree size is?
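For concreteness, the plot and table in question can be produced roughly as in the sketch below. It assumes the built-in mtcars data as a stand-in for the dune-loss data (which is not shown here), and it applies the commonly used one-standard-error rule, which is one convention for choosing a tree size rather than the only one. In the plotcp() output, the vertical axis is the cross-validated error relative to the root node, and the horizontal axis is cp, with the corresponding tree size along the top.

```r
library(rpart)

set.seed(1)  # cross-validation folds are random, so fix the seed

# ASSUMPTION: mtcars is only a placeholder for the real data set
fit <- rpart(mpg ~ ., data = mtcars, method = "anova")

printcp(fit)  # table with columns CP, nsplit, rel error, xerror, xstd
plotcp(fit)   # xerror against cp, with a dotted line one SE above the minimum

cp_tab <- fit$cptable

# Row with the smallest cross-validated error
best <- which.min(cp_tab[, "xerror"])

# One-SE threshold: minimum xerror plus its standard error
threshold <- cp_tab[best, "xerror"] + cp_tab[best, "xstd"]

# Rows are ordered from the smallest tree to the largest, so the first
# row at or below the threshold is the simplest acceptable tree
pick <- min(which(cp_tab[, "xerror"] <= threshold))

pruned <- prune(fit, cp = cp_tab[pick, "CP"])
```

The pruned tree can then be inspected with print(pruned) or plotted in the usual way.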

My second question: other statistical packages that build regression trees (I'm thinking specifically of DTREG) can produce a model summary that reports the initial variance, the amount of variance explained by the tree, and the unexplained variance. From this information, an estimated R-squared is calculated that gives some indication of how well the tree "fits."

Can R produce information like this? If anyone has specifics on how I might evaluate the fit of my regression trees, I would be glad to hear them.
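As a rough illustration of the kind of summary meant here, the sketch below computes the total, unexplained, and explained sums of squares for an rpart tree and the resulting apparent R-squared. It again assumes mtcars as a placeholder data set. The "rel error" column of the cp table is scaled by the root-node error, so one minus its last entry should agree with the hand-computed value. This is a resubstitution (training-data) figure and will tend to be optimistic.

```r
library(rpart)

# ASSUMPTION: mtcars is only a placeholder for the real data set
fit <- rpart(mpg ~ ., data = mtcars, method = "anova")

y <- mtcars$mpg

sst <- sum((y - mean(y))^2)       # initial (total) sum of squares
sse <- sum((y - predict(fit))^2)  # unexplained sum of squares for the full tree
ssr <- sst - sse                  # sum of squares explained by the tree

r2_manual <- 1 - sse / sst

# Same quantity read off the cp table: 1 - relative error of the last row
r2_table <- 1 - fit$cptable[nrow(fit$cptable), "rel error"]

cat("Initial SS:    ", sst, "\n")
cat("Explained SS:  ", ssr, "\n")
cat("Unexplained SS:", sse, "\n")
cat("Apparent R^2:  ", r2_manual, "\n")

# rsq.rpart() prints the cp table and plots the apparent and
# cross-validated R-squared against the number of splits
rsq.rpart(fit)
```

Replacing "rel error" with "xerror" in the table lookup gives a cross-validated analogue, which is usually the more honest measure of fit.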

Thank you in advance for any helpful guidance!

Alex Pries

Alexander Pries
Graduate Student
Wildlife Ecology and Conservation
University of Florida
P.O. Box 110430
Gainesville, FL 32605
(352) 246-9621

Received on Thu Oct 13 01:10:55 2005