[R] How to calculate the generalization error of random forests?

From: Martin Lam <tmlammail_at_yahoo.com>
Date: Fri 10 Feb 2006 - 03:18:50 EST


Hi,

Perhaps this is not the proper place to ask this question, but I am out of options, so I apologize in advance.

I want to know how the generalization error of a random forest (or rather its upper bound?) is determined from the out-of-bag estimates. I read in Breiman's paper that s and p determine the generalization error through p(1-s^2)/s^2, where p stands for the correlation between the trees.
Does s stand for the strength of the individual trees or of the entire ensemble?
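
For reference, the expression above can be evaluated directly once s and p are known. The sketch below is only an illustration: the function name forest_error_bound and the input values are made up for this example and do not come from Breiman's paper. In the paper the expression is an upper bound on the generalization error (error <= p(1-s^2)/s^2), it only makes sense for s > 0, and s is the strength of the whole set of trees, that is, the expected margin of the forest's vote.

```python
def forest_error_bound(s, rho):
    """Evaluate rho * (1 - s^2) / s^2 for strength s and mean correlation rho.

    Hypothetical helper for illustration; the bound is only defined for s > 0.
    """
    if s <= 0:
        raise ValueError("the bound requires a positive strength s")
    return rho * (1.0 - s ** 2) / s ** 2


# Made-up inputs: strength 0.5 and mean correlation 0.3 give a bound of about 0.9.
print(forest_error_bound(0.5, 0.3))
```

The bound is loose and can exceed 1 when s is small (for example s = 0.5 and p = 0.5 give 1.5), so it is more useful for comparing forests than as an error estimate.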

If I have, let's say, built 3 trees in my forest and I know for each tree the instances that were left out during training, how do I calculate s and p, so I can calculate the error?
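
One possible way to do this, following the out-of-bag procedure as I understand it from the appendix of Breiman's paper, is sketched below. It is a sketch under assumptions, not a definitive implementation: the function name, the data layout (one list of predictions per tree, with None where the instance was used to train that tree) and the toy data are all hypothetical. The steps are: (1) for each instance, compute the proportion of out-of-bag votes per class; (2) take the margin as the vote share of the true class minus the largest vote share of any other class; (3) s is the mean margin; (4) for each tree, p1 is the share of its out-of-bag instances it classifies correctly and p2 the share it assigns to the strongest rival class, giving sd = sqrt(p1 + p2 - (p1 - p2)^2); (5) p is the variance of the margin divided by the squared mean of the per-tree sd values.

```python
from math import sqrt


def oob_strength_correlation(y, oob_preds, classes):
    """Estimate strength s and mean correlation rho from out-of-bag votes.

    y         -- true class of each training instance
    oob_preds -- oob_preds[t][i] is tree t's prediction for instance i,
                 or None if instance i was in tree t's bootstrap sample
    classes   -- all class labels (at least two)
    """
    n = len(y)
    margins = [None] * n  # OOB margin per instance
    rivals = [None] * n   # most-voted wrong class per instance
    for i in range(n):
        votes = [row[i] for row in oob_preds if row[i] is not None]
        if not votes:
            continue  # instance was never out-of-bag
        q = {c: votes.count(c) / len(votes) for c in classes}
        rivals[i] = max((c for c in classes if c != y[i]), key=lambda c: q[c])
        margins[i] = q[y[i]] - q[rivals[i]]

    used = [m for m in margins if m is not None]
    s = sum(used) / len(used)
    var_margin = sum(m * m for m in used) / len(used) - s * s

    sds = []
    for row in oob_preds:
        idx = [i for i in range(n) if row[i] is not None]
        if not idx:
            continue
        p1 = sum(row[i] == y[i] for i in idx) / len(idx)
        p2 = sum(row[i] == rivals[i] for i in idx) / len(idx)
        sds.append(sqrt(max(0.0, p1 + p2 - (p1 - p2) ** 2)))
    mean_sd = sum(sds) / len(sds)
    if mean_sd == 0:
        raise ValueError("correlation is undefined when every tree has sd 0")
    return s, var_margin / mean_sd ** 2


# Toy data: 3 trees, 4 instances, two classes (all values made up).
y = [0, 0, 1, 1]
oob_preds = [
    [0, None, 1, 0],
    [0, 0, None, 1],
    [None, 1, 1, 1],
]
s, rho = oob_strength_correlation(y, oob_preds, classes=[0, 1])
print(s, rho, rho * (1 - s ** 2) / s ** 2)
```

With only 3 trees each instance gets very few out-of-bag votes, so these estimates are extremely noisy; they only become meaningful with many more trees.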

Thanks in advance,

Martin



Received on Fri Feb 10 04:36:04 2006