Re: [R] randomForest and missing data

From: Bálint Czúcz <czucz_at_botanika.hu>
Date: Tue 09 Jan 2007 - 21:30:48 GMT

An improved version of the original random forest algorithm is available in the "party" package. Some additional information on the details is given here:
http://www.stat.uni-muenchen.de/sfb386/papers/dsp/paper490.pdf

I do not know whether it solves your missing-data problem, but it may be worth checking.
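
A minimal sketch of how one might check this, assuming the "party" package is installed and that its cforest() function accepts predictors containing NA. The data set (airquality, which ships with R), the variable names aq and cf, and the tuning values are illustrative choices only, not taken from the paper above; passing maxsurrogate through cforest_unbiased() is my understanding of the interface and may differ between versions.

```r
library(party)

# airquality ships with R; Solar.R (a predictor here) contains NAs.
# Keep only the rows where the response Ozone is known.
aq <- airquality[!is.na(airquality$Ozone), ]

set.seed(1)
# Assumption: maxsurrogate is forwarded to the underlying tree control,
# so each split may also store surrogate splits for missing values.
cf <- cforest(Ozone ~ Solar.R + Wind + Temp, data = aq,
              controls = cforest_unbiased(ntree = 50, mtry = 2,
                                          maxsurrogate = 2))

# Predict for the same rows, including those with a missing Solar.R.
pred <- predict(cf, newdata = aq)
```

If the fit and the prediction both run without complaining about the missing Solar.R values, that at least suggests the package tolerates incomplete predictors; whether the results are sensible for a given problem is a separate question.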

Best regards,

Bálint

On 1/4/07, Darin A. England <england@cs.umn.edu> wrote:
>
> Does anyone know a reason why, in principle, a call to randomForest
> cannot accept a data frame with missing predictor values? If each
> individual tree is built using CART, then it seems like this
> should be possible. (I understand that one may impute missing values
> using rfImpute or some other method, but I would like to avoid doing
> that.)
>
> If this functionality were available, then when the trees are being
> constructed and when subsequent data are put through the forest, one
> would also specify an argument for the use of surrogate rules, just
> like in rpart.
>
> I realize this question is very specific to randomForest, as opposed
> to R in general, but any comments are appreciated. I suppose I am
> looking for someone to say "It's not appropriate, and here's why
> ..." or "Good idea. Please implement and post your code."
>
> Thanks,
>
> Darin England, Senior Scientist
> Ingenix
>
> ______________________________________________
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
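
For reference, the surrogate-rule behaviour in rpart that the quoted message refers to can be sketched as follows. This is only an illustration on the airquality data that ships with R; the names aq, fit and new_obs and the control values are arbitrary choices, and nothing here implies that randomForest offers the same arguments.

```r
library(rpart)

# Solar.R contains missing values; drop only rows with a missing response.
aq <- airquality[!is.na(airquality$Ozone), ]

# maxsurrogate: how many surrogate splits to keep per node.
# usesurrogate = 2: use surrogates in order; if all are missing,
# send the observation in the majority direction.
fit <- rpart(Ozone ~ Solar.R + Wind + Temp, data = aq,
             control = rpart.control(maxsurrogate = 5, usesurrogate = 2))

# A new observation with a missing predictor can still be put
# through the tree, because surrogate splits stand in for Temp.
new_obs <- data.frame(Solar.R = 190, Wind = 7.4, Temp = NA_real_)
pred <- predict(fit, newdata = new_obs)
```

Rows with a missing Solar.R are kept during fitting because rpart's default na.action only drops rows whose response or all predictors are missing.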



Received on Wed Jan 10 08:38:08 2007

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.1.8, at Wed 10 Jan 2007 - 10:30:25 GMT.
