Re: [R] randomForest and missing data

From: Torsten Hothorn <Torsten.Hothorn_at_rzmail.uni-erlangen.de>
Date: Wed 10 Jan 2007 - 09:57:27 GMT

On Tue, 9 Jan 2007, Bálint Czúcz wrote:

> There is an improved version of the original random forest algorithm
> available in the "party" package (you can find some additional
> information on the details here:
> http://www.stat.uni-muenchen.de/sfb386/papers/dsp/paper490.pdf ).
>
> I do not know whether it solves your problem with missing data,
> but it may be worth a look.
>

Yes, `cforest()' is able to deal with missing values. More specifically, the implementation is based on conditional inference trees (`ctree()'), which are able to set up surrogate splits.
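A minimal sketch of how this might look in practice, assuming the "party" package is installed. The `airquality' data set (shipped with R) has missing values in `Solar.R'. The values of `ntree', `mtry' and `maxsurrogate' are illustrative only, and the sketch assumes that `maxsurrogate' is passed through `cforest_unbiased()' to the tree-level controls, as it is for `ctree_control()':

```r
library(party)  # assumed to be installed, e.g. install.packages("party")

# Drop rows with a missing response; rows with missing predictors are kept.
airq <- subset(airquality, !is.na(Ozone))

set.seed(290875)

# maxsurrogate sets how many surrogate splits are evaluated per node
# (the default is 0). The values below are illustrative, not tuned.
forest <- cforest(Ozone ~ ., data = airq,
                  controls = cforest_unbiased(ntree = 50, mtry = 3,
                                              maxsurrogate = 3))

# Predict for every row, including those with a missing Solar.R value.
pred <- predict(forest, newdata = airq)
```

No imputation step is involved here: observations with a missing value in the primary split variable are sent down the tree using the surrogate splits.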

Torsten

> Best regards:
>
> Bálint
>
> On 1/4/07, Darin A. England <england@cs.umn.edu> wrote:
>>
>> Does anyone know a reason why, in principle, a call to randomForest
>> cannot accept a data frame with missing predictor values? If each
>> individual tree is built using CART, then it seems like this
>> should be possible. (I understand that one may impute missing values
>> using rfImpute or some other method, but I would like to avoid doing
>> that.)
>>
>> If this functionality were available, then when the trees are being
>> constructed and when subsequent data are put through the forest, one
>> would also specify an argument for the use of surrogate rules, just
>> like in rpart.
>>
>> I realize this question is very specific to randomForest, as opposed
>> to R in general, but any comments are appreciated. I suppose I am
>> looking for someone to say "It's not appropriate, and here's why
>> ..." or "Good idea. Please implement and post your code."
>>
>> Thanks,
>>
>> Darin England, Senior Scientist
>> Ingenix
>>
>> ______________________________________________
>> R-help@stat.math.ethz.ch mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>



Received on Wed Jan 10 21:03:15 2007

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.1.8, at Wed 10 Jan 2007 - 10:30:25 GMT.
