Re: [R] [handling] Missing [values in randomForest]

From: Kevin Bartz <bartzk_at_yahoo-inc.com>
Date: Tue 13 Sep 2005 - 09:17:20 EST


Hi Jan-Paul,

You definitely want to be careful with na.omit in randomForest -- that wipes out any row with even one NA. If NAs are sprawled throughout your dataset, na.omit might end up killing a lot of rows. Here's my usual MO for missing values:

  1. "impute" in Hmisc fills in gaps with the mean, median, most common value, etc.
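  2. rfImpute (randomForest package): starts from a rough median/mode fill, then grows a forest on the filled-in data and refines the fills using the forest's proximities. It needs a response with no NAs.
  3. aregImpute (Hmisc): similar in spirit to rfImpute, but it fits additive regression models on bootstrap samples and draws the fills by predictive mean matching, so you get multiple imputations.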
  4. You may want to consider using a single tree ("rpart" package) in this case instead of a forest. Single trees deal with missing values cleanly through surrogate splits.
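
For what it's worth, here is a rough sketch of a few of the options above. The built-in airquality data is only a stand-in for your own dataset, and na.roughfix (randomForest's median/mode fill) is swapped in as a simple fill in the spirit of option 1; it is not Hmisc's impute. The seed, iter and ntree values are arbitrary.

```r
# Sketch only: airquality ships with R and stands in for the real data.
library(randomForest)  # na.roughfix, rfImpute
library(rpart)         # single tree with surrogate splits

data(airquality)
# rfImpute needs a complete response, so drop rows where Ozone itself is NA.
dat <- airquality[!is.na(airquality$Ozone), ]

# How many rows would na.omit throw away?
n_lost <- nrow(dat) - nrow(na.omit(dat))

# Simple fill: column medians for numeric columns, most frequent level for factors.
dat_rough <- na.roughfix(dat)

# Proximity-based imputation (option 2); the response comes back as the first column.
set.seed(1)
dat_rf <- rfImpute(Ozone ~ ., data = dat, iter = 3, ntree = 100)

# Single tree (option 4): rpart keeps rows with missing predictors
# and routes them through surrogate splits.
fit_tree <- rpart(Ozone ~ ., data = dat)
pred_tree <- predict(fit_tree, newdata = dat)
```

Comparing n_lost with nrow(dat) gives a quick feel for how much na.omit would cost before you decide whether imputing is worth the trouble.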

Good luck!

Kevin

-----Original Message-----
From: r-help-bounces@stat.math.ethz.ch [mailto:r-help-bounces@stat.math.ethz.ch] On Behalf Of Uwe Ligges
Sent: Sunday, September 11, 2005 3:44 AM
To: Jan-Paul Roodbol
Cc: r-help@stat.math.ethz.ch
Subject: Re: [R] [handling] Missing [values in randomForest]

Jan-Paul Roodbol wrote:

> Does anyone know if randomForest in R can handle a
> dataset with missing values?

See ?randomForest: you can omit observations that include NAs by specifying

na.action=na.omit
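
A minimal sketch of such a call, assuming the built-in airquality data as a stand-in for the poster's dataset (the formula, seed and ntree are illustrative choices only):

```r
library(randomForest)

data(airquality)
n_total <- nrow(airquality)
n_kept  <- nrow(na.omit(airquality))  # rows with no NA in any column

set.seed(1)
# na.action = na.omit drops every row containing an NA before the forest is grown.
fit <- randomForest(Ozone ~ ., data = airquality,
                    na.action = na.omit, ntree = 100)
```

Comparing n_kept with n_total shows how many observations the forest actually sees.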

Please do not cross-post!
Please specify a sensible subject!

Uwe Ligges

> Thank you
>
> Kind regards
>
> Jan-Paul
>
> ______________________________________________
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html



Received on Tue Sep 13 09:27:56 2005

This archive was generated by hypermail 2.1.8 : Sun 23 Oct 2005 - 16:57:12 EST