Re: [R] use "caret" to rank predictors by random forest model

From: mxkuhn <mxkuhn_at_gmail.com>
Date: Mon, 14 Mar 2011 10:45:35 -0700

Xiaoqi,

You need to specify the candidate sizes yourself. There are other search algorithms that automatically pick the size (such as genetic algorithms), but I don't have those in the package yet.
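
A minimal sketch of what specifying the sizes looks like, assuming simulated data (the data frame x, the factor y and the candidate sizes 2, 4 and 6 are made up for illustration and are not from Xiaoqi's problem). rfe() resamples each candidate size and reports the one with the best performance:

```r
library(caret)         # rfe(), rfeControl(), rfFuncs, predictors()
library(randomForest)  # fitted internally by rfFuncs

## Hypothetical data: 10 predictors, only V1 and V2 drive the binary response
set.seed(1)
n <- 200
x <- as.data.frame(matrix(rnorm(n * 10), ncol = 10))  # columns V1 ... V10
y <- factor(ifelse(x$V1 + x$V2 + rnorm(n, sd = 0.5) > 0, "yes", "no"))

## Random forest helper functions, 5-fold cross-validation
ctrl <- rfeControl(functions = rfFuncs, method = "cv", number = 5)

## The candidate subset sizes must be supplied by the user
rfProfile <- rfe(x, y, sizes = c(2, 4, 6), rfeControl = ctrl)

rfProfile$optsize      # best size among the candidates that were evaluated
predictors(rfProfile)  # names of the predictors in the selected subset
```

The full predictor set is evaluated in addition to the sizes given, so the reported optimum can also be all 10 predictors.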

Another approach is to use univariate filtering (see the sbf function in caret).
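
A rough sketch of the filtering approach, again on simulated data (x and y are hypothetical). It assumes the random forest helper set rfSBF that ships with caret; as far as I recall, its default filter keeps predictors whose univariate p-value is at most 0.05, so no subset sizes have to be supplied:

```r
library(caret)         # sbf(), sbfControl(), rfSBF, predictors()
library(randomForest)  # fitted internally by rfSBF

## Hypothetical data: 10 predictors, only V1 and V2 drive the binary response
set.seed(2)
n <- 200
x <- as.data.frame(matrix(rnorm(n * 10), ncol = 10))  # columns V1 ... V10
y <- factor(ifelse(x$V1 + x$V2 + rnorm(n, sd = 0.5) > 0, "yes", "no"))

## Filter the predictors one at a time, then fit a random forest
## on the predictors that pass; 5-fold cross-validation
ctrl <- sbfControl(functions = rfSBF, method = "cv", number = 5)
sbfFit <- sbf(x, y, sbfControl = ctrl)

predictors(sbfFit)  # predictors that passed the filter on the full data
```

Here the number of retained predictors falls out of the filter threshold rather than a list of candidate sizes.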

Max

On Mar 13, 2011, at 8:49 PM, Xiaoqi Cui <xcui_at_mtu.edu> wrote:

> Thanks for your prompt reply!
>
> You're right, I didn't add the parameter "importance=TRUE" when I used function "train" to fit the random forest model. Once I used the above parameter, everything went well. Also the functions "varImp" and "plot" work well too.
>
> I noticed "caret" is really good at selecting important predictors. I have another question about using the package "caret" to select the best subset of predictors. As far as I know, the function "rfe" can be used to select the optimal set of important predictors given a series of candidate subset sizes. I'm wondering if "caret" can automatically give the best size of the selected subset without the user providing the candidate sizes. Thanks,

>
> Best,
>
> Xiaoqi
> ----- Original Message -----
> From: "Max Kuhn" <mxkuhn_at_gmail.com>
> To: "Xiaoqi Cui" <xcui_at_mtu.edu>
> Cc: r-help_at_r-project.org
> Sent: Monday, March 7, 2011 2:33:06 PM GMT -06:00 US/Canada Central
> Subject: Re: [R] use "caret" to rank predictors by random forest model
>
> It would help if you provided the code that you used for the caret functions.
>
> The most likely issue is not using importance = TRUE in the call to train().
>
> I believe that I've only implemented code for plotting the varImp
> objects resulting from train() (e.g. there is plot.varImp.train but not
> plot.varImp).
>
> Max
>
> On Mon, Mar 7, 2011 at 3:27 PM, Xiaoqi Cui <xcui_at_mtu.edu> wrote:

>> Hi,
>> 
>> I'm using the package "caret" to rank predictors with a random forest model and to draw a predictor importance plot. I used the commands below:
>> 
>> rf.fit <- randomForest(x, y, ntree = 500, importance = TRUE)
>> ## "x" is a matrix whose columns are predictors, "y" is a binary response vector
>> ## Then I ranked the predictors by rf.fit$importance[, "MeanDecreaseAccuracy"]
>> ## Then draw the importance plot
>> varImpPlot(rf.fit)
>> 
>> As you can see, all the functions I used come directly from the package "randomForest" rather than from "caret". So I'm wondering if the package "caret" has functions that can do the above ranking and plotting.
>> 
>> In fact, I tried the functions "train", "varImp" and "plot" from the package "caret". The random forest model built by "train" could not be passed to "varImp", which gave an error message like "subscripts out of bounds". The function "plot" doesn't work either.
>> 
>> So I'm wondering if anybody has encountered the same problem before, and could shed some light on this. I would really appreciate your help.
>> 
>> Thanks,
>> Xiaoqi
>> 
>> ______________________________________________
>> R-help_at_r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>> 

>
>
>
> --
>
> Max


Received on Mon 14 Mar 2011 - 17:49:24 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Mon 14 Mar 2011 - 18:00:21 GMT.