Re: [R] error in random forest

From: Nagu <>
Date: Fri, 07 Mar 2008 17:27:19 -0800

Thank you very much. I'll dig into the data and verify the consistency between the training and testing variables and their levels.
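
A minimal sketch of that check, in base R only. The helper name find_new_levels and the toy data frames are hypothetical stand-ins (the real objects would be the training predictors and x.OM), and it assumes the predictors are held in data frames with matching column names. It compares the declared levels of each shared factor, on the assumption that this is what the predict method objects to.

```r
# Hypothetical helper: for each factor column shared by train and test, list
# the levels declared in the test set that are absent from the training set.
find_new_levels <- function(train, test) {
  # Declared levels for factors; observed non-missing values otherwise.
  lv <- function(x) {
    if (is.factor(x)) levels(x) else unique(as.character(x[!is.na(x)]))
  }
  shared <- intersect(names(train), names(test))
  is_fac <- vapply(shared,
                   function(v) is.factor(train[[v]]) || is.factor(test[[v]]),
                   logical(1))
  out <- lapply(shared[is_fac],
                function(v) setdiff(lv(test[[v]]), lv(train[[v]])))
  names(out) <- shared[is_fac]
  out[lengths(out) > 0]   # keep only the columns that actually have a mismatch
}

# Toy data standing in for the real training and test predictors.
train <- data.frame(f = factor(c("a", "b", "c")),
                    g = factor(c("x", "y", "x")),
                    z = c(1, 2, 3))
test  <- data.frame(f = factor(c("a", "d")),
                    g = factor(c("x", "y")),
                    z = c(4, 5))

bad <- find_new_levels(train, test)
print(bad)   # a named list: one entry per offending factor
```

With 25 factor variables, the names of the returned list show which columns to look at first.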

On Fri, Mar 7, 2008 at 5:14 PM, <> wrote:
> The error message is pretty clear, really. To spell it out a bit more,
> what you have done is as follows.
> Your training set has factor variables in it. Suppose one of them is
> "f". In the training set it has 5 levels, say.
> Your test set also has a factor "f", as it must, but it appears that in
> the test set it has 6 levels, or more, or levels that do not agree with
> those for "f" in the training set.
> This mismatch means that the predict method for randomForest cannot use
> this test set.
> What you have to do is make sure that the factor levels agree for every
> factor in both test and training set. One way to do this is to put the
> test and training set together with rbind(...) say, and then separate
> them again. But even this will still leave you with a problem, because
> your training set will have some factor levels empty that are not empty
> in the test set. The error will most likely be more subtle, though.
> You really need to sort this out yourself. It is not particularly an R
> problem, but a confusion over data. To be useful, your training set
> needs to cover the field for all levels of every factor. Think about it.
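
One way to make the levels agree, sketched in base R under stated assumptions: instead of the rbind(...) route described above, each factor in the test set is re-declared with the levels of the matching training factor. The helper name align_levels and the toy data are hypothetical. Test values that never occurred in training become NA, so those rows still have to be handled separately, which is the data problem the advice above points at.

```r
# Hypothetical helper: re-declare every factor in `test` with the levels of
# the matching factor in `train`. Values unseen in training become NA.
align_levels <- function(train, test) {
  for (v in intersect(names(train), names(test))) {
    if (is.factor(train[[v]])) {
      test[[v]] <- factor(as.character(test[[v]]), levels = levels(train[[v]]))
    }
  }
  test
}

# Toy data standing in for the real training and test predictors.
train <- data.frame(f = factor(c("a", "b", "c")), z = c(1, 2, 3))
test  <- data.frame(f = factor(c("a", "d", "b")), z = c(4, 5, 6))

aligned <- align_levels(train, test)

# Rows containing any NA; in this toy data that is only the row whose
# factor value ("d") was never seen in training.
unseen <- !complete.cases(aligned)
print(aligned[unseen, ])
```

Whether to drop such rows, recode them, or enlarge the training set is a modelling decision the code cannot make.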
> -----Original Message-----
> From: []
> On Behalf Of Nagu
> Sent: Saturday, 8 March 2008 5:37 AM
> To:;
> Subject: [R] error in random forest
> Hi,
> I get the following error when I try to predict the probabilities of a
> test sample:
> Error in predict.randomForest(fit.EBA.OM.rf.50, x.OM, type = "prob") :
> New factor levels not present in the training data
> I have about 630 predictor variables in the dataset x.OM (25 factor
> variables and the remaining are continuous variables). Any ideas on
> how to trace it?
> Thank you,
> Nagu
> ______________________________________________
> mailing list
> PLEASE do read the posting guide
> and provide commented, minimal, self-contained, reproducible code.
Received on Sat 08 Mar 2008 - 01:31:15 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Sat 08 Mar 2008 - 02:30:20 GMT.
