Re: [R] randomForest.error: length of response must be the same as predictors

From: Gavin Simpson <gavin.simpson_at_ucl.ac.uk>
Date: Thu, 03 Jul 2008 09:50:22 +0100

On Thu, 2008-07-03 at 12:11 +0530, Soumyadeep Nandi wrote:
> My data looks like:
> A,B,C,D,Class
> 1,2,0,2,cl1
> 1,5,1,9,cl1
> 3,2,1,2,cl2
> 7,2,1,2,cl2
> 2,2,1,2,cl2
> 1,2,1,5,cl2
> 0,2,1,2,cl2
> 4,2,1,2,cl2
> 3,5,1,2,cl2
> 3,2,12,3,cl2
> 3,2,4,2,cl2
>
> The steps followed are:
> trainfile <- read.csv("TrainFile",head=TRUE)
> datatrain <- subset(trainfile,select=c(-Class))
> classtrain <- (subset(trainfile,select=Class))
> rf <- randomForest(datatrain, classtrain)
>
> Error in randomForest.default(classtrain, datatrain) :
> length of response must be the same as predictors
> In addition: Warning message:
> In randomForest.default(classtrain, datatrain) :
> The response has five or fewer unique values. Are you sure you want to do
> regression?
>
> Could someone suggest where I am going wrong?

Yep, look at class(classtrain):

> class(classtrain)

[1] "data.frame"

subset() returns a data.frame, which is a special case of a list. The length of a list (and therefore of a data frame) is its number of components (columns), not its number of rows, which is probably not what you expect:

> length(classtrain)

[1] 1

There is *1* component in the list, one '$' bit that you can get at. Hence randomForest() complains: to it, the length of y (1, as given by length()) does not match the number of observations in x.

Note that ?randomForest does state that y should be a response 'vector', so you are not supplying what is required.
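
To see the mismatch directly, here is a minimal sketch that rebuilds the posted data inline instead of reading "TrainFile" (the data.frame() call is my stand-in for the read.csv() step, and Class is made a factor explicitly, which is what read.csv() did by default at the time of this thread):

```r
# Stand-in for read.csv("TrainFile", head = TRUE): the 11 rows from the post.
trainfile <- data.frame(
  A = c(1, 1, 3, 7, 2, 1, 0, 4, 3, 3, 3),
  B = c(2, 5, 2, 2, 2, 2, 2, 2, 5, 2, 2),
  C = c(0, 1, 1, 1, 1, 1, 1, 1, 1, 12, 4),
  D = c(2, 9, 2, 2, 2, 5, 2, 2, 2, 3, 2),
  Class = factor(c("cl1", "cl1", rep("cl2", 9)))
)

datatrain  <- subset(trainfile, select = -Class)  # predictors only
classtrain <- subset(trainfile, select = Class)   # still a data frame

class(classtrain)        # "data.frame"
length(classtrain)       # 1  (number of columns, i.e. list components)
nrow(classtrain)         # 11 (number of rows)
nrow(datatrain)          # 11
length(classtrain[, 1])  # 11 (a plain vector, here a factor)
```

So the one-column data frame has length 1, while the vector inside it has the 11 elements that match the rows of datatrain.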

Two ways to proceed:

rf <- randomForest(Class ~ ., data = trainfile)

or, if you really don't want the formula parsing, force the empty dimension to be dropped by subsetting the single column:

rf <- randomForest(datatrain, classtrain[,1])

[NB: as classtrain is of class "data.frame", drop() will not work on it, because a data frame doesn't have a dim attribute]
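
As a rough illustration of the above, the sketch below shows three equivalent ways of pulling the response out as a vector, confirms that drop() leaves the data frame untouched, and then fits both forms of the call. The small data frame is again my inline stand-in for the posted file, the seed is arbitrary, and the randomForest calls are guarded so the rest runs even if the package is not installed. Note that in R >= 4.0 read.csv() no longer converts strings to factors by default, so on real data you may need stringsAsFactors = TRUE or an explicit factor() for classification.

```r
# Inline stand-in for the posted data (11 rows, Class as a factor).
trainfile <- data.frame(
  A = c(1, 1, 3, 7, 2, 1, 0, 4, 3, 3, 3),
  B = c(2, 5, 2, 2, 2, 2, 2, 2, 5, 2, 2),
  C = c(0, 1, 1, 1, 1, 1, 1, 1, 1, 12, 4),
  D = c(2, 9, 2, 2, 2, 5, 2, 2, 2, 3, 2),
  Class = factor(c("cl1", "cl1", rep("cl2", 9)))
)
datatrain  <- subset(trainfile, select = -Class)
classtrain <- subset(trainfile, select = Class)

# Three equivalent ways to get the response as a vector (a factor here).
y1 <- classtrain[, 1]
y2 <- classtrain[[1]]
y3 <- classtrain$Class
identical(y1, y2) && identical(y2, y3)  # TRUE

# drop() does not help: the result is still a one-column data frame.
dropped <- drop(classtrain)
is.data.frame(dropped)                  # TRUE

# Fit only if the randomForest package is available.
if (requireNamespace("randomForest", quietly = TRUE)) {
  set.seed(1)  # arbitrary seed, only for repeatability
  rf_xy      <- randomForest::randomForest(x = datatrain, y = y1)
  rf_formula <- randomForest::randomForest(Class ~ ., data = trainfile)
  print(rf_xy)
}
```

Because the response is a factor, both calls fit a classification forest rather than a regression.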

HTH,

G

>
> Thanks
>
> ______________________________________________
> R-help_at_r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



Received on Thu 03 Jul 2008 - 09:16:08 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Thu 03 Jul 2008 - 12:31:52 GMT.
