Re: [R] To get more digits in precision of predict function of randomForests

From: Uwe Ligges <ligges_at_statistik.tu-dortmund.de>
Date: Mon, 25 Feb 2008 18:31:32 +0100

Nagu wrote:
> Thank you Uwe Ligges.
>
> Yes. I had only 50 trees. I run into memory problems when growing a
> large number of trees. Also, I am going to post my next question in a
> separate thread, but it does no harm to ask here: how do I deal
> with large datasets when using randomForest? I have datasets of
> approximately 500000 x 650, and R just can't deal with them (it reports
> memory allocation problems).

If you want to use all variables at the same time (otherwise use database access), you will run into trouble with less than 4 GB of RAM or so, but it might work well on a machine with 32 GB, I guess.
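
For a rough sense of scale, here is a back-of-the-envelope estimate. It assumes every one of the 650 columns is stored as a double (8 bytes per value), which the thread does not state, and it counts only a single copy of the data, not the extra copies R or randomForest may make while fitting.

```r
# Rough size of a 500000 x 650 data set held entirely as doubles.
# Assumption: all columns are numeric (8 bytes per value); integer or
# factor columns would need less.
n_rows <- 500000
n_cols <- 650
bytes  <- n_rows * n_cols * 8   # total bytes for one copy of the data
gib    <- bytes / 2^30          # convert bytes to GiB
round(gib, 2)                   # about 2.42 GiB for a single copy
```

Under that assumption, one copy of the data already takes most of a 4 GB machine, before any trees are grown.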

> Are there any better ways to deal with
> large datasets in R? For example, S-PLUS had something like the bigData
> library.

The bigData library only works for some methods such as lm/glm, not for random forests.

Uwe Ligges

>
> Thank you,
> Nagu
>
> On Mon, Feb 25, 2008 at 1:56 AM, Uwe Ligges
> <ligges_at_statistik.tu-dortmund.de> wrote:

>>
>>
>>  Nagu wrote:
>>  > Hi,
>>  >
>>  > I am using randomForest for a classification problem. The predict
>>  > function in the randomForest package, when asked to return the
>>  > probabilities, has a precision of two digits after the decimal point.
>>  > I need at least four digits of precision for the predicted
>>  > probabilities. How do I achieve this?
>>
>>  For me it gives the desired precision, adapting the
>>  ?predict.randomForest example:
>>
>>  library(randomForest)
>>  data(iris)
>>  set.seed(111)
>>  ind <- sample(2, nrow(iris), replace = TRUE, prob = c(0.8, 0.2))
>>  iris.rf <- randomForest(Species ~ ., data = iris[ind == 1, ], ntree = 2000)
>>  iris.pred <- predict(iris.rf, iris[ind == 2, ], type = "prob")
>>  iris.pred
>>
>>  Maybe you do not have much more than 1000 trees in your bag?
>>
>>  Uwe Ligges
>>
>>  >
>>  > Thank you,
>>  > Nagu
>>  >
>>  > ______________________________________________
>>  > R-help_at_r-project.org mailing list
>>  > https://stat.ethz.ch/mailman/listinfo/r-help
>>  > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>  > and provide commented, minimal, self-contained, reproducible code.
>>
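
On the precision question in the quoted message above, a small sketch of why the number of trees matters. It assumes that predict(..., type = "prob") reports the fraction of trees voting for each class, so a forest of ntree trees can only produce multiples of 1/ntree. The helper prob_step and the simulated vote counts are illustrative only and are not part of the randomForest package.

```r
# Smallest possible step between vote-fraction probabilities.
# Assumption: predict(..., type = "prob") reports votes / ntree.
prob_step <- function(ntree) 1 / ntree

steps <- data.frame(ntree = c(50, 1000, 2000, 10000))
steps$step <- prob_step(steps$ntree)
print(steps)

# Simulated vote counts for five cases in a 50-tree forest
# (made-up data, not real randomForest output).
set.seed(1)
votes <- rbinom(5, size = 50, prob = 0.3)
probs <- votes / 50
print(probs)  # every value is a multiple of 0.02
```

Under this assumption, 50 trees give steps of 0.02, which matches the two digits reported, and steps of 0.0001 would need on the order of 10000 trees.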

______________________________________________
R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Received on Mon 25 Feb 2008 - 17:35:44 GMT

