Re: [R] sampsize in Random Forests

From: Federico <fedeabascal_at_gmail.com>
Date: Tue, 15 Apr 2008 03:57:26 -0700 (PDT)

...

On 10 Mar, 17:00, "Liaw, Andy" <andy_l..._at_merck.com> wrote:
> Are you sure there are 100 sites in your data? Here's an example:
>
> R> library(randomForest)
> randomForest 4.5-23
> Type rfNews() to see new features/changes/bug fixes.
> R> f <- factor(sample(1:4, nrow(iris), replace=TRUE))
> R> rf1 <- randomForest(iris[1:4], iris[[5]], strata=f,
>        sampsize=rep(5, nlevels(f)))
> R> rf1
>
> Call:
> randomForest(x = iris[1:4], y = iris[[5]], strata = f,
>     sampsize = rep(5, nlevels(f)))
> Type of random forest: classification
> Number of trees: 500
> No. of variables tried at each split: 2
>
> OOB estimate of error rate: 4.67%
> Confusion matrix:
>            setosa versicolor virginica class.error
> setosa         50          0         0        0.00
> versicolor      0         47         3        0.06
> virginica       0          4        46        0.08
>
>
>
> > -----Original Message-----
> > From: r-help-boun..._at_r-project.org
> > [mailto:r-help-boun..._at_r-project.org] On Behalf Of Naiara Pinto
> > Sent: Sunday, March 09, 2008 5:19 PM
> > To: r-h..._at_r-project.org
> > Subject: [R] sampsize in Random Forests
>
> > Hi all,
>
> > I have a dataset where each point is assigned to a class A, B, C, or
> > D. Each point is also assigned to a study site. Each study site is
> > coded with a number ranging between 1-100. This information is stored
> > in the vector studySites.
>
> > I want to run randomForests using stratified sampling, so I
> > chose the option
> > strata = factor(studySites)
>
> > But I am not sure how to control the number of samples taken from each
> > study site. I tried to use 10 points from each study site:
> > mySampSize = rep(10, 100)
>
> > So my function call looks like:
> > RF = randomForest(myClass~., data=myData, mtry=5, importance=TRUE,
> > strata = factor(studySites), sampsize=mySampSize)
>
> > But randomForest gives me the following error:
> > Error in randomForest.default(m, y, ...) :
> > sampsize can not be larger than class frequency
>
> > Does anybody have any idea why this happens?
>
> > Thank you very much,
>
> > Naiara.
>
> > ______________________________________________
> > R-h..._at_r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>

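For what it's worth, one plausible reading of the error quoted above is that at least one study site has fewer than 10 points, or that fewer than 100 distinct site codes actually occur in the data, so rep(10, 100) asks for more than some stratum can supply. Below is a minimal sketch of how one might check this and build a sampsize vector that never exceeds the per-site counts. The site codes and counts are made up (the real studySites vector is not shown in the thread), iris stands in for myData, and siteFactor and siteCounts are names introduced only for illustration.

```r
# Hypothetical stand-in for the studySites vector from the question:
# four site codes with unequal counts, shuffled, one per row of iris.
set.seed(42)
studySites <- sample(rep(c(1, 2, 3, 7), times = c(80, 50, 15, 5)))

siteFactor <- factor(studySites)
siteCounts <- table(siteFactor)   # points available per site

nlevels(siteFactor)               # number of sites actually present
siteCounts[siteCounts < 10]       # sites that cannot supply 10 points

# Ask for up to 10 points per site, capped at what each site has.
# One element per level of the strata factor, in level order.
mySampSize <- pmin(10, as.vector(siteCounts))

# Fit only if the randomForest package is installed.
if (requireNamespace("randomForest", quietly = TRUE)) {
  rf <- randomForest::randomForest(iris[1:4], iris[[5]],
                                   strata = siteFactor,
                                   sampsize = mySampSize)
  print(rf)
}
```

Whether capping the small sites or dropping them is the right choice depends on the study design; the sketch only shows how to keep sampsize consistent with the strata.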


Received on Tue 15 Apr 2008 - 11:29:32 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Tue 15 Apr 2008 - 11:30:28 GMT.
