From: Weiwei Shi <helprhelp_at_gmail.com>

Date: Fri 08 Jul 2005 - 05:47:16 EST

It works.

Thanks.

But (just curious):

why, when I tried this previously, did I get

> is.vector(sample.size)
[1] TRUE

I also tried as.vector(sample.size) and assigned the result to sampsz; it
still does not work.
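
A possible explanation for the behaviour asked about above, offered as a sketch rather than a definitive diagnosis: in R a list is itself a kind of vector, so is.vector() returns TRUE for it and as.vector() leaves it a list. Only unlist() (or building the counts with sapply() in the first place) gives an atomic vector that sum() accepts. The Diag values below are made up to match the class counts quoted in this thread; they are not the original data.

```r
# Hypothetical response variable reproducing the class counts from the thread
Diag <- rep(1:6, times = c(36, 12, 120, 36, 30, 30))

# lapply() always returns a list, one element per class
sample.size <- lapply(1:6, function(i) sum(Diag == i))

is.vector(sample.size)            # TRUE: a list counts as a vector
is.list(sample.size)              # TRUE
is.atomic(sample.size)            # FALSE: this is what sum() objects to
is.list(as.vector(sample.size))   # TRUE: as.vector() does not flatten a list

# unlist() flattens the list into an atomic vector
sampsz <- unlist(sample.size)
is.atomic(sampsz)                 # TRUE
sum(sampsz)                       # 264

# sapply() simplifies to an atomic vector directly, so no unlist() is needed
sampsz2 <- sapply(1:6, function(i) sum(Diag == i))
```

is.atomic() (or is.numeric()) is the more useful check here, since is.vector() does not distinguish a list from a numeric vector.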

On 7/7/05, Duncan Murdoch <murdoch@stats.uwo.ca> wrote:

> On 7/7/2005 3:38 PM, Weiwei Shi wrote:

> > Hi there:
> > I have a question on random forest:
> >
> > Recently I helped a friend with her random forest and I came across this problem:
> > her dataset has 6 classes, and the sample size is pretty small:
> > 264, with a class distribution like this (Diag is the response variable):
> > sample.size <- lapply(1:6, function(i) sum(Diag==i))
> >> sample.size
> > [[1]]
> > [1] 36
> >
> > [[2]]
> > [1] 12
> >
> > [[3]]
> > [1] 120
> >
> > [[4]]
> > [1] 36
> >
> > [[5]]
> > [1] 30
> >
> > [[6]]
> > [1] 30
> >
> > I assigned this sample.size to sampsz for stratified sampling
> > and I got the following error:
> > Error in sum(..., na.rm = na.rm) : invalid 'mode' of argument
> >
> > If I use sampsz=c(36, 12, 120, 36, 30, 30), then it is fine. Could you
> > tell me why?
>
> The sum() function knows what to do on a vector, but not on a list. You
> can turn your sample.size variable into a vector using
>
> unlist(sample.size)
>
> Duncan Murdoch
>
> > By the way, for a classification problem like this with uneven class
> > sizes, do you have any suggestions for improving its accuracy? I
> > tried the c() approach to make sampsz work, but the result is
> > similar.
> >
> > Thanks,
> >
> > weiwei
> >
> > On 6/30/05, Liaw, Andy <andy_liaw@merck.com> wrote:
> >> The limitation comes from the way categorical splits are represented in the
> >> code: For a categorical variable with k categories, the split is
> >> represented by k binary digits: 0=right, 1=left. So it takes k bits to
> >> store each split on k categories. To save storage, this is `packed' into a
> >> 4-byte integer (32-bit), thus the limit of 32 categories.
> >>
> >> The current Fortran code (version 5.x) by Breiman and Cutler gets around
> >> this limitation by storing the split in an integer array. While this lifts
> >> the 32-category limit, it takes much more memory to store the splits. I'm
> >> still trying to figure out a more memory efficient way of storing the splits
> >> without imposing the 32-category limit. If anyone has suggestions, I'm all
> >> ears.
> >>
> >> Best,
> >> Andy
> >>
> >> > From: Arne.Muller@sanofi-aventis.com
> >> >
> >> > Hello,
> >> >
> >> > I'm using the random forest package. One of my factors in the
> >> > data set contains 41 levels (I can't code this as a numeric
> >> > value - in terms of linear models this would be a random
> >> > factor). The randomForest call comes back with an error
> >> > telling me that the limit is 32 categories.
> >> >
> >> > Is there any reason for this particular limit? Maybe it's
> >> > possible to recompile the module with a different cutoff?
> >> >
> >> > thanks a lot for your help,
> >> > kind regards,
> >> >
> >> > Arne
> >> >
> >> > ______________________________________________
> >> > R-help@stat.math.ethz.ch mailing list
> >> > https://stat.ethz.ch/mailman/listinfo/r-help
> >> > PLEASE do read the posting guide!
> >> > http://www.R-project.org/posting-guide.html

--
Weiwei Shi, Ph.D

"Did you always know?"
"No, I did not. But I believed..."
---Matrix III

Received on Fri Jul 08 06:08:17 2005
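
Andy Liaw's description of categorical splits quoted above (k binary digits, 0 = right and 1 = left, packed into one 32-bit word) can be illustrated with a small sketch. This is not the randomForest package's actual code: pack_split() and unpack_split() are hypothetical helper names, and the sketch uses R doubles for the arithmetic because R's own integer type is signed.

```r
# Pack a split on k categories into a single number, one bit per category.
# goes_left[i] is TRUE if category i is sent to the left child (bit = 1).
pack_split <- function(goes_left) {
  k <- length(goes_left)
  if (k > 32) stop("more than 32 categories cannot fit in one 32-bit word")
  sum(2^(which(goes_left) - 1))
}

# Recover the left/right assignment of each of the k categories.
unpack_split <- function(code, k) {
  (code %/% 2^(0:(k - 1))) %% 2 == 1
}

split5 <- c(TRUE, FALSE, TRUE, TRUE, FALSE)   # categories 1, 3, 4 go left
code <- pack_split(split5)                    # 1 + 4 + 8 = 13
unpack_split(code, 5)                         # TRUE FALSE TRUE TRUE FALSE

# A factor with 41 levels, as in Arne's data, does not fit in one word
too_many <- tryCatch(pack_split(rep(TRUE, 41)), error = function(e) e)
```

Storing one array element per category instead, as described for the Fortran version 5.x code, removes the limit at the cost of more memory per split.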

This archive was generated by hypermail 2.1.8 : Fri 03 Mar 2006 - 03:33:21 EST