Re: [R] what does cut(data, breaks=n) actually do?

From: Domenico Vistocco <vistocco_at_unicas.it>
Date: Thu, 13 Dec 2007 10:17:20 +0100

cut(data, breaks=n)
splits the range of the data into n bins of (approximately) equal width. The bins have the same width, not the same number of observations.

The bin width is obtained as:

    (max(data) - min(data)) / n
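
As a minimal sketch of the point above (the seed, sample size and distribution parameters are made up for illustration and are not the data from this post), the nominal width can be computed by hand and compared with what cut() returns:

```r
# Hypothetical data; the seed and parameters are illustrative only.
set.seed(1)
x <- rnorm(100, mean = 5, sd = 8)
n <- 3

# Nominal bin width, as in the formula above
width <- (max(x) - min(x)) / n
width

# cut() returns a factor with n levels, one per bin
bins <- cut(x, breaks = n)
nlevels(bins)

# Counts per bin usually differ: the bins are equal in width, not in count
table(bins)
```

If bins with roughly equal counts are wanted instead, one common alternative is to pass explicit breaks, for example computed with quantile().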

 > x=rnorm(x)
 > cut(x,breaks=3)
 [1] (1.79,9.97]  (-6.39,1.79] (9.97,18.2]  (9.97,18.2]  (-6.39,1.79]
 [6] (1.79,9.97]  (-6.39,1.79] (1.79,9.97]  (-6.39,1.79] (-6.39,1.79]
Levels: (-6.39,1.79] (1.79,9.97] (9.97,18.2]

Then you have:
 > 18.2-9.97
[1] 8.23
 > 9.97-1.79
[1] 8.18
 > 1.79+6.39
[1] 8.18
 >

 > (max(x)-min(x))/3
[1] 8.164187

I don't know the reason for the small differences (I am wondering about that myself). I hope this is useful.
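
A possible explanation, going by the ?cut help page rather than anything stated in this thread: when breaks is a single number, the outer limits are moved away by 0.1% of the range so that the extreme values fall inside the bins, and the labels are rounded to 3 significant digits by default (dig.lab = 3). The sketch below uses made-up data (the seed and parameters are assumptions) and only illustrates these two effects. The exact internals may differ between R versions.

```r
# Hypothetical data; not the values shown in the transcript above.
set.seed(1)
x <- rnorm(10, mean = 5, sd = 8)
n <- 3

rx <- range(x)
dx <- diff(rx)

# Default labels are rounded to 3 significant digits
levels(cut(x, breaks = n))

# More digits in the labels show the breaks more precisely
levels(cut(x, breaks = n, dig.lab = 8))

# Outer limits after moving each end away by 0.1% of the range (per ?cut)
lo <- rx[1] - dx / 1000
hi <- rx[2] + dx / 1000

# Nominal width versus the average width over the extended span
dx / n
(hi - lo) / n
```

With the numbers in the transcript, 8.164187 * 1.002 is about 8.18, which matches two of the three label differences. The third, 8.23, would then come from the upper limit being rounded to 18.2 in the label.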
domenico

melissa cline wrote:
> Hello,
>
> I'm trying to bin a quantity into 2-3 bins for calculating entropy and
> mutual information. One of the approaches I'm exploring is the cut()
> function, which is what the mutualInfo function in binDist uses. When it's
> called in the format cut(data, breaks=n), it somehow splits the data into n
> distinct bins. Can anyone tell me how cut() decides where to cut?
>
> Thanks,
>
> Melissa
>
>
>
> ---------------------------------------------------------------
> Melissa Cline, Independent Investigator
> MCD Biology, UCSC
>
>
> ______________________________________________
> R-help_at_r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
>



Received on Thu 13 Dec 2007 - 09:28:01 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.