Re: [R] Measuring dispersion

From: S. Nunes <snunes_at_gmail.com>
Date: Wed, 18 Jun 2008 00:10:18 +0100

Thanks for the suggestion. However, I'm looking for a score, since my goal is to rank thousands of distributions. For instance, given a large text, I would like to rank all terms according to their distribution (dispersion) within the text.

Terms evenly distributed in the text should have a low score. Terms following an uneven distribution should rank higher.
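
One way to turn the Kolmogorov-Smirnov suggestion into a score is to use the KS distance itself (the largest gap between the empirical CDF and the uniform CDF on [0, 1]) rather than the test's p-value. The following is only a minimal sketch in Python under that assumption. The function name `ks_uniform_score` and the term positions are hypothetical and are not taken from this thread. In R, `ks.test(x, "punif")` should report the same statistic.

```python
def ks_uniform_score(values):
    """KS distance between the empirical CDF of `values` and Uniform(0, 1)."""
    xs = sorted(values)
    n = len(xs)
    if n == 0:
        raise ValueError("need at least one value")
    d = 0.0
    for i, x in enumerate(xs):
        # At x the empirical CDF jumps from i/n to (i+1)/n; the uniform CDF is x.
        d = max(d, (i + 1) / n - x, x - i / n)
    return d


# Hypothetical term positions, already rescaled to the interval [0, 1].
positions = {
    "even": [0.1, 0.3, 0.5, 0.7, 0.9],
    "clustered": [0.1, 0.1, 0.15, 0.2, 0.9],
}

# Highest score first, i.e. the most unevenly distributed term ranks first.
ranking = sorted(positions, key=lambda t: ks_uniform_score(positions[t]), reverse=True)
```

Two caveats on this sketch: the score is small but never exactly 0 for evenly spaced values, and it depends on the number of observations, so terms with very different counts may need some normalization before they are compared.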

Thanks again,

--
Sérgio Nunes

2008/6/17 Moshe Olshansky <m_olshansky_at_yahoo.com>:

> You could also look at the difference between your empirical distribution and the uniform distribution (something like the Kolmogorov-Smirnov test).
>
>
> --- On Tue, 17/6/08, S. Nunes <snunes_at_gmail.com> wrote:
>
>> From: S. Nunes <snunes_at_gmail.com>
>> Subject: [R] Measuring dispersion
>> To: r-help_at_stat.math.ethz.ch
>> Received: Tuesday, 17 June, 2008, 7:56 PM
>> Hi,
>>
>> I'm looking for a function to measure the dispersion of a set of
>> values ranging from 0 to 1.
>> This function should be 0 if all the values are evenly spaced within
>> the interval, and it should be > 0 if the values are clustered.
>> The more clustered the values are, the higher the function should be.
>>
>> An example:
>>
>> [0; 0.2; 0.4; 0.6; 0.8; 1] - function should be ~ 0
>> [0; 0.1; 0.1; 0.15; 1] - function should be > 1
>>
>> This data comes from time-dependent observations recorded between a
>> start time (0) and an end time (1).
>> I want to find out which series are more clustered, i.e. less evenly
>> distributed.
>>
>> I'm going to test kurtosis for this, but it doesn't seem to be the best
>> tool for the job.
>> As I understand it, kurtosis evaluates the "strength" of a single central
>> peak. My data can have multiple peaks (clusters).
>>
>> Thanks in advance for your comments,
>> --
>> Sérgio Nunes
>>
>> ______________________________________________
>> R-help_at_r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained,
>> reproducible code.
>
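
The original question asks for a value that is 0 for evenly spaced observations and grows as they cluster. One candidate, which nobody in this thread proposed, is the coefficient of variation of the gaps between consecutive sorted values. The sketch below is in Python and assumes that measure. The name `gap_cv` is hypothetical, and it uses the population standard deviation.

```python
import math


def gap_cv(values):
    """Coefficient of variation of the gaps between consecutive sorted values."""
    xs = sorted(values)
    gaps = [b - a for a, b in zip(xs, xs[1:])]
    if not gaps:
        raise ValueError("need at least two values")
    mean = sum(gaps) / len(gaps)
    if mean == 0:
        raise ValueError("values must not all be identical")
    # Population variance of the gaps (divide by the number of gaps).
    var = sum((g - mean) ** 2 for g in gaps) / len(gaps)
    return math.sqrt(var) / mean


# The two example series from the original question.
evenly_spaced = gap_cv([0, 0.2, 0.4, 0.6, 0.8, 1])  # essentially 0
clustered = gap_cv([0, 0.1, 0.1, 0.15, 1])          # roughly 1.39
```

This only looks at spacing between the observations themselves. It ignores how far the first and last values lie from 0 and 1, unless those endpoints are added to the data.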
Received on Tue 17 Jun 2008 - 23:55:13 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Wed 18 Jun 2008 - 12:30:44 GMT.

Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-help. Please read the posting guide before posting to the list.
