Re: [R] Wilcoxon signed rank test and its requirements

From: Atte Tenkanen <attenka_at_utu.fi>
Date: Sat, 26 Jun 2010 04:50:01 +0300

Greg Snow wrote on 25.6.2010 at 21.55:

> Let me see if I understand. You actually have the data for the
> whole population (the entire piece) but you have some pre-defined
> sections that you want to see if they differ from the population,
> or more meaningfully they are different from a randomly selected
> set of measures. Is that correct?

Exactly.

>
> If so, since you have the entire population of interest you can
> create the actual sampling distribution (or a good approximation of
> it). Just take random samples from the population of the given
> size (matching the subset you are interested in) and calculate the
> means (or other value of interest), probably 10,000 to 1,000,000
> samples. Now compare the value from your predefined subset to the
> set of random values you generated to see if it is in the tail or not.

Thank you! I will do this.
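
A minimal sketch of the resampling described above, in R. The data here are simulated stand-ins, not the real distance values: `population`, `region_idx` and `n_sim` are hypothetical names, and the region is simply assumed to be 280 consecutive values.

```r
# Hypothetical stand-in data: 'population' plays the role of all 2418
# distance values, 'region_idx' marks the predefined section of interest.
set.seed(1)
population <- sample(0:10, 2418, replace = TRUE)  # discrete values, many ties
region_idx <- 501:780                             # assumed region, n = 280

observed <- mean(population[region_idx])
n_region <- length(region_idx)

# Reference distribution: means of random subsets of the same size,
# drawn without replacement from the whole piece.
n_sim <- 10000
sim_means <- replicate(n_sim, mean(sample(population, n_region)))

# One-sided Monte Carlo p-value for "the region mean is higher than the
# mean of a random subset"; the +1 counts the observed value as one draw.
p_upper <- (sum(sim_means >= observed) + 1) / (n_sim + 1)
p_upper
```

With real data only the first two assignments would change. A histogram of `sim_means` with a vertical line at `observed` shows whether the region sits in the tail.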

Is this kind of "Monte Carlo evaluation" (?) often used in statistics? If it is, do you know any reference for it?

Atte

>
> --
> Gregory (Greg) L. Snow Ph.D.
> Statistical Data Center
> Intermountain Healthcare
> greg.snow_at_imail.org
> 801.408.8111
>
>
>> -----Original Message-----
>> From: r-help-bounces_at_r-project.org
>> [mailto:r-help-bounces_at_r-project.org] On Behalf Of Atte Tenkanen
>> Sent: Thursday, June 24, 2010 11:04 PM
>> To: David Winsemius
>> Cc: R mailing list
>> Subject: Re: [R] Wilcoxon signed rank test and its requirements
>>
>> The values come from this kind of process:
>> The musical composition is segmented into so-called 'pitch-class
>> segments', and these segments are compared with one reference set
>> using a distance function. Only some distance values are possible.
>> These distance values can be averaged over bars of music, which
>> produces a smoother distribution, and the 'comparison curve' that
>> shows the distances to the reference set through the piece becomes
>> more readable (see e.g. http://users.utu.fi/attenka/with6.jpg ),
>> but I would prefer to use the original values.
>>
>> Then I want to pick only some regions of the piece and test whether
>> the values in those regions are higher than the mean of all values.
>>
>> Atte
>>
>>> On Jun 24, 2010, at 6:58 PM, Atte Tenkanen wrote:
>>>
>>>> Is there anything for me?
>>>>
>>>> There is a lot of data, n=2418, but there are also a lot of ties.
>>>> My sample n≈250-300
>>>>
>>>
>>> I do not understand why there should be so many ties. You have not
>>> described the measurement process or units. ( ... although you offer
>>> a glimpse without much background later.)
>>>
>>>> I would like to test whether the mean of the sample differs
>>>> significantly from the population mean.
>>>
>>> Why? What is the purpose of this investigation? Why should the mean
>>> of a sample be that important?
>>>
>>>>
>>>> The histogram of the population looks like the attached histogram.
>>>> What test should I use? No choices?
>>>>
>>>> This distribution comes from a musical piece and the values are
>>>> 'tonal distances'.
>>>>
>>>> http://users.utu.fi/attenka/Hist.png
>>>
>>> That picture does not offer much insight into the features of that
>>> measurement. It appears to have much more structure than I would
>>> expect for a sample from a smooth unimodal underlying population.
>>>
>>> --
>>> David.
>>>
>>>>
>>>> Atte
>>>>
>>>>> On 06/24/2010 12:40 PM, David Winsemius wrote:
>>>>>>
>>>>>> On Jun 23, 2010, at 9:58 PM, Atte Tenkanen wrote:
>>>>>>
>>>>>>> Thanks. What I meant to ask is:
>>>>>>>
>>>>>>> How do you test that the data are symmetric enough?
>>>>>>> If they are not, is it OK to use some data transformation?
>>>>>>>
>>>>>>> I ask because it is said:
>>>>>>>
>>>>>>> "The Wilcoxon signed rank test does not assume that the data are
>>>>>>> sampled from a Gaussian distribution. However it does assume that
>>>>>>> the data are distributed symmetrically around the median. If the
>>>>>>> distribution is asymmetrical, the P value will not tell you much
>>>>>>> about whether the median is different than the hypothetical value."
>>>>>>
>>>>>> You are being misled. Simply finding a statement on a statistics
>>>>>> software website, even one as reputable as Graphpad (???), does not
>>>>>> mean that it is necessarily true. My understanding (confirmed by
>>>>>> reviewing "Nonparametric statistical methods for complete and
>>>>>> censored data" by M. M. Desu and Damaraju Raghavarao) is that the
>>>>>> Wilcoxon signed-rank test does not require that the underlying
>>>>>> distributions be symmetric. The above quotation is highly
>>>>>> inaccurate.
>>>>>>
>>>>>
>>>>> To add to what David and others have said, look at the kernel that
>>>>> the U-statistic associated with the WSR test uses: the indicator
>>>>> (0/1) of xi + xj > 0. So WSR tests H0: p = 0.5, where p = the
>>>>> probability that the average of a randomly chosen pair of values is
>>>>> positive. [If there are ties this probably needs to be worded as
>>>>> P[xi + xj > 0] = P[xi + xj < 0], i neq j.]
>>>>>
>>>>> Frank
>>>>>
>>>>> --
>>>>> Frank E Harrell Jr, Professor and Chairman, School of Medicine
>>>>> Department of Biostatistics, Vanderbilt University
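
To make the quantity Frank describes concrete, here is a rough sketch in R that estimates P[xi + xj > 0] and P[xi + xj < 0] over pairs with i neq j. The vector `x` is simulated, hypothetical data (rounded so that ties occur), and the call to `wilcox.test` is only there for comparison; this is an illustration of the kernel, not a reimplementation of the test.

```r
# Hypothetical data: differences from the hypothesised centre, with ties.
set.seed(2)
x <- round(rnorm(30, mean = 0.3), 1)

# All pairwise sums x_i + x_j for i < j (each unordered pair once).
m <- outer(x, x, "+")
s <- m[upper.tri(m)]

# Empirical versions of P[xi + xj > 0] and P[xi + xj < 0].
p_pos <- mean(s > 0)
p_neg <- mean(s < 0)
c(p_pos = p_pos, p_neg = p_neg)

# For comparison: the signed-rank test from base R, using the normal
# approximation because of the ties.
wilcox.test(x, mu = 0, exact = FALSE)
```

If the two proportions are far apart, the null hypothesis that they are equal looks doubtful, whatever the shape of the distribution.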
>>>
>>
>> ______________________________________________
>> R-help_at_r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.

Received on Sat 26 Jun 2010 - 09:39:50 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Sat 26 Jun 2010 - 09:40:36 GMT.
