From: Atte Tenkanen <attenka_at_utu.fi>

Date: Sat, 26 Jun 2010 05:15:06 +0300

R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Received on Sat 26 Jun 2010 - 09:47:14 GMT


Greg Snow wrote on 25.6.2010 at 21.55:

> Let me see if I understand. You actually have the data for the
> whole population (the entire piece) but you have some pre-defined
> sections that you want to see if they differ from the population,
> or more meaningfully they are different from a randomly selected
> set of measures. Is that correct?
>
> If so, since you have the entire population of interest you can
> create the actual sampling distribution (or a good approximation of
> it). Just take random samples from the population of the given
> size (matching the subset you are interested in) and calculate the
> means (or other value of interest), probably 10,000 to 1,000,000
> samples. Now compare the value from your predefined subset to the
> set of random values you generated to see if it is in the tail or not.

Just to check: do you mean doing it this way?

t.test(sample(POPUL, length(SAMPLE), replace = FALSE), mu = mean(SAMPLE), alternative = "less")
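
For comparison: the single t.test() call above draws only one random subset, whereas the quoted description calls for many draws and then asks where the predefined subset's mean falls among them. A minimal sketch of that repeated-draw version follows. It assumes POPUL and SAMPLE are numeric vectors (simulated here, since the real data is not shown) and that the upper tail is the one of interest. The names n_draws, observed, null_means and p_upper are illustrative only.

```r
# Hypothetical stand-ins for the real data: POPUL holds every value in the
# piece, SAMPLE is the predefined subset. Both are simulated for illustration.
set.seed(1)
POPUL  <- sample(0:10, 2418, replace = TRUE)
SAMPLE <- POPUL[1:280]

n_draws  <- 10000            # the quoted text suggests 10,000 to 1,000,000
observed <- mean(SAMPLE)     # value of interest for the predefined subset

# Mean of a random subset of the same size as SAMPLE, repeated n_draws times
null_means <- replicate(
  n_draws,
  mean(sample(POPUL, length(SAMPLE), replace = FALSE))
)

# Share of random subsets whose mean is at least as large as the observed one
p_upper <- mean(null_means >= observed)
p_upper
```

A small p_upper would place the predefined subset in the upper tail of the random subsets. hist(null_means) with abline(v = observed) gives the same comparison as a picture.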

Atte

> --
> Gregory (Greg) L. Snow Ph.D.
> Statistical Data Center
> Intermountain Healthcare
> greg.snow_at_imail.org
> 801.408.8111
>
>
>> -----Original Message-----
>> From: r-help-bounces_at_r-project.org [mailto:r-help-bounces_at_r-project.org] On Behalf Of Atte Tenkanen
>> Sent: Thursday, June 24, 2010 11:04 PM
>> To: David Winsemius
>> Cc: R mailing list
>> Subject: Re: [R] Wilcoxon signed rank test and its requirements
>>
>> The values come from this kind of process:
>> The musical composition is segmented into so-called 'pitch-class
>> segments' and these segments are compared with one reference set
>> using a distance function. Only some distance values are possible.
>> These distance values can be averaged over musical bars, which
>> produces a smoother distribution, and the 'comparison curve' that
>> illustrates the distances to the reference set through a musical
>> piece becomes more readable (see e.g.
>> http://users.utu.fi/attenka/with6.jpg ), but I would prefer to use
>> the original values.
>>
>> Then I want to pick only some regions from the piece and test
>> whether the values in those regions are higher than the mean of all
>> values.
>>
>> Atte
>>
>>> On Jun 24, 2010, at 6:58 PM, Atte Tenkanen wrote:
>>>
>>>> Is there anything for me?
>>>>
>>>> There is a lot of data, n=2418, but there are also a lot of ties.
>>>> My sample n≈250-300
>>>>
>>>
>>> I do not understand why there should be so many ties. You have not
>>> described the measurement process or units. ( ... although you offer
>>> a glimpse without much background later.)
>>>
>>>> I would like to test whether the mean of the sample differs
>>>> significantly from the population mean.
>>>
>>> Why? What is the purpose of this investigation? Why should the mean
>>> of a sample be that important?
>>>
>>>>
>>>> The histogram of the population looks like the attached histogram.
>>>> What test should I use? No choices?
>>>>
>>>> This distribution comes from a musical piece and the values are
>>>> 'tonal distances'.
>>>>
>>>> http://users.utu.fi/attenka/Hist.png
>>>
>>> That picture does not offer much insight into the features of that
>>> measurement. It appears to have much more structure than I would
>>> expect for a sample from a smooth unimodal underlying population.
>>>
>>> --
>>> David.
>>>
>>>>
>>>> Atte
>>>>
>>>>> On 06/24/2010 12:40 PM, David Winsemius wrote:
>>>>>>
>>>>>> On Jun 23, 2010, at 9:58 PM, Atte Tenkanen wrote:
>>>>>>
>>>>>>> Thanks. What I wanted to ask is:
>>>>>>>
>>>>>>> how do you test that the data is symmetric enough?
>>>>>>> If it is not, is it OK to use some data transformation?
>>>>>>>
>>>>>>> given that it is said:
>>>>>>>
>>>>>>> "The Wilcoxon signed rank test does not assume that the data are
>>>>>>> sampled from a Gaussian distribution. However it does assume that
>>>>>>> the data are distributed symmetrically around the median. If the
>>>>>>> distribution is asymmetrical, the P value will not tell you much
>>>>>>> about whether the median is different than the hypothetical value."
>>>>>>
>>>>>> You are being misled. Simply finding a statement on a statistics
>>>>>> software website, even one as reputable as Graphpad (???), does not
>>>>>> mean that it is necessarily true. My understanding (confirmed by
>>>>>> reviewing "Nonparametric statistical methods for complete and
>>>>>> censored data" by M. M. Desu, Damaraju Raghavarao) is that the
>>>>>> Wilcoxon signed-rank test does not require that the underlying
>>>>>> distributions be symmetric. The above quotation is highly inaccurate.
>>>>>>
>>>>>
>>>>> To add to what David and others have said, look at the kernel that
>>>>> the U-statistic associated with the WSR test uses: the indicator (0/1)
>>>>> of xi + xj > 0. So WSR tests H0: p = 0.5 where p = the probability
>>>>> that the average of a randomly chosen pair of values is positive.
>>>>> [If there are ties this probably needs to be worded as
>>>>> P[xi + xj > 0] = P[xi + xj < 0], i neq j.]
>>>>>
>>>>> Frank
>>>>>
>>>>> --
>>>>> Frank E Harrell Jr   Professor and Chairman        School of Medicine
>>>>>                      Department of Biostatistics   Vanderbilt University
>>>
>>
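
The kernel mentioned in the quoted thread (the 0/1 indicator of xi + xj > 0) can be computed directly. The following is a minimal sketch on simulated continuous data, so there are no ties or zeros; x, sums, p_hat, V_by_hand and V_r are illustrative names, not taken from the thread. It estimates p as the share of pairs i < j with a positive sum. It also checks the standard identity that, for tie-free data, the statistic V reported by wilcox.test() equals the number of positive sums over pairs with i <= j.

```r
# Simulated continuous data (no ties, no zeros); x is a hypothetical stand-in.
set.seed(1)
x <- rnorm(30, mean = 0.3)

# Matrix of all pairwise sums x[i] + x[j]
sums <- outer(x, x, "+")

# Kernel averaged over pairs with i < j: an estimate of p = P[xi + xj > 0]
p_hat <- mean(sums[upper.tri(sums)] > 0)

# Without ties or zeros, the signed-rank statistic counts the positive sums
# over pairs with i <= j (the diagonal is included).
V_by_hand <- sum(sums[upper.tri(sums, diag = TRUE)] > 0)
V_r <- unname(wilcox.test(x)$statistic)

c(p_hat = p_hat, V_by_hand = V_by_hand, V_r = V_r)
```

With heavily tied data, such as the distance values discussed in this thread, wilcox.test() falls back to a normal approximation and warns that an exact p-value cannot be computed, so the sketch is meant only to show what quantity the test looks at.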



Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.

Archive generated by hypermail 2.2.0, at Sat 26 Jun 2010 - 10:00:35 GMT.

Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-help.
Please read the posting guide before posting to the list.