Re: [R] trouble with wilcox.test

From: P Ehlers <ehlers_at_math.ucalgary.ca>
Date: Thu 18 Aug 2005 - 17:37:50 EST

Prof Brian Ripley wrote:
> On Wed, 17 Aug 2005, Greg Hather wrote:
>
>> I'm having trouble with the wilcox.test command in R.
>
> Are you sure it is not the concepts that are giving 'trouble'?
> What real problem are you trying to solve here?
>
>
>> To demonstrate the anomalous behavior of wilcox.test, consider
>>
>> > wilcox.test(c(1.5,5.5), c(1:10000), exact = F)$p.value
>> [1] 0.01438390
>>
>> > wilcox.test(c(1.5,5.5), c(1:10000), exact = T)$p.value
>> [1] 6.39808e-07 (this calculation takes noticeably longer).
>>
>> > wilcox.test(c(1.5,5.5), c(1:20000), exact = T)$p.value
>> (R closes/crashes)
>>
>> I believe that wilcox.test(c(1.5,5.5), c(1:10000), exact = F)$p.value
>> yields a bad result because of the normal approximation which R uses
>> when exact = F.
>
> Expecting an approximation to be good in the tail for m=2 is pretty
> unrealistic. But then so is believing the null hypothesis of a common
> *continuous* distribution. Why worry about the distribution under a
> hypothesis that is patently false?
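
For what it is worth, the 0.0144 figure can be reproduced by hand, which makes it plain how little a normal tail has to say when m = 2. The sketch below assumes no ties and the usual continuity-corrected large-sample formula, which I believe is what wilcox.test() applies by default (correct = TRUE) when exact = FALSE. The function name normal_approx_p is mine, not anything in R.

```r
# Two-sided normal-approximation p-value for the rank-sum test.
# Assumes no ties; normal_approx_p is a made-up name for illustration.
normal_approx_p <- function(x, y) {
  m <- length(x)
  n <- length(y)
  w <- sum(outer(x, y, ">"))               # Mann-Whitney count of pairs with x > y
  mu <- m * n / 2                          # null mean of W
  sigma <- sqrt(m * n * (m + n + 1) / 12)  # null standard deviation (no ties)
  d <- w - mu
  z <- (d - sign(d) * 0.5) / sigma         # continuity correction toward the mean
  2 * pnorm(-abs(z))
}

normal_approx_p(c(1.5, 5.5), 1:10000)
```

Here W = 6 against a null mean of 10000, so the approximation is being evaluated roughly 2.4 standard deviations out on the strength of two observations.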
>
> People often refer to this class of tests as `distribution-free', but they
> are not. The Wilcoxon test is designed for power against shift
> alternatives, but here there appears to be a very large difference in
> spread. So
>
> > wilcox.test(5000+c(1.5,5.5), c(1:10000), exact = T)$p.value
> [1] 0.9989005
>
> even though the two samples differ in important ways.
>
>> Any suggestions for how to compute
>> wilcox.test(c(1.5,5.5), c(1:20000), exact = T)$p.value?
>
> I get (current R 2.1.1 on Linux)
>
> > wilcox.test(c(1.5,5.5), c(1:20000), exact = T)$p.value
> [1] 1.59976e-07
>
> and no crash. So the suggestion is to use a machine adequate to the task,
> and that probably means an OS with adequate stack size.
>
>> [[alternative HTML version deleted]]
>
>> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
>
> Please do heed it. What version of R and what machine is this? And do
> take note of the request about HTML mail.
>

One could also try wilcox.exact() in package exactRankTests (0.8-11), which gives (with suitable patience)

[1] 1.59976e-07

even on my puny 256M Windows laptop.
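
As a cross-check that needs neither patience nor stack space: with m = 2 and no ties, the exact null distribution can be counted directly, since every pair of ranks for the two x values is equally likely among choose(N, 2) possibilities. The following is only a sketch for this special case, not how wilcox.test() or wilcox.exact() work internally; the name exact_p_m2 is invented for the illustration.

```r
# Exact two-sided rank-sum p-value for the special case length(x) == 2.
# Assumes no ties; exact_p_m2 is a made-up name for illustration.
exact_p_m2 <- function(x, y) {
  stopifnot(length(x) == 2L, !anyDuplicated(c(x, y)))
  n <- length(y)
  N <- n + 2                      # pooled sample size
  w <- sum(outer(x, y, ">"))      # Mann-Whitney count of pairs with x > y
  w <- min(w, 2 * n - w)          # null distribution is symmetric about n
  s <- w + 3                      # W = R1 + R2 - 3, so W <= w iff R1 + R2 <= s
  a <- seq_len(N)                 # candidate smaller rank
  # for each a, count larger ranks b with a < b <= min(N, s - a)
  favourable <- sum(pmax(0, pmin(N, s - a) - a))
  min(1, 2 * favourable / choose(N, 2))
}

exact_p_m2(c(1.5, 5.5), 1:10000)
exact_p_m2(c(1.5, 5.5), 1:20000)
```

For the first call there are 16 rank pairs with sum at most 9, so the p-value is 32 / choose(10002, 2), which should agree with the 6.39808e-07 quoted above. The second is 32 / choose(20002, 2), matching 1.59976e-07.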

Still, it might be worthwhile adding a "don't do something this silly" error message to wilcox.test() rather than having it crash R. Low priority, IMHO.
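
Until then, a user-level guard along these lines would do. This is only a sketch: guarded_wilcox and the max_exact_n cutoff of 5000 are my own invention, chosen arbitrarily for illustration, and nothing like it exists in wilcox.test() itself.

```r
# Refuse exact = TRUE for very large samples instead of risking a crash.
# guarded_wilcox and max_exact_n are hypothetical; the cutoff is arbitrary.
guarded_wilcox <- function(x, y, exact = NULL, max_exact_n = 5000L, ...) {
  if (isTRUE(exact) && max(length(x), length(y)) > max_exact_n) {
    stop("exact = TRUE requested with more than ", max_exact_n,
         " observations in one sample; use exact = FALSE or a smaller problem")
  }
  wilcox.test(x, y, exact = exact, ...)
}

# Small problems pass straight through to wilcox.test()
guarded_wilcox(c(1.5, 5.5), 1:100, exact = TRUE)$p.value
```

A sensible cutoff would depend on the machine and its stack size, so it is left as an argument.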

Windows XP SP2
"R version 2.1.1, 2005-08-11"

Peter Ehlers