From: David John Allwright <allwrigh_at_maths.ox.ac.uk>

Date: Tue, 05 Jan 2010 12:19:56 +0000 (GMT)

Dear Thomas, Thank you, yes, that sounds good, and I take the point about
integer overflow.

Various questions:

(a) Is there some way I can try out the routine with this modification? (I
am on a Linux system where I am just a user, so I cannot install new
versions of software myself.)

(b) Is there a reference you can give me to a published paper where the
method being used to compute the p-values is described?

Many thanks,

David.

On Fri, 18 Dec 2009, tlumley_at_u.washington.edu wrote:

> I've fixed this by adding 0.5/mn to q. The problem (at least in principle)
> with multiplying them all up is integer overflow.
>
> By the time 0.5/mn underflows to zero, missing one value in the distribution
> won't matter.
>
> -thomas
>
>
> On Fri, 18 Dec 2009, David John Allwright wrote:
>
>> Dear Thomas, Right, thank you. Yes, I haven't looked at the source code
>> (because I don't know C) but something like what you mention could well
>> cause the kind of problems I am seeing: a loop being executed one too few
>> or one too many times. And yes, I think those quantities should be
>> multiplied up by m*n to all become integers so we escape rounding error
>> problems. David.
>>
>> ------------------------------------------------------------------------------
>> On Wed, 16 Dec 2009, tlumley_at_u.washington.edu wrote:
>>
>>> On Tue, 15 Dec 2009, allwrigh_at_maths.ox.ac.uk wrote (in part):
>>>
>>>> x <- 1:5
>>>> y <- c(2.5, 4.5)
>>>> ks.test(x, y)
>>>>
>>>> The value of the D_2,5 statistic is calculated as 0.4 correctly, but the
>>>> p-value is stated by R as 1, though in fact it should be 20/21 = 0.9524.
>>>
>>> What we seem to have here is a rounding error problem.
>>>
>>> In ks.c:psmirnov2x, there is a double loop including
>>>     if(fabs(i / md - j / nd) > q)
>>>         u[j] = 0;
>>>
>>> where md=2, nd=5, and q=3/10.
>>>
>>> Now, to full precision abs(1/2 - 4/5) > 3/10 is false, but at least on
>>> my MacBook it is true in C double precision.
>>>
>>> I'm not sure why the loop is working with doubles, since multiplying by
>>> m*n should make everything an integer.
>>>
>>> -thomas
>>>
>>> Thomas Lumley Assoc. Professor, Biostatistics
>>> tlumley_at_u.washington.edu University of Washington, Seattle
>
> Thomas Lumley Assoc. Professor, Biostatistics
> tlumley_at_u.washington.edu University of Washington, Seattle
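
The rounding problem and the two remedies discussed in the thread above can be tried outside R with a small C sketch. This is not the code in ks.c. It is a minimal illustration that assumes the exact two-sample p-value is obtained by counting lattice paths from (0,0) to (m,n) that stay inside the band |i/m - j/n| <= q. The names band_test, inside_band, count_paths, p_value and MAX_DIM are hypothetical, and q is passed as an integer k with q = k/(m*n).

```c
#include <math.h>

#define MAX_DIM 64  /* hypothetical limit on m and n for this sketch */

/* Three ways of testing whether lattice point (i, j) lies in the band. */
enum band_test {
    NAIVE_DOUBLE,   /* the comparison quoted from ks.c in the thread */
    NUDGED_DOUBLE,  /* the same comparison with 0.5/(m*n) added to q */
    EXACT_INTEGER   /* everything multiplied up by m*n */
};

/* Returns 1 if (i, j) is inside the band, 0 otherwise; q = k / (m * n). */
static int inside_band(int i, int j, int m, int n, int k, enum band_test how)
{
    double md = m, nd = n;
    double q = (double)k / (md * nd);

    switch (how) {
    case NAIVE_DOUBLE:
        return !(fabs(i / md - j / nd) > q);
    case NUDGED_DOUBLE:
        /* i/m - j/n is a multiple of 1/(m*n), so the threshold now sits
           halfway between two attainable values. */
        return !(fabs(i / md - j / nd) > q + 0.5 / (md * nd));
    default: {
        /* |i/m - j/n| <= k/(m*n)  is the same as  |i*n - j*m| <= k. */
        long diff = (long)i * n - (long)j * m;
        if (diff < 0)
            diff = -diff;
        return diff <= k;
    }
    }
}

/* Number of monotone lattice paths from (0,0) to (m,n) that never leave
   the band. Returns -1 if m or n is outside 1..MAX_DIM. */
static double count_paths(int m, int n, int k, enum band_test how)
{
    double u[MAX_DIM + 1][MAX_DIM + 1];

    if (m < 1 || n < 1 || m > MAX_DIM || n > MAX_DIM)
        return -1.0;

    for (int i = 0; i <= m; i++) {
        for (int j = 0; j <= n; j++) {
            if (!inside_band(i, j, m, n, k, how))
                u[i][j] = 0.0;
            else if (i == 0 && j == 0)
                u[i][j] = 1.0;
            else
                u[i][j] = (i > 0 ? u[i - 1][j] : 0.0)
                        + (j > 0 ? u[i][j - 1] : 0.0);
        }
    }
    return u[m][n];
}

/* P(D >= d) = 1 - (paths inside the band) / choose(m + n, m). */
static double p_value(int m, int n, int k, enum band_test how)
{
    double total = 1.0;

    for (int i = 1; i <= m; i++)
        total = total * (n + i) / i;
    return 1.0 - count_paths(m, n, k, how) / total;
}
```

For the example in the thread (m = 2, n = 5, q = 3/10, so k = 3), the EXACT_INTEGER and NUDGED_DOUBLE variants both find exactly one admissible path out of 21, so p_value(2, 5, 3, EXACT_INTEGER) is 20/21. With NAIVE_DOUBLE the point (1, 4) may be rejected, as Thomas reports for his machine, which leaves no admissible path and a p-value of 1. The integer form avoids rounding altogether but, as noted in the thread, the products i*n and j*m can in principle overflow for very large samples.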

R-devel_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Received on Tue 05 Jan 2010 - 12:51:28 GMT
