Re: [R] regression towards the mean, AS paper November 2007

From: Kevin Wright
Date: Tue, 18 Dec 2007 11:02:51 -0600

On Dec 17, 2007 3:10 PM, hadley wickham wrote:
> > This has nothing to do really with the question that Troels asked,
> > but the exposition quoted from the AA paper is unnecessarily confusing.
> > The phrase ``Because X0 and X1 have identical marginal
> > distributions ...''
> > throws the reader off the track. The identical marginal distributions
> > are irrelevant. All one needs is that the ***means*** of X0 and X1
> > be the same, and then the null hypothesis tested by a paired t-test
> > is true and so the p-values are (asymptotically) Uniform[0,1]. With
> > a sample size of 100, the ``asymptotically'' bit can be safely ignored
> > for any ``decent'' joint distribution of X0 and X1. If one further
> > assumes that X0 - X1 is Gaussian (which has nothing to do with X0 and
> > X1 having identical marginal distributions) then ``asymptotically''
> > turns into ``exactly''.
> Another related issue is that uniform distributions don't look very uniform:
> hist(runif(100))
> hist(runif(1000))
> hist(runif(10000))
> Be sure to calibrate your eyes (and your bin width) before rejecting
> the hypothesis that the distribution is uniform.
> Hadley

Thanks for the example, Hadley. To me, this suggests we should stop teaching histograms in Stat 101 and instead use quantile plots, which give excellent results for n=100 and even surprisingly good results for n=10:

# Q-Q plot of runif() draws against Uniform(0,1) quantiles for several n
for(i in c(10, 100, 1000, 10000)) {
  qqplot(runif(i), qunif(seq(1/i, 1, length.out=i)), main=i,
         xlim=c(0,1), ylim=c(0,1),
         xlab="runif", ylab="Uniform distribution quantiles")
}
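
The same kind of plot can be pointed back at the claim quoted at the top of this message: that a paired t-test gives (asymptotically) Uniform[0,1] p-values whenever X0 and X1 have equal means, whether or not their marginal distributions match. What follows is only a rough sketch of such a check, not anything from the paper. The choice of a Gaussian X0 and an exponential X1, the seed, the number of simulations, and the names n_sims and pvals are all my own assumptions for illustration.

```r
# Hypothetical check: X0 and X1 share a mean (1) but not a marginal distribution.
set.seed(1)      # arbitrary seed, for reproducibility only
n_sims <- 2000   # number of simulated data sets (an assumption)
n <- 100         # sample size mentioned in the thread

pvals <- replicate(n_sims, {
  x0 <- rnorm(n, mean = 1, sd = 1)  # Gaussian with mean 1
  x1 <- rexp(n, rate = 1)           # exponential, also with mean 1
  t.test(x0, x1, paired = TRUE)$p.value
})

# Compare the simulated p-values with Uniform(0,1) quantiles
qqplot(qunif(ppoints(n_sims)), pvals,
       xlim = c(0, 1), ylim = c(0, 1),
       xlab = "Uniform distribution quantiles", ylab = "Simulated p-values",
       main = "Paired t-test p-values, equal means")
abline(0, 1)     # reference line: points near it suggest uniformity
```

If the quoted argument holds, the points should hug the reference line even though the two marginals are clearly different. Swapping in other equal-mean pairs for x0 and x1 is an easy way to probe what counts as a "decent" joint distribution.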

Kevin (drifting even further off topic)

Received on Tue 18 Dec 2007 - 17:07:26 GMT
