Re: [R] About normality tests...

From: Peter Ehlers
Date: Wed, 23 Jun 2010 12:35:03 -0600

On 2010-06-23 12:05, Ralf B wrote:
> Hi all,
> I have two very large samples of data (10000+ data points) and would
> like to perform normality tests on it. I know that p< .05 means that
> a data set is considered as not normal with any of the two tests. I am
> also aware that large samples tend to lead more likely to normal
> results (Andy Field, 2005).

I think that depends on what you mean by 'tend to lead ...'

> I have a few questions to ensure that I am using them right.
> 1) The Shapiro-Wilk test requires one to provide mean and sd. Is it
> correct to add here the mean and sd of the data itself (since I am
> comparing to a normal distribution with the same parameters) ?
> mySD <- sd(mydata$myfield)
> myMean <- mean(mydata$myfield)
> shapiro.test(rnorm(100, mean = myMean, sd = mySD))

I don't think that your understanding of the S-W test is correct. The test takes no mean or sd; you would just do:

   shapiro.test(mydata$myfield)

to test for Normality. However, shapiro.test() won't accept sample sizes greater than 5000, so use ks.test() instead. Or use a graphical method: I like qq.plot in the 'car' package.
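
As a rough sketch of the options mentioned here, the snippet below uses simulated data in place of mydata$myfield (the vector x and its parameters are made up for illustration). It shows shapiro.test() applied directly to the data, the failure on more than 5000 observations, a one-sample ks.test() against a Normal cdf, and a base-R Q-Q plot. Base R's qqnorm() is used rather than the 'car' function, which I believe is named qqPlot in current versions of that package. Note that plugging the sample's own mean and sd into ks.test() makes its p-value only approximate.

```r
# Simulated stand-in for mydata$myfield (hypothetical data, not the poster's)
set.seed(1)
x <- rnorm(10000, mean = 50, sd = 10)

# shapiro.test() takes the data itself; it has no mean or sd argument.
# Only the first 5000 values are used here, purely to stay within the limit.
res_small <- shapiro.test(x[1:5000])
print(res_small)

# The full sample exceeds the 5000-observation limit, so the call errors;
# tryCatch() captures the error message instead of stopping the script.
res_full <- tryCatch(shapiro.test(x), error = function(e) conditionMessage(e))
print(res_full)

# One-sample Kolmogorov-Smirnov test against a Normal cdf with the
# sample's own mean and sd (estimated parameters: p-value is approximate)
res_ks <- ks.test(x, "pnorm", mean = mean(x), sd = sd(x))
print(res_ks)

# Graphical check with base R: points near the line suggest Normality
qqnorm(x)
qqline(x)
```

With 10000+ points the Q-Q plot is often the most informative of the three, since it shows where and how far the data depart from the reference line rather than only whether they do.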

> 2) If I just want to test each distribution individually, I assume
> that I am doing a one-sample Kolmogorov-Smirnov test. Is that correct?

I don't understand this. What do you mean by 'test ... individually'?

> 3) If I simply want to know if normality exists or not, what should I
> put for the parameter 'alternative' ? Does it actually matter?
> alternative = c("two.sided", "less", "greater")

Leave it at the default 'two.sided' unless you have good reason to suspect that the cdf lies above or below the Normal cdf.
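
To make the role of 'alternative' concrete, here is a small sketch on simulated data (the vector x is hypothetical). It runs ks.test() with each of the three settings; as I understand the implementation, the two-sided statistic is simply the larger of the two one-sided statistics, which is why the default covers departures in either direction.

```r
# Hypothetical sample standing in for the real data
set.seed(2)
x <- rnorm(10000)
m <- mean(x)
s <- sd(x)

# Default: departure from the Normal cdf in either direction
two_sided <- ks.test(x, "pnorm", mean = m, sd = s)

# One-sided: the cdf of x lies above the Normal cdf
greater <- ks.test(x, "pnorm", mean = m, sd = s, alternative = "greater")

# One-sided: the cdf of x lies below the Normal cdf
less <- ks.test(x, "pnorm", mean = m, sd = s, alternative = "less")

# Collect the three statistics side by side for comparison
stats <- c(two.sided = unname(two_sided$statistic),
           greater   = unname(greater$statistic),
           less      = unname(less$statistic))
print(stats)
```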

   -Peter Ehlers

> Thank you,
> Ralf
Received on Wed 23 Jun 2010 - 18:37:52 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.