Re: [R] Normality test

From: Greg Snow <Greg.Snow_at_imail.org>
Date: Sat, 28 May 2011 14:21:56 -0600

To build on Robert's suggestion (which is very good to begin with), you might consider using the vis.test function in the TeachingDemos package with the vt.qqnorm function. This will create the qq plot of your data along with several other qqplots of normal samples of the same size. If you cannot tell which of the plots is your data, then your data is probably close enough to normal for most practical purposes. It will give you a p-value based on your ability to distinguish your data from random normals if you need one.
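The lineup idea can be sketched in base R without the package. The helper name `make_lineup`, the 3-by-3 grid, and the simulated `x` are assumptions for illustration only; the normal decoys here are matched to the mean and standard deviation of the data, which may not be exactly what vt.qqnorm does.

```r
set.seed(1)
x <- rexp(300)  # placeholder data; substitute your own sample

# Hypothetical helper: hide the real data among normal decoys
make_lineup <- function(x, n_panels = 9) {
  pos <- sample(n_panels, 1)  # panel that will hold the real data
  panels <- lapply(seq_len(n_panels), function(i) {
    if (i == pos) x else rnorm(length(x), mean(x), sd(x))
  })
  list(panels = panels, pos = pos)
}

lineup <- make_lineup(x)

# Draw one normal qq plot per panel
op <- par(mfrow = c(3, 3), mar = c(2, 2, 2, 1))
for (i in seq_along(lineup$panels)) {
  qqnorm(lineup$panels[[i]], main = paste("Panel", i))
  qqline(lineup$panels[[i]])
}
par(op)

# lineup$pos  # look at this only after you have made your guess
```

With TeachingDemos installed, the interactive version should be along the lines of `vis.test(x, vt.qqnorm)`, run in an interactive session so you can click on the plot you think is the real data.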

If you need more precision, then the most precise normality test is SnowsPenultimateNormalityTest also in TeachingDemos. However, the documentation for that function tends to be more useful than the function itself.

If you really want to choose among the different normality tests in nortest (or elsewhere), then you should investigate what assumptions they make and which alternatives they are most powerful against. Also decide what types of non-normality you really care about, then use that to choose among them. Consider 2 distributions: one is uniform between 0 and 1 with height 1; the other also has height 1 between 0 and 0.99, but also has height 1 between 999.99 and 1000, and is zero elsewhere. Are these 2 distributions different in a meaningful way? They have very different means and variances, but for most samples they will look the same (and if you throw out outliers they will look even more similar). Different tests give different results because they focus on different types of differences.
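A small simulation makes the two-distribution example concrete. This is only a sketch: the sampler name `r_spiked`, the sample size of 20, and the number of replicates are assumptions chosen for illustration.

```r
set.seed(2)

# Hypothetical sampler for the second distribution:
# density 1 on [0, 0.99] and on [999.99, 1000], zero elsewhere
r_spiked <- function(n) {
  far <- runif(n) < 0.01  # 1% of the mass sits near 1000
  ifelse(far, runif(n, 999.99, 1000), runif(n, 0, 0.99))
}

n <- 20
n_sims <- 2000
sims <- replicate(n_sims, r_spiked(n))  # n x n_sims matrix, one sample per column

# Share of samples with no point near 1000; theory gives 0.99^n
no_far_point <- mean(apply(sims, 2, max) < 1)
print(c(simulated = no_far_point, theory = 0.99^n))

# Population means: 0.5 for the uniform, about 10.49 for the spiked one
mean_unif   <- 0.5
mean_spiked <- 0.99^2 / 2 + 0.01 * (999.99 + 1000) / 2
print(c(uniform = mean_unif, spiked = mean_spiked))
```

How often a sample "looks the same" depends on its size: the chance of seeing no point near 1000 is 0.99^n, which is large for small n but drops to roughly 5% at n = 300.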

-----Original Message-----
From: r-help-bounces_at_r-project.org [mailto:r-help-bounces_at_r-project.org] On Behalf Of Robert Baer
Sent: Friday, May 27, 2011 5:28 PM
To: Salil Sharma; R-help_at_r-project.org
Subject: Re: [R] Normality test

> I am writing to inquire about normality test given in nortest package. I
> have a random data set consisting of 300 samples. I am curious about which
> normality test in R would give me precise measurement, whether data sample
> is following normal distribution. As p value in each test is different in
> each test, if you could help me identifying a suitable test in R for this
> medium size of data, it will be grateful.

I am neither a statistician nor an expert on these types of tests, but I'm guessing that you are unlikely to get a good answer even from people with such qualifications, as such judgments can only be made in the context of a specific problem. You have not provided us with such a problem (please read the posting guide).

That admonishment aside, I typically start by using qqnorm() and qqline() to plot my data against the expected theoretical quantiles. If your data is perfectly normal, the points will fall right along the line. Skewness and deviations from normality in the tails produce very characteristic patterns in the plots, which you can learn about by plotting some simulated data that is left-skewed, right-skewed, long-tailed, or short-tailed.
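One possible way to generate those four patterns is sketched below. The particular distributions (negated exponential, exponential, t with 3 degrees of freedom, uniform) and the sample size of 300 are illustrative choices only; other distributions with the same shape would do as well.

```r
set.seed(3)
n <- 300

# One simulated sample per shape
shapes <- list(
  "Left-skewed"  = -rexp(n),        # mirror image of an exponential
  "Right-skewed" = rexp(n),
  "Long-tailed"  = rt(n, df = 3),   # heavier tails than the normal
  "Short-tailed" = runif(n)         # lighter tails than the normal
)

# Normal qq plot with reference line for each sample
op <- par(mfrow = c(2, 2))
for (nm in names(shapes)) {
  qqnorm(shapes[[nm]], main = nm)
  qqline(shapes[[nm]])
}
par(op)
```

Rerunning with different seeds gives a feel for how much these patterns vary from sample to sample at this sample size.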

I personally find this graphical feedback to be a much more useful way to understand my data than doing a single normality test that produces a p-value based upon assumptions I may not be privy to.

For more, see the help by typing:
?qqnorm
?qqline

Rob



Robert W. Baer, Ph.D.
Professor of Physiology
Kirksville College of Osteopathic Medicine
A. T. Still University of Health Sciences
800 W. Jefferson St.
Kirksville, MO 63501
660-626-2322
FAX 660-626-2965

R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Received on Sat 28 May 2011 - 20:24:18 GMT


Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Sun 29 May 2011 - 06:40:10 GMT.
