From: Peter Dalgaard <p.dalgaard_at_biostat.ku.dk>

Date: Thu 13 Jul 2006 - 08:01:07 EST

<Matthew.Findley@ch2m.com> writes:

> R Users:
>
> My question is probably more about elementary statistics than the
> mechanics of using R, but I've been dabbling in R (version 2.2.0) and
> used it recently to test some data.
>
> I have a relatively small set of observations (n = 12) of arsenic
> concentrations in background groundwater and wanted to test my
> assumption of normality. I used the Shapiro-Wilk test (by calling
> shapiro.test() in R) and I'm not sure how to interpret the output.
> Here's the input/output from the R console:
>
> > As = c(13, 17, 23, 9.5, 20, 15, 11, 17, 21, 14, 22, 13)
> > shapiro.test(As)
>
>         Shapiro-Wilk normality test
>
> data:  As
> W = 0.9513, p-value = 0.6555
>
> How do I interpret this? I understand, from poking around the internet,
> that the higher the W statistic the "more normal" the data.
>
> What is the null hypothesis - that the data is normally distributed?

Yup.

> What does the p-value tell me? 65.55% chance of what - getting
> W-statistic greater than or equal to 0.9513 (I picked this up from the
> Dalgaard book, Introductory Statistics with R, but it's not really
> sinking in with respect to how it applies to a Shapiro-Wilk test)?

*Smaller* or equal - W=1.0 is the "perfect fit". The W statistic is pretty much the Pearson correlation applied to the curve drawn by qqnorm(). (The exact definition of what goes on the x axis differs slightly, I believe.)
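The link to qqnorm() can be checked numerically. The following is only a rough sketch: it assumes the *squared* correlation of the QQ-plot coordinates is the quantity to compare with W, and since the x-axis positions used by shapiro.test() differ from those of qqnorm(), the two numbers should come out close but not identical.

```r
# Data from the original question.
As <- c(13, 17, 23, 9.5, 20, 15, 11, 17, 21, 14, 22, 13)

# Shapiro-Wilk statistic as computed by shapiro.test().
w <- unname(shapiro.test(As)$statistic)

# Coordinates of the normal QQ plot, without drawing it:
# qq$x holds the theoretical quantiles, qq$y the observed values.
qq <- qqnorm(As, plot.it = FALSE)

# Squared Pearson correlation between theoretical and observed quantiles
# (an assumed stand-in for W, not its exact definition).
r2 <- cor(qq$x, qq$y)^2

print(c(W = w, qq_r_squared = r2))
```

Both values are bounded above by 1, which is why small values of W, not large ones, count as "extreme".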

A low p-value would indicate that the W is too extreme to be explained by chance variation - i.e. evidence against normal distribution. In the present case you have no evidence against normal distribution (beware that this is not evidence _for_ normality).
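To make "smaller or equal" concrete, the p-value can be approximated by simulation: draw many samples of size 12 from a true normal distribution, compute W for each, and count how often W falls at or below the observed value. This is a sketch only; the seed and the number of simulations are arbitrary choices, and shapiro.test() itself uses an analytic approximation rather than simulation, so the two figures should agree only roughly.

```r
set.seed(1)  # arbitrary seed, used only for reproducibility

# Data from the original question.
As <- c(13, 17, 23, 9.5, 20, 15, 11, 17, 21, 14, 22, 13)
w_obs <- unname(shapiro.test(As)$statistic)

# Distribution of W under the null hypothesis: samples of the same size
# drawn from a normal distribution (W does not depend on mean or sd).
n_sim <- 5000
w_null <- replicate(n_sim, unname(shapiro.test(rnorm(length(As)))$statistic))

# Simulated p-value: share of null samples with W <= observed W.
p_sim <- mean(w_null <= w_obs)

print(c(simulated = p_sim, reported = shapiro.test(As)$p.value))
```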

(Personally, I'm not too happy about these normality tests. They tend to lack power in small samples and in large samples they often reject distributions which are perfectly adequate for normal-theory analysis. Learning to evaluate a QQ plot seems a better idea.)
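Both points can be tried out with a short sketch. The QQ plot is drawn with qqnorm() and qqline(); the simulation then estimates how often the test rejects at the 5% level for data from a t distribution with 10 degrees of freedom. The choice of that distribution as a stand-in for "close enough to normal", the helper name reject_rate, and the number of replications are assumptions made for illustration only.

```r
set.seed(1)  # arbitrary seed, used only for reproducibility

# Data from the original question.
As <- c(13, 17, 23, 9.5, 20, 15, 11, 17, 21, 14, 22, 13)

# Normal QQ plot with a reference line through the quartiles;
# roughly straight points suggest no gross departure from normality.
qqnorm(As)
qqline(As)

# Hypothetical helper: share of simulated t-distributed samples of size n
# for which shapiro.test() gives p < alpha.
reject_rate <- function(n, n_sim = 200, df = 10, alpha = 0.05) {
  p <- replicate(n_sim, shapiro.test(rt(n, df = df))$p.value)
  mean(p < alpha)
}

rate_small <- reject_rate(12)    # same sample size as the arsenic data
rate_large <- reject_rate(5000)  # largest sample size shapiro.test() accepts

print(c(n_12 = rate_small, n_5000 = rate_large))
```

The same mild departure from normality is rarely flagged at n = 12 and flagged far more often at n = 5000, which is the behaviour described above.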

> The method description - retrieved using ?shapiro.test() - is a bit
> light on details.

There are references therein, though...

--
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark          Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard@biostat.ku.dk)                  FAX: (+45) 35327907

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Received on Thu Jul 13 08:07:10 2006

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.

Archive generated by hypermail 2.1.8, at Thu 13 Jul 2006 - 10:17:48 EST.
