Matthew,

You may find the following documents useful if your venture into environmental statistics is serious.

First, the 1992 EPA Addendum on GW statistics--links at http://www.epa.gov/correctiveaction/resource/guidance/sitechar/gwstats/gwstats.htm

The second is Helsel's book, available from the USGS:

http://pubs.usgs.gov/twri/twri4a3/

Both documents have good discussions of normality tests for GW data, including probability plot correlation coefficients and variations in the (x) plotting position--Blom, Cunnane, etc.
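As a rough illustration of the probability plot correlation coefficient (PPCC) idea, the sketch below correlates the sorted arsenic values quoted further down in this thread with normal quantiles at two plotting positions. It assumes the general form p_i = (i - a) / (n + 1 - 2a), with a = 3/8 for Blom and a = 2/5 for Cunnane, which is what R's ppoints() computes. The helper name ppcc is made up for this example and is not taken from either document.

```r
# Arsenic data from the question quoted later in this thread
As <- c(13, 17, 23, 9.5, 20, 15, 11, 17, 21, 14, 22, 13)

# Hypothetical helper: correlation between the ordered data and the
# normal quantiles at plotting positions p_i = (i - a) / (n + 1 - 2a)
ppcc <- function(x, a) {
  p <- ppoints(length(x), a = a)
  cor(sort(x), qnorm(p))
}

ppcc_blom    <- ppcc(As, a = 3/8)  # Blom plotting position
ppcc_cunnane <- ppcc(As, a = 2/5)  # Cunnane plotting position

print(c(Blom = ppcc_blom, Cunnane = ppcc_cunnane))
```

For n = 12 the two plotting positions give nearly the same coefficient. Turning either number into a test still needs a table of PPCC critical values, which is not reproduced here.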

Helsel is a good read: 1.) his writing is so clear, 2.) he gets into nonparametric approaches in so many areas of GW stats, and 3.) the typography is nice--the book is just a pleasant experience all around. Just be advised this is only the beginning...

Oh, yes. It ain't safe to just dabble with environmental (contaminant) data--it is too messy. Go whole hog or pass it up.

Best regards,

Michael Grant (works for the competition :O))

Peter Dalgaard <p.dalgaard@biostat.ku.dk> wrote:

<Matthew.Findley@ch2m.com> writes:

>
> R Users:
>
> My question is probably more about elementary statistics than the
> mechanics of using R, but I've been dabbling in R (version 2.2.0) and
> used it recently to test some data.
>
> I have a relatively small set of observations (n = 12) of arsenic
> concentrations in background groundwater and wanted to test my
> assumption of normality. I used the Shapiro-Wilk test (by calling
> shapiro.test() in R) and I'm not sure how to interpret the output.
> Here's the input/output from the R console:
>
> > As = c(13, 17, 23, 9.5, 20, 15, 11, 17, 21, 14, 22, 13)
> > shapiro.test(As)
>
> Shapiro-Wilk normality test
>
> data: As
> W = 0.9513, p-value = 0.6555
>
> How do I interpret this? I understand, from poking around the internet,
> that the higher the W statistic the "more normal" the data.
>
> What is the null hypothesis - that the data is normally distributed?
>
Yup.
>
> What does the p-value tell me? 65.55% chance of what - getting a
> W-statistic greater than or equal to 0.9513 (I picked this up from the
> Dalgaard book, Introductory Statistics with R, but it's not really
> sinking in with respect to how it applies to a Shapiro-Wilk test)?
>
*Smaller* or equal - W=1.0 is the "perfect fit". The W statistic is
pretty much the Pearson correlation applied to the curve drawn by
qqnorm(). (The exact definition of what goes on the x axis differs
slightly, I believe.)
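A quick way to see the connection to qqnorm() is to compute the correlation of the QQ-plot coordinates directly. This is only a sketch on the arsenic data from the question, with illustrative variable names. Note that it is the squared correlation (essentially the Shapiro-Francia variant of the statistic) that sits closest to W.

```r
# Arsenic data from the question
As <- c(13, 17, 23, 9.5, 20, 15, 11, 17, 21, 14, 22, 13)

# Coordinates of the normal QQ plot, without drawing it
qq <- qqnorm(As, plot.it = FALSE)

# Pearson correlation between theoretical and sample quantiles
r <- cor(qq$x, qq$y)

# Shapiro-Wilk W for comparison
W <- unname(shapiro.test(As)$statistic)

print(c(r = r, r_squared = r^2, W = W))
```

Dropping plot.it = FALSE and following up with qqline(As) draws the plot itself.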
>
A low p-value would indicate that the W is too extreme to be explained
by chance variation - i.e. evidence against normal distribution.
In the present case you have no evidence against normal distribution
(beware that this is not evidence _for_ normality).
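To make "smaller or equal" concrete, one can approximate the null distribution of W by simulation: draw many normal samples of the same size, compute W for each, and count how often W comes out at or below the observed value. The sketch below assumes 5000 replicates and a fixed seed, both arbitrary choices. The result should land near the p-value reported by shapiro.test(), although the built-in p-value comes from an approximation rather than from simulation.

```r
set.seed(1)  # arbitrary seed, only for reproducibility

# Arsenic data from the question
As <- c(13, 17, 23, 9.5, 20, 15, 11, 17, 21, 14, 22, 13)
obs <- shapiro.test(As)

# W for many samples of the same size that really are normal
n_rep  <- 5000  # assumed replicate count
w_null <- replicate(n_rep, shapiro.test(rnorm(length(As)))$statistic)

# Share of null W values at or below the observed W
p_sim <- mean(w_null <= obs$statistic)

print(c(simulated = p_sim, reported = unname(obs$p.value)))
```

A large share, as here, means the observed W is unremarkable for normal data of this size; it does not show that the data are normal.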
>
(Personally, I'm not too happy about these normality tests. They tend
to lack power in small samples and in large samples they often reject
distributions which are perfectly adequate for normal-theory
analysis. Learning to evaluate a QQ plot seems a better idea.)
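The sample-size point can be illustrated with a small simulation. The sketch below draws from a t distribution with 10 degrees of freedom, chosen here purely as an example of a mildly non-normal distribution, and records how often shapiro.test() rejects at the 5% level for n = 12 and n = 2000. The distribution, sample sizes, replicate count, and the helper name reject_rate are all assumptions made for illustration.

```r
set.seed(1)  # arbitrary seed, only for reproducibility

# Hypothetical helper: fraction of simulated t(10) samples of size n
# for which the Shapiro-Wilk test rejects normality at level alpha
reject_rate <- function(n, n_rep = 500, alpha = 0.05) {
  mean(replicate(n_rep, shapiro.test(rt(n, df = 10))$p.value < alpha))
}

rate_small <- reject_rate(12)    # sample size as in the arsenic data
rate_large <- reject_rate(2000)  # large sample, same distribution

print(c(n12 = rate_small, n2000 = rate_large))
```

In runs of this kind the small-sample rejection rate typically stays close to the nominal level while the large-sample rate is much higher, even though the underlying distribution is the same in both cases.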
>
> The method description - retrieved using ?shapiro.test() - is a bit
> light on details.
>
There are references therein, though...
>
> --
O__ ---- Peter Dalgaard Øster Farimagsgade 5, Entr.B
c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
(*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard@biostat.ku.dk) FAX: (+45) 35327907
