From: Frank E Harrell Jr <f.harrell_at_vanderbilt.edu>

Date: Fri, 25 May 2007 17:31:41 -0500

> I beg to differ Joseph. I have had many datasets in which the CLT was
> of no use whatsoever, i.e., where bootstrap confidence limits were
> asymmetric because the data were so skewed, and where symmetric
> normality-based confidence intervals had bad coverage in both tails
> (though correct on the average). I see this the opposite way:
> nonparametric tests work fine if normality holds.
>
> Note that the CLT helps with type I error but not so much with type II
> error.
>
> Frank
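The tail-coverage point in the quoted reply above can be checked with a small simulation. What follows is only a sketch under stated assumptions, not anything from the original thread: the lognormal(0, 1) population, the sample size of 20, and every function name are illustrative choices. It counts how often a symmetric t-based 95% interval lies entirely below or entirely above the true mean (each should happen about 2.5% of the time if the normal approximation held), and it also computes a percentile bootstrap interval, which is free to be asymmetric around the sample mean.

```python
import math
import random
import statistics

N = 20                # assumed sample size for this illustration
T_CRIT_DF19 = 2.093   # two-sided 95% critical value of Student's t with N - 1 = 19 df


def t_interval(sample):
    """Symmetric normal-theory 95% interval for the mean of a sample of size N."""
    centre = statistics.fmean(sample)
    half_width = T_CRIT_DF19 * statistics.stdev(sample) / math.sqrt(len(sample))
    return centre - half_width, centre + half_width


def bootstrap_percentile_interval(sample, rng, n_boot=2000, level=0.95):
    """Percentile bootstrap interval for the mean; not forced to be symmetric."""
    n = len(sample)
    boot_means = sorted(
        statistics.fmean(rng.choices(sample, k=n)) for _ in range(n_boot)
    )
    lower_index = int(n_boot * (1.0 - level) / 2.0)
    upper_index = n_boot - 1 - lower_index
    return boot_means[lower_index], boot_means[upper_index]


def tail_miss_rates(n_sim=4000, seed=1):
    """Return the fractions of t intervals lying entirely below / above the true mean."""
    rng = random.Random(seed)
    true_mean = math.exp(0.5)  # mean of a lognormal(0, 1) variable
    below = above = 0
    for _ in range(n_sim):
        sample = [rng.lognormvariate(0.0, 1.0) for _ in range(N)]
        lower, upper = t_interval(sample)
        if upper < true_mean:
            below += 1   # whole interval falls short of the truth
        elif lower > true_mean:
            above += 1   # whole interval overshoots the truth
    return below / n_sim, above / n_sim


if __name__ == "__main__":
    below, above = tail_miss_rates()
    print(f"t interval entirely below the true mean: {below:.3f}")
    print(f"t interval entirely above the true mean: {above:.3f}")

    rng = random.Random(2)
    sample = [rng.lognormvariate(0.0, 1.0) for _ in range(N)]
    lower, upper = bootstrap_percentile_interval(sample, rng)
    centre = statistics.fmean(sample)
    print(f"bootstrap limits: {centre - lower:.3f} below and "
          f"{upper - centre:.3f} above the sample mean")
```

With right-skewed data the two miss rates typically come out far from equal, which is the kind of one-sided failure that an overall coverage figure can hide.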

> --
> Frank E Harrell Jr   Professor and Chair           School of Medicine
>                      Department of Biostatistics   Vanderbilt University
>
> ______________________________________________
> R-help_at_stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

Cody_Hamilton_at_Edwards.com wrote:

> Following up on Frank's thought, why is it that parametric tests are so
> much more popular than their non-parametric counterparts? As
> non-parametric tests require fewer assumptions, why aren't they the
> default? The relative efficiency of the Wilcoxon test as compared to the
> t-test is 0.955, and yet I still see t-tests in the medical literature all
> the time. Granted, the Wilcoxon still requires the assumption of symmetry
> (I'm curious as to why the Wilcoxon is often used when asymmetry is
> suspected, since the Wilcoxon assumes symmetry), but that's less stringent
> than requiring normally distributed data. In a similar vein, one usually
> sees the mean and standard deviation reported as summary statistics for a
> continuous variable - these are not very informative unless you assume the
> variable is normally distributed. However, clinicians often insist that I
> include these figures in reports.
>
> Cody Hamilton, PhD
> Edwards Lifesciences
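The 0.955 figure quoted above is the asymptotic (Pitman) relative efficiency of the Wilcoxon test to the t-test when the data really are normal, where it equals 3/pi. For a location shift with density f and variance sigma^2 the general expression is 12 * sigma^2 * (integral of f(x)^2 dx)^2. The sketch below evaluates that expression numerically; the function names, integration limits, and the Laplace comparison are illustrative assumptions added here, not part of the thread.

```python
import math


def normal_pdf(x):
    """Standard normal density (variance 1)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def laplace_pdf(x):
    """Laplace (double exponential) density with scale 1 (variance 2)."""
    return 0.5 * math.exp(-abs(x))


def integral_of_squared_density(pdf, lo, hi, steps=20000):
    """Trapezoidal approximation of the integral of pdf(x)**2 over [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.5 * (pdf(lo) ** 2 + pdf(hi) ** 2)
    for i in range(1, steps):
        total += pdf(lo + i * h) ** 2
    return total * h


def wilcoxon_vs_t_are(pdf, variance, lo, hi):
    """Pitman ARE of Wilcoxon relative to t: 12 * variance * (integral of f^2)^2."""
    return 12.0 * variance * integral_of_squared_density(pdf, lo, hi) ** 2


if __name__ == "__main__":
    # Under normality the value is 3/pi, roughly 0.955.
    print("normal :", round(wilcoxon_vs_t_are(normal_pdf, 1.0, -10.0, 10.0), 4))
    # For the heavier-tailed Laplace distribution the value is 1.5.
    print("Laplace:", round(wilcoxon_vs_t_are(laplace_pdf, 2.0, -40.0, 40.0), 4))
```

So the Wilcoxon test gives up about 5% efficiency in the one case where the t-test is optimal, and it can be more efficient than the t-test when the tails are heavier than normal.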

Well said, Cody. I just want to add that the Wilcoxon test does not assume symmetry if you are interested in testing for stochastic ordering and not just for a mean.

Frank
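One way to see the stochastic-ordering reading: the Wilcoxon-Mann-Whitney U statistic divided by the number of (x, y) pairs estimates P(X < Y) + 0.5 * P(X = Y), and that quantity is meaningful whether or not the distributions are symmetric. The sketch below is an illustration only; the two exponential populations, the sample sizes, and the function name are assumptions made for the example.

```python
import random


def concordance_probability(x, y):
    """Mann-Whitney U / (m * n): estimates P(X < Y) + 0.5 * P(X = Y)."""
    wins = 0.0
    for xi in x:
        for yj in y:
            if xi < yj:
                wins += 1.0
            elif xi == yj:
                wins += 0.5   # ties count as half a win
    return wins / (len(x) * len(y))


if __name__ == "__main__":
    rng = random.Random(4)
    # Two strongly right-skewed samples: exponential with mean 1 and with mean 2.
    x = [rng.expovariate(1.0) for _ in range(500)]
    y = [rng.expovariate(0.5) for _ in range(500)]
    # For these two populations the true value of P(X < Y) is 2/3.
    print(f"estimated P(X < Y): {concordance_probability(x, y):.3f}")
```

A value of 0.5 corresponds to no tendency for one group to exceed the other, which is the null hypothesis the rank test addresses under this interpretation.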

> Frank E Harrell Jr <f.harrell_at_vanderbilt.edu>
> Sent by: r-help-bounces_at_stat.math.ethz.ch
> 05/25/2007 02:42 PM
>
> To: "Lucke, Joseph F" <Joseph.F.Lucke_at_uth.tmc.edu>
> cc: r-help <r-help_at_stat.math.ethz.ch>
> Subject: Re: [R] normality tests [Broadcast]
>
> Lucke, Joseph F wrote:

>> Most standard tests, such as t-tests and ANOVA, are fairly resistant to
>> non-normality for significance testing. It's the sample means that have
>> to be normal, not the data. The CLT kicks in fairly quickly. Testing
>> for normality prior to choosing a test statistic is generally not a good
>> idea.

>> -----Original Message-----
>> From: r-help-bounces_at_stat.math.ethz.ch
>> [mailto:r-help-bounces_at_stat.math.ethz.ch] On Behalf Of Liaw, Andy
>> Sent: Friday, May 25, 2007 12:04 PM
>> To: gatemaze_at_gmail.com; Frank E Harrell Jr
>> Cc: r-help
>> Subject: Re: [R] normality tests [Broadcast]
>>
>> From: gatemaze_at_gmail.com
>>> On 25/05/07, Frank E Harrell Jr <f.harrell_at_vanderbilt.edu> wrote:
>>>> gatemaze_at_gmail.com wrote:
>>>>> Hi all,
>>>>>
>>>>> apologies for seeking advice on a general stats question. I've run
>>>>> normality tests using 8 different methods:
>>>>> - Lilliefors
>>>>> - Shapiro-Wilk
>>>>> - Robust Jarque Bera
>>>>> - Jarque Bera
>>>>> - Anderson-Darling
>>>>> - Pearson chi-square
>>>>> - Cramer-von Mises
>>>>> - Shapiro-Francia
>>>>>
>>>>> All show that the null hypothesis that the data come from a normal
>>>>> distro cannot be rejected. Great. However, I don't think it looks nice
>>>>> to report the values of 8 different tests on a report. One note is
>>>>> that my sample size is really tiny (less than 20 independent cases).
>>>>> Without wanting to start a flame war, are there any advices of which
>>>>> one/ones would be more appropriate and should be reported (along with
>>>>> a Q-Q plot). Thank you.
>>>>>
>>>>> Regards,
>>>>>
>>>> Wow - I have so many concerns with that approach that it's hard to know
>>>> where to begin. But first of all, why care about normality? Why not
>>>> use distribution-free methods?
>>>>
>>>> You should examine the power of the tests for n=20. You'll probably
>>>> find it's not good enough to reach a reliable conclusion.
>>>
>>> And wouldn't it be even worse if I used non-parametric tests?
>>
>> I believe what Frank meant was that it's probably better to use a
>> distribution-free procedure to do the real test of interest (if there is
>> one) instead of testing for normality, and then use a test that assumes
>> normality.
>>
>> I guess the question is, what exactly do you want to do with the outcome
>> of the normality tests? If those are going to be used as basis for
>> deciding which test(s) to do next, then I concur with Frank's
>> reservation.
>>
>> Generally speaking, I do not find goodness-of-fit for distributions very
>> useful, mostly for the reason that failure to reject the null is no
>> evidence in favor of the null. It's difficult for me to imagine why
>> "there's insufficient evidence to show that the data did not come from a
>> normal distribution" would be interesting.
>>
>> Andy
>>
>>>> Frank
>>>
>>> --
>>> yianni
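The advice in the quoted exchange to "examine the power of the tests for n=20" can be followed with a short simulation. The sketch below is an assumption-laden illustration rather than what any poster actually ran: it uses only the Jarque-Bera statistic (one of the eight tests listed), replaces the asymptotic chi-square cutoff with a Monte Carlo critical value simulated under normality, and picks Student's t with 5 degrees of freedom as a hypothetical non-normal alternative. All function names are invented for the example.

```python
import math
import random


def jarque_bera(sample):
    """Jarque-Bera statistic: n/6 * (skewness^2 + (kurtosis - 3)^2 / 4)."""
    n = len(sample)
    mean = sum(sample) / n
    m2 = sum((v - mean) ** 2 for v in sample) / n
    m3 = sum((v - mean) ** 3 for v in sample) / n
    m4 = sum((v - mean) ** 4 for v in sample) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    return n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)


def draw_normal(rng):
    """One draw from the standard normal distribution."""
    return rng.gauss(0.0, 1.0)


def draw_t5(rng):
    """One draw from Student's t with 5 df: normal / sqrt(chi-square / df)."""
    df = 5
    chi_square = rng.gammavariate(df / 2.0, 2.0)
    return rng.gauss(0.0, 1.0) / math.sqrt(chi_square / df)


def simulate_statistics(draw, n, n_sim, rng):
    """Jarque-Bera statistics for n_sim independent samples of size n."""
    return [jarque_bera([draw(rng) for _ in range(n)]) for _ in range(n_sim)]


def estimated_rejection_rate(draw, n=20, n_sim=2000, alpha=0.05, seed=3):
    """Share of samples from `draw` rejected at level alpha.

    The critical value is the empirical (1 - alpha) quantile of the statistic
    simulated under normality, so the test is calibrated for this small n.
    """
    rng = random.Random(seed)
    null_stats = sorted(simulate_statistics(draw_normal, n, n_sim, rng))
    critical_value = null_stats[int((1.0 - alpha) * n_sim) - 1]
    stats = simulate_statistics(draw, n, n_sim, rng)
    return sum(s > critical_value for s in stats) / n_sim


if __name__ == "__main__":
    print("rejection rate, normal data:", estimated_rejection_rate(draw_normal))
    print("rejection rate, t(5) data  :", estimated_rejection_rate(draw_t5))
```

The second figure is the estimated power against that particular alternative. When it is low, a non-significant normality test says little about whether the data are normal, which is the point made in the thread about failure to reject the null.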

--
Frank E Harrell Jr   Professor and Chair           School of Medicine
                     Department of Biostatistics   Vanderbilt University

Received on Fri 25 May 2007 - 22:45:24 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.

Archive generated by hypermail 2.2.0, at Sat 26 May 2007 - 01:31:21 GMT.

Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-help.
Please read the posting guide before posting to the list.