Re: [R] model simplification using Crawley as a guide

From: Frank E Harrell Jr <>
Date: Wed, 11 Jun 2008 17:53:06 -0500

Ben Bolker wrote:
> Lucke, Joseph F <Joseph.F.Lucke <at>> writes:

>> And to follow FH and HW
>> What level of significance are you using? .05 is excessively liberal.
>> Are you adjusting your p-values for the number of possible models? Do
>> you realize the p-values for dropping a term, being selected as the
>> maximum of a set of p-values, do not follow their usual distributions?
>> How are you compensating for sample size, as a p-value's being
>> significant is a function of sample size?  How are you compensating for
>> the fact that the current model choice is dependent on the previous
>> model choices? How do you know your tree of model choices is the optimal
>> one?  Have you considered cross-validation?  Are you looking for a model
>> that truly describes a phenomenon or a predictive model that can be used
>> for practical purposes?

> Ouch. While Frank Harrell and Joseph Lucke are raising
> serious issues about model selection, maybe we could keep in mind that
> we don't want to scare off all the students who ever try to use R
> to figure out basic statistics. I would follow Peter Dalgaard's advice
> (about "drop1") and Hadley Wickham's (about graphical diagnostics),
> and if possible bring up the other issues about
> model selection with others around you -- if you're a student, ask
> your prof. or someone in the stats department. It can be tough
> to try to do things right if those around you are still
> doing them wrong ... If you tell us what field you're in we
> may be able to point you to more subject-specific references
> (e.g. Whittingham, Mark J., Philip A. Stephens, Richard B. Bradbury, and Robert
> P. Freckleton. 2006. Why do we still use stepwise modelling in ecology and
> behaviour? Journal of Animal Ecology 75, no. 5: 1182-1189)
> Ben Bolker

Good points, Ben. For now I'd recommend simply that the allergic reaction to insignificant statistical tests be treated with an antihistamine :-)


> ______________________________________________
> mailing list
> PLEASE do read the posting guide
> and provide commented, minimal, self-contained, reproducible code.

Frank E Harrell Jr   Professor and Chair           School of Medicine
                      Department of Biostatistics   Vanderbilt University

Received on Wed 11 Jun 2008 - 23:01:20 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Thu 12 Jun 2008 - 03:30:53 GMT.
