Re: [R] can I do this with R?

From: Andrew Robinson
Date: Thu, 29 May 2008 09:08:55 +1000

On Wed, May 28, 2008 at 03:47:49PM -0700, Xiaohui Chen wrote:
> Frank E Harrell Jr wrote:
> >Xiaohui Chen wrote:
> >>The step or stepAIC functions do the job. You can opt to use BIC by
> >>changing the penalty multiplier.
> >>
> >>I think AIC and BIC are not limited to comparing two pre-defined
> >>models; they can also be used as model search criteria. You could
> >>enumerate the information criteria for all possible models if the
> >>size of the full model is relatively small. But this does not generally
> >>scale to practical high-dimensional applications. Hence, it is often
> >>only possible to find a 'best' model at a local optimum, e.g.
> >>measured by AIC/BIC.
> >
> >Sure you can use them that way, and they may perform better than other
> >measures, but the resulting model will be highly biased (regression
> >coefficients biased away from zero). AIC and BIC were not designed to
> >be used in this fashion originally. Optimizing AIC or BIC will not
> >produce well-calibrated models as does penalizing a large model.
> >
> Sure, I agree with this point. AIC is used to correct the bias from the
> estimations which minimize the KL distance of true model, provided the
> assumed model family contains the true model. BIC is designed for
> approximating the model marginal likelihood. Those are all
> post-selection estimating methods. For simultaneous variable selection
> and estimation, there are better penalizations like L1 penalty, which is
> much better than AIC/BIC in terms of consistency.
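
For readers following the quoted point about switching from AIC to BIC by changing the penalty multiplier: in R, step() takes an argument k, where k = 2 gives AIC and k = log(n) gives BIC. The sketch below shows only the arithmetic, in stand-alone Python. The two candidate fits, their log-likelihoods, and the sample size are made-up numbers for illustration, not results from any dataset.

```python
import math

def information_criterion(log_likelihood, n_params, k=2.0):
    """Return -2*logLik + k*n_params (k = 2 is AIC, k = log(n) is BIC)."""
    return -2.0 * log_likelihood + k * n_params

def best_model(candidates, k):
    """Pick the label of the candidate with the smallest criterion value."""
    return min(candidates,
               key=lambda m: information_criterion(m[1], m[2], k))[0]

# Hypothetical fits: (label, maximized log-likelihood, number of parameters)
n_obs = 100
candidates = [("small", -152.0, 3), ("large", -148.5, 6)]

aic_choice = best_model(candidates, k=2.0)              # AIC penalty
bic_choice = best_model(candidates, k=math.log(n_obs))  # BIC penalty

print("AIC prefers:", aic_choice)
print("BIC prefers:", bic_choice)
```

With these invented numbers the heavier BIC penalty favours the smaller model while AIC favours the larger one, which is the only point the sketch is meant to make.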


Tibshirani (1996) suggests that the quality of the L1 penalty depends on the structure of the dataset. As I recall, subset selection was preferred for finding a small number of large effects, lasso (L1) for finding a small to moderate number of moderate-sized effects, and ridge (L2) for many small effects.
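
One way to see why the three methods suit different effect structures is the orthonormal-design case, where, as I understand it, each method reduces to a simple rule applied to the least-squares estimates: subset selection is a hard threshold, the lasso is a soft threshold, and ridge is a proportional shrinkage. The Python sketch below is a minimal illustration under that assumption. The coefficient values and the single tuning value lam are hypothetical, and using the same lam for all three rules is a simplification for display only.

```python
def subset_rule(b, lam):
    """Hard threshold: keep the estimate unchanged if it is large, else drop it."""
    return b if abs(b) > lam else 0.0

def lasso_rule(b, lam):
    """Soft threshold: shrink toward zero by lam, setting small estimates to zero."""
    shrunk = abs(b) - lam
    if shrunk <= 0:
        return 0.0
    return shrunk if b > 0 else -shrunk

def ridge_rule(b, lam):
    """Proportional shrinkage: every estimate is scaled down, none is set to zero."""
    return b / (1.0 + lam)

# Hypothetical least-squares estimates under an orthonormal design
ols = [4.0, 1.5, 0.5, -0.25]
lam = 1.0  # illustrative tuning value, not a recommendation

subset = [subset_rule(b, lam) for b in ols]
lasso = [lasso_rule(b, lam) for b in ols]
ridge = [ridge_rule(b, lam) for b in ols]

print("subset:", subset)
print("lasso: ", lasso)
print("ridge: ", ridge)
```

Subset selection leaves large effects untouched and removes the rest, the lasso both shrinks and removes, and ridge keeps every coefficient but shrinks all of them.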

Can you provide any references to more up-to-date simulations that you would recommend?



Andrew Robinson  
Department of Mathematics and Statistics            Tel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia         Fax: +61-3-8344-4599

Received on Thu 29 May 2008 - 02:03:02 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
