Re: [R] Robust M-Estimator Comparison

From: Berton Gunter <gunter.berton_at_gene.com>
Date: Wed 24 Aug 2005 - 07:53:37 EST


Johannes:

WARNING: I'm no expert. Caveat emptor!

There is a huge literature on robust estimation, as you'll find when you Google it. One natural place to start might be the relevant sections of V&R's MASS (Modern Applied Statistics with S) and the references therein. An old classic, which may no longer be in print, is Hoaglin, Mosteller, and Tukey: Understanding Robust and Exploratory Data Analysis (see the chapter on robust estimation).

It is not clear to me that robust estimation will solve your problem of lots of one-sided outliers -- that sounds like a skewed distribution lurking in there somewhere.
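
A quick toy illustration (mine, not from any reference -- the sample sizes and the size of the contamination are arbitrary) of why one-sided outliers are awkward: the mean is pulled far toward the outliers, while the median and the two Huber estimates in MASS shift only a little, and by different amounts.

library(MASS)

set.seed(1)
## 90 "good" observations plus 10 large outliers, all on the high side
x <- c(rnorm(90), rnorm(10, mean = 8))

mean(x)       # dragged well up by the one-sided outliers
median(x)     # shifts only slightly
huber(x)$mu   # Huber M-estimate of location, MAD scale
hubers(x)$mu  # Huber "Proposal 2": location and scale estimated jointly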

One thing to be careful about: there is "robustness of efficiency" and there is "outlier resistance." The first is about maintaining estimation efficiency in the face of "contamination" by a usually small percentage of "outliers" (whatever THEY are); the second is about maintaining estimation accuracy in the face of a possibly large proportion of outliers. The classic example of the latter for estimating location is the median; an M-estimator (e.g. the iterated biweight) is an exemplar of the former. As V&R and others make clear, the two are not mutually exclusive, but they do tend to pull in somewhat different directions.
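
Here is a rough sketch of that trade-off (my own toy simulation, with arbitrary choices of sample size and contamination): on clean normal data the median pays an efficiency price relative to the mean and the Huber estimate, while under heavy one-sided contamination it is the mean that falls apart.

library(MASS)

set.seed(2)
est <- function(x) c(mean = mean(x), median = median(x), huber = huber(x)$mu)

clean <- replicate(2000, est(rnorm(50)))                           # no contamination
dirty <- replicate(2000, est(c(rnorm(40), rnorm(10, mean = 10))))  # 20% high outliers

apply(clean, 1, var)   # sampling variance on clean data: median is least efficient
apply(dirty, 1, mean)  # average estimate under contamination: mean is pulled hardest

In this particular setup the Huber estimate also drifts somewhat more than the median, which is essentially Ripley's point in the quote below.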

Robust estimation seems to have lost its cachet these days, maybe because it is difficult to do in the nonlinear models that arise from the complex covariance structures people now want to use (e.g., mixed models, empirical Bayes). I continue to find it an essential tool in any routine regression work that I do, however. Seems more in keeping with entropy.
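
For the regression case, a minimal sketch (again just my own illustration, not a definitive comparison) of the sort of thing Johannes asks about below, using MASS::rlm with the three psi functions he mentions:

library(MASS)

set.seed(42)
x <- 1:50
y <- 2 + 0.5 * x + rnorm(50)
y[c(5, 15, 25)] <- y[c(5, 15, 25)] + 20   # a few large one-sided outliers

coef(lm(y ~ x))                       # least squares, pulled by the outliers
coef(rlm(y ~ x, psi = psi.huber))     # monotone (Huber) psi
coef(rlm(y ~ x, psi = psi.hampel))    # redescending (Hampel) psi
coef(rlm(y ~ x, psi = psi.bisquare))  # Tukey's bisquare, also redescending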

Cheers,

"The business of the statistician is to catalyze the scientific learning process." - George E. P. Box    

> -----Original Message-----
> From: r-help-bounces@stat.math.ethz.ch
> [mailto:r-help-bounces@stat.math.ethz.ch] On Behalf Of
> Johannes Graumann
> Sent: Tuesday, August 23, 2005 2:33 PM
> To: r-help@stat.math.ethz.ch
> Subject: [R] Robust M-Estimator Comparison
>
> Hello,
>
> I'm learning about robust M-estimators right now and had settled on the
> "Huber Proposal 2" as implemented in MASS, but further reading made clear
> that at least two further weighting functions (Hampel, Tukey bisquare)
> exist. In a post from B.D. Ripley going back to 1999 I found the following
> quote:
>
> >> 2) Would huber() give me results that are similar (i.e., close enough)?
> >
> > Not if you have lots of extreme outliers on just one side.
>
> Since this message seems to imply that the nature of the data described
> (and not just personal preference) should influence the choice among the
> above M-estimators, I've been scouting around for a direct comparison
> among them - to no avail.
>
> Can anybody here point me to such a comparison (novice suitability would
> be more than welcome ;0)?
>
> Thanks for any hint,
>
> Joh
>
> --
> Johannes Graumann, Dipl. Biol.
> Graduate Student, Deshaies Lab
> Department of Biology
> CALTECH, M/C 156-29
> 1200 E. California Blvd.
> Pasadena, CA 91125, USA
> Tel.: ++1 (626) 395 6602
> Fax.: ++1 (626) 395 5739
>
> ______________________________________________
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide!
> http://www.R-project.org/posting-guide.html
>


