RE: [R] grubbs.test

From: Berton Gunter <gunter.berton_at_gene.com>
Date: Fri 15 Apr 2005 - 03:44:18 EST


The Grubbs test is one of many old (1950s-'70s) classical tests for outliers; as described at the link, it targets a single outlier in a univariate, approximately normal sample: http://www.itl.nist.gov/div898/handbook/eda/section3/eda35h.htm
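
For concreteness, here is a minimal base-R sketch of the two-sided Grubbs statistic and its critical value, following the formula on the NIST page linked above. The helper name grubbs_sketch and the data are made up for illustration; this is not the grubbs.test function asked about in the thread.

```r
# Two-sided Grubbs statistic for one suspected outlier, base R only.
# grubbs_sketch is a hypothetical helper written for this illustration.
grubbs_sketch <- function(x, alpha = 0.05) {
  n <- length(x)
  stopifnot(n >= 3)
  dev <- abs(x - mean(x))
  g <- max(dev) / sd(x)                      # largest standardized deviation
  # Critical value from the t distribution with n - 2 degrees of freedom
  t_crit <- qt(1 - alpha / (2 * n), df = n - 2)
  g_crit <- (n - 1) / sqrt(n) * sqrt(t_crit^2 / (n - 2 + t_crit^2))
  list(statistic = g,
       critical  = g_crit,
       suspect   = x[which.max(dev)],        # most extreme observation
       reject    = g > g_crit)
}

x <- c(9.8, 10.1, 10.0, 9.9, 10.2, 14.0)     # made-up sample, n = 6
res <- grubbs_sketch(x)
res
```

Note that the statistic can never exceed (n - 1)/sqrt(n), which is about 2.04 for n = 6, so with samples this small there is very little room between "ordinary" and "significant".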

I think it fair to say that such outlier detection methods were long ago found to have poor statistical properties and were supplanted, at least in the more straightforward linear models contexts, by robust/resistant techniques (computationally much more demanding, but who cares these days!?). rlm() in MASS (the package) is one good implementation of these ideas in R. See MASS (the book by V&R) for a short but informative discussion and further references.
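
As a rough illustration of the robust/resistant approach, the sketch below compares lm() with rlm() from MASS on simulated data containing one gross outlier. The data, seed, and object names are assumptions made up for this example; rlm() is used with its defaults (a Huber M-estimator).

```r
library(MASS)  # provides rlm()

set.seed(1)
x <- 1:20
y <- 2 + 0.5 * x + rnorm(20, sd = 0.3)   # true slope is 0.5
y[20] <- y[20] + 15                      # contaminate one observation

fit_ls  <- lm(y ~ x)    # ordinary least squares
fit_rob <- rlm(y ~ x)   # robust fit, default Huber psi

coef(fit_ls)            # slope is pulled toward the outlier
coef(fit_rob)           # slope stays close to 0.5

# Final IWLS weights: the contaminated point is downweighted, not deleted
round(fit_rob$w, 2)
```

The design point is that nothing is tested and removed: the suspect observation simply gets a small weight, which sidesteps the "test, delete, retest" cycle.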

I should add that the use of robust/resistant (R/R) techniques exposes many fundamental issues about estimation vs inference, statistical modeling strategies, etc. (i.e., the issues exist, but we statisticians get nervous talking publicly about them). The problem is that important estimation and inference issues for R/R estimators remain to be worked out -- if, indeed, it makes sense to think about things this way at all -- for example, for various kinds of mixed effects models, "statistical learning theory" ensemble methods, etc. The difficulty, as always, is what the heck one means by "outlier" in these contexts. Seems to be like pornography -- "I know it when I see it."*

Contrary views cheerfully solicited!

Cheers to all,

*Sorry -- that's a reference to a famous quote of Justice Potter Stewart, an American Supreme Court Justice.
http://www.michaelariens.com/ConLaw/justices/stewart.htm  

> -----Original Message-----
> From: r-help-bounces@stat.math.ethz.ch
> [mailto:r-help-bounces@stat.math.ethz.ch] On Behalf Of vito muggeo
> Sent: Thursday, April 14, 2005 7:05 AM
> To: Dave Evens
> Cc: r-help@stat.math.ethz.ch
> Subject: Re: [R] grubbs.test
>
> Dear Dave,
> I do not know grubbs.test (is it a function, and where can I find it?),
> and n=6 data points are probably very few.
>
> Having said that, what do you mean by "outlier"?
> If you mean deviation from the estimated mean (of previous data), you
> might have a look at the strucchange package (sorry, but right now I do
> not remember the exact name of the function).
>
> best,
> vito
>
>
> Dave Evens wrote:
> > Dear All,
> >
> > I have small samples of data (between 6 and 15) for
> > numerous time series points. I am assuming the data
> > for each time point are normally distributed. The
> > problem is that the data arrive sporadically and I
> > would like to detect the number of outliers after I
> > have six data points for any time period. Essentially,
> > I would like to wait until I have 6 data points, then
> > test whether there are any outliers. If so, remove the
> > outliers, and wait until I have at least 6 data points
> > again, or until the sample size increases, and test
> > again whether there are any outliers. This process is
> > repeated until there are no more data points to add to
> > the sample.
> >
> > Is it valid to use the grubbs.test in this way?
> >
> > If not, are there any tests out there that might be
> > appropriate for this situation? Rosner's test requires
> > that I have at least 25 data points, which I don't
> > have.
> >
> > Thank you in advance for any help.
> >
> > Dave
> >
> > ______________________________________________
> > R-help@stat.math.ethz.ch mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
> >
>
> --
> ====================================
> Vito M.R. Muggeo
> Dip.to Sc Statist e Matem `Vianelli'
> Università di Palermo
> viale delle Scienze, edificio 13
> 90121 Palermo - ITALY
> tel: 091 6626240
> fax: 091 485726/485612
>



Received on Fri Apr 15 03:49:17 2005

This archive was generated by hypermail 2.1.8 : Fri 03 Mar 2006 - 03:31:11 EST