Re: [R] Comparison of linear models

From: Andrew Robinson <A.Robinson_at_ms.unimelb.edu.au>
Date: Sat 29 Jul 2006 - 07:15:30 EST

I have one addition to Rolf's thorough advice: if your goal is to try to find evidence that the two procedures are equivalent then the tests that you should consider are called equivalence tests. These do not come from lm.

The most popular test is TOST, the two one-sided tests procedure, and it doesn't really require a package to implement. Briefly, the alpha=0.05 test might proceed as follows.

  1. You establish a subjective interval around the value that you wish to test. In the case of trying to assess the evidence that two population means for measured heights are the same, for example, you might say that the subjective interval for the difference between the two means is 0, +/- 2 cm. The magnitude of the interval depends on what you think is an important deviation.
  2. Compute two one-sided 1-alpha confidence intervals for the difference between the two means, one upper, and one lower. Take the intersection of the two intervals. (NB in this example it is mathematically equivalent to a single, two-sided 1-2*alpha confidence interval but this is only true in simple cases).
  3. If the intersection lies entirely within the subjective interval established in step 1, then you reject the null hypothesis that the population means differ by at least the chosen margin, and conclude equivalence.
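
The three steps above can be sketched in base R roughly as follows. This is only an illustration under stated assumptions: the data are simulated, the function name tost_means and the argument name epsilon are made up for this example, and the Welch t interval from t.test() stands in for whatever interval suits your design.

```r
# Hypothetical helper (not from any package): TOST for the difference of two means.
# epsilon is the half-width of the subjective interval from step 1.
tost_means <- function(x, y, epsilon, alpha = 0.05) {
  # Step 2: two one-sided (1 - alpha) confidence intervals for mean(x) - mean(y).
  lower <- t.test(x, y, alternative = "greater",
                  conf.level = 1 - alpha)$conf.int[1]
  upper <- t.test(x, y, alternative = "less",
                  conf.level = 1 - alpha)$conf.int[2]
  # Step 3: the intersection (lower, upper) must lie inside (-epsilon, epsilon).
  list(lower = lower, upper = upper,
       equivalent = (lower > -epsilon) && (upper < epsilon))
}

# Simulated heights in cm for two procedures (made-up data).
set.seed(1)
x <- rnorm(50, mean = 170, sd = 5)
y <- rnorm(50, mean = 170.5, sd = 5)

res <- tost_means(x, y, epsilon = 2, alpha = 0.05)

# In this simple case the intersection coincides with the two-sided
# (1 - 2*alpha) interval, as noted in step 2.
two.sided <- t.test(x, y, conf.level = 1 - 2 * 0.05)$conf.int
print(res)
print(two.sided)
```

Here epsilon = 2 corresponds to the 0, +/- 2 cm interval in step 1; whether equivalent comes out TRUE depends entirely on the data and on the margin you chose.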

There is not very much literature on the question. The originating articles are:

@Article{schuirmann-1981,
author = {D. L. Schuirmann},
title ={On hypothesis testing to determine if the mean of a normal distribution is contained in a known interval},
journal ={Biometrics},
year = 1981,
volume = 37,
pages =617
}

@Article{westlake-1981,
author = {W. J. Westlake},
title ={Response to {T.B.L. Kirkwood}: bioequivalence testing--a need to rethink},
journal ={Biometrics},
year = 1981,
volume = 37,
pages ={589--594}
}  

I also recommend:

@Article{BH96:equivalence,

  author =       {R. L. Berger and J. C. Hsu},
  title =        {Bioequivalence trials, intersection-union tests and
  equivalence confidence sets},
  journal =      {Statistical Science},
  year =         1996,
  volume =       11,
  number =       4,
  pages =        {283--319}

}

Finally, there is a nice recent book:

@Book{W03:equivalence,

  author =       {S. Wellek},
  title =        {Testing statistical hypotheses of equivalence},
  publisher =    {Chapman and Hall/CRC},
  year =         2003

}

There is also an equivalence package on CRAN, which has some other tests, graphical procedures, and references to some expository articles (mine and others).

Cheers

Andrew

On Fri, Jul 28, 2006 at 09:10:21AM -0300, Rolf Turner wrote:
>
> Fabien Lebugle wrote:
>
> > I am a master student. I am currently doing an internship. I would
> > like to get some advice about the following issue: I have 2 data
> > sets, both containing the same variables, but the data were measured
> > using two different procedures. I want to know if the two procedures
> > are equivalent. Up to now, I have built one linear model for each
> > dataset. The two models have the same form. I would like to compare
> > these two models: are they identical? Are they different? By how
> > much?
> >
> > Please, could you tell me which R procedure I should use? I have been
> > searching the list archive, but without success...
>
> This is not a question of ``which R procedure'' but rather a
> question of understanding a bit about statistics and linear
> models. You say you are a ``master's student''; I hope you
> are not a master's student in *statistics*, given that you
> lack this (very) basic knowledge! If you are a student in
> some other discipline, I guess you may be forgiven.
>
> The ``R procedure'' that you need to use is just lm()!
>
> Briefly, what you need to do is combine your two data
> sets into a *single* data set (using rbind should work),
> add in a grouping variable (a factor with two levels,
> one for each measure procedure) e.g.
>
> my.data$gp <- factor(rep(c(1,2),c(n1,n2)))
>
> where n1 and n2 are the sample sizes for procedure 1 and
> procedure 2 respectively.
>
> Then fit linear models with formulae involving the
> grouping factor (``gp'') as well as the other predictors,
> and test for the ``significance'' of the terms in
> the model that contain ``gp''. You might start with
>
> fit <- lm(y~.*gp,data=my.data)
> anova(fit)
>
> where ``y'' is (of course) your response.
>
> You ought to study up on the underlying ideas of inference
> for linear models, and the nature of ``factors''. John Fox's
> book ``Applied Regression Analysis, Linear Models, and
> Related Methods'' might be a reasonable place to start.
>
> Bonne chance.
>
> cheers,
>
> Rolf Turner
> rolf@math.unb.ca
>
> ______________________________________________
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
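
For concreteness, Rolf's combined-model approach quoted above might look like the following on made-up data. This is a sketch only: the data frames d1 and d2, the single predictor x, and the reduced model fit0 are assumptions for illustration and are not part of the original question.

```r
# Simulated data for two measurement procedures (made-up values).
set.seed(2)
n1 <- 40
n2 <- 40
d1 <- data.frame(x = runif(n1, 0, 10))
d1$y <- 1 + 2 * d1$x + rnorm(n1)
d2 <- data.frame(x = runif(n2, 0, 10))
d2$y <- 1.2 + 2 * d2$x + rnorm(n2)

# Combine into a single data set and add the grouping factor.
my.data <- rbind(d1, d2)
my.data$gp <- factor(rep(c(1, 2), c(n1, n2)))

# Full model: separate intercept and slope for each procedure.
fit <- lm(y ~ x * gp, data = my.data)
tab <- anova(fit)
print(tab)

# Reduced model: one common line; compare it with the full model
# to test all terms involving gp at once.
fit0 <- lm(y ~ x, data = my.data)
print(anova(fit0, fit))
```

Note that non-significant gp terms here only mean a difference was not detected; they are not by themselves evidence that the procedures are equivalent, which is why I suggest the equivalence tests above.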

-- 
Andrew Robinson  
Department of Mathematics and Statistics            Tel: +61-3-8344-9763
University of Melbourne, VIC 3010 Australia         Fax: +61-3-8344-4599
Email: a.robinson_at_ms.unimelb.edu.au         http://www.ms.unimelb.edu.au

Received on Sat Jul 29 07:41:17 2006

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.1.8, at Sat 29 Jul 2006 - 10:16:56 EST.