From: <Bill.Venables_at_csiro.au>

Date: Wed 06 Apr 2005 - 13:25:58 EST


This is possible if x and z are orthogonal, but in general it doesn't
work, as you have noted. (If it did work, it would almost amount to a way
of inverting general square matrices by working one row at a time, with no
going back...)

It is possible to fit a bivariate regression iteratively using simple linear regression techniques, but it takes a bit more than your two-step process:

1. Regress y on x and take the residuals: ryx <- resid(lm(y ~ x))
2. Regress z on x and take the residuals: rzx <- resid(lm(z ~ x))
3. Regress ryx on rzx: fitz <- lm(ryx ~ rzx)
4. This gives you the estimate of the coefficient on z (what you call b2 below): b2 <- coef(fitz)[2]
5. Regress y - b2*z on x: fitx <- lm(I(y - b2*z) ~ x)

This last step gets you the estimates of b0 and b1.
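The whole sequence can be sketched in R on simulated data (the data, sample size, and true coefficients below are made-up assumptions for illustration; only the lm() calls come from the steps above):

```r
## Simulated data for illustration; the true coefficients (1, 2, 3)
## and the correlation between x and z are made-up assumptions.
set.seed(1)
n <- 100
x <- rnorm(n)
z <- 0.5 * x + rnorm(n)              # deliberately correlated with x
y <- 1 + 2 * x + 3 * z + rnorm(n)

ryx <- resid(lm(y ~ x))              # residuals of y on x
rzx <- resid(lm(z ~ x))              # residuals of z on x
fitz <- lm(ryx ~ rzx)
b2   <- coef(fitz)[2]                # estimate of the coefficient on z
fitx <- lm(I(y - b2 * z) ~ x)        # recovers b0 and b1

## The point estimates agree exactly with the direct bivariate fit:
full <- lm(y ~ x + z)
all.equal(unname(c(coef(fitx), b2)), unname(coef(full)))
```

The agreement of the point estimates is exact, not approximate: partialling x out of both y and z and regressing residuals on residuals reproduces the multiple-regression coefficient on z.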

None of this works for significance tests, though, because going about it this way disguises the degrees of freedom involved. So you can get the right estimates, but the standard errors, t-statistics and residual variances are all somewhat inaccurate (though usually not by much).
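To see the degrees-of-freedom point concretely, here is a sketch on simulated data (none of it from the original post). The residual-on-residual fit reports n - 2 residual degrees of freedom where the full fit correctly uses n - 3, and since the two fits share the same residual sum of squares, the standard error for the z coefficient is off by exactly sqrt((n - 3) / (n - 2)):

```r
## Simulated data; values are illustrative assumptions.
set.seed(1)
n <- 100
x <- rnorm(n)
z <- 0.5 * x + rnorm(n)
y <- 1 + 2 * x + 3 * z + rnorm(n)

fitz <- lm(resid(lm(y ~ x)) ~ resid(lm(z ~ x)))
full <- lm(y ~ x + z)

se_fitz <- summary(fitz)$coefficients[2, "Std. Error"]
se_full <- summary(full)$coefficients["z", "Std. Error"]

## Both fits have the same RSS, but divide by different df,
## so the standard errors differ by a fixed, small factor:
c(se_fitz / se_full, sqrt((n - 3) / (n - 2)))
```

With n = 100 the factor is about 0.995, which is why the discrepancy is "usually not by much"; for small n it matters more.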

If x and z are orthogonal, the (curious-looking) step 2 is not needed.

This kind of idea lies behind some algorithms (e.g. Stevens' algorithm) for fitting very large regressions essentially by iterative processes to avoid constructing a huge model matrix.

Bill Venables

-----Original Message-----

From: r-help-bounces@stat.math.ethz.ch

[mailto:r-help-bounces@stat.math.ethz.ch] On Behalf Of John Sorkin
Sent: Wednesday, 6 April 2005 12:55 PM

To: r-help@stat.math.ethz.ch

Subject: [R] two methods for regression, two different results

Please forgive a straight stats question, and the informal notation.

Let us say we wish to perform a linear regression: y = b0 + b1*x + b2*z

I would think the two methods would give the same p-value and the same beta coefficient for z. They don't. Can someone help me understand why the two methods do not give the same results? Additionally, could someone tell me when one method might be better than the other, i.e. what question does the first method answer, and what question does the second method answer? I have searched a number of textbooks and have not found this question addressed.
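[The code being compared was not quoted in this excerpt. Judging from the reply above, the two methods were presumably the direct joint fit and a two-step fit that regresses y on x and then the residuals on z. A hypothetical reconstruction on simulated data:]

```r
## Hypothetical reconstruction; variable names, data, and coefficients
## are illustrative, not from the original post.
set.seed(1)
n <- 100
x <- rnorm(n)
z <- 0.5 * x + rnorm(n)        # x and z correlated
y <- 1 + 2 * x + 3 * z + rnorm(n)

## Method 1: fit both predictors at once.
b2_joint <- coef(lm(y ~ x + z))["z"]

## Method 2: regress y on x, then regress the residuals on z.
b2_twostep <- coef(lm(resid(lm(y ~ x)) ~ z))[2]

## The two estimates of b2 differ because z is correlated with x.
c(b2_joint, b2_twostep)
```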

Thanks,

John

John Sorkin M.D., Ph.D.

Chief, Biostatistics and Informatics

Baltimore VA Medical Center GRECC and

University of Maryland School of Medicine Claude Pepper OAIC

University of Maryland School of Medicine
Division of Gerontology

Baltimore VA Medical Center

10 North Greene Street

GRECC (BT/18/GR)

Baltimore, MD 21201-1524

410-605-7119

-- NOTE NEW EMAIL ADDRESS:

jsorkin@grecc.umaryland.edu


R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Received on Wed Apr 06 13:30:42 2005

This archive was generated by hypermail 2.1.8 : Fri 03 Mar 2006 - 03:31:02 EST