[R] Can I do regression by steps?

From: rlearner309 <unixunix99_at_gmail.com>
Date: Tue, 08 Jul 2008 05:53:22 -0700 (PDT)

I have seen this type of model in some of my company's projects.

To simplify:
Y is regressed on X1 and X2, but the regression is done in two steps. First, Y is regressed on X1 with an intercept. Then the residuals from the first step are regressed on X2, without a constant. The reason for doing so is that some observations have X1 data but no X2 data, so I guess the person wants to use as much information as possible to get the coefficient for X1, and then use the subset of residuals that have X2 data to capture what is left to be explained by X2.
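
To make the procedure concrete, here is a minimal sketch of the two steps in Python (standard library only). Everything in it is an assumption for illustration: the simulated data, the true coefficients, the rule that every third observation lacks X2, and all variable names. It is not the code from the company projects.

```python
import random

random.seed(1)  # fixed seed so the sketch is repeatable

n = 300
x1 = [random.gauss(0, 1) for _ in range(n)]
# Hypothetical setup: X2 is correlated with X1, and is unobserved (None)
# for every third row.
x2_full = [0.7 * a + random.gauss(0, 1) for a in x1]
x2 = [None if i % 3 == 0 else v for i, v in enumerate(x2_full)]
y = [1 + 2 * a + 3 * b + random.gauss(0, 1) for a, b in zip(x1, x2_full)]

# Step 1: Y on X1 with an intercept, using all n observations.
mx, my = sum(x1) / n, sum(y) / n
b1 = (sum((a - mx) * (c - my) for a, c in zip(x1, y))
      / sum((a - mx) ** 2 for a in x1))
b0 = my - b1 * mx
resid = [c - b0 - b1 * a for a, c in zip(x1, y)]

# Step 2: residuals on X2 without a constant, only where X2 is observed.
pairs = [(r, v) for r, v in zip(resid, x2) if v is not None]
b2 = sum(r * v for r, v in pairs) / sum(v * v for _, v in pairs)

print(f"step 1: intercept={b0:.3f}, b1={b1:.3f} (n={n})")
print(f"step 2: b2={b2:.3f} (n={len(pairs)})")
```

Step 1 uses all 300 rows, while step 2 only sees the rows where X2 is observed, which is the "use as much information as possible" idea described above.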

But my concern is: should we consider the correlation between X1 and X2? If the residuals from the first step are used, then the X1 effect has been removed from Y. What, then, does it really mean to regress those residuals on X2, which is itself correlated with X1? Should X2 be adjusted for X1 too (regress X2 on X1 and use the residuals)?
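
The concern can be checked numerically. The sketch below is only an illustration under stated assumptions: simulated data with X2 fully observed, made-up true coefficients, and hand-written least-squares helpers whose names are hypothetical. It compares three estimates of the X2 coefficient: the two-step one (Y residuals on raw X2, no constant), the one where X2 is also adjusted for X1, and the ordinary joint regression of Y on X1 and X2.

```python
import random

def mean(v):
    return sum(v) / len(v)

def residuals(y, x):
    """Residuals from regressing y on x with an intercept."""
    mx, my = mean(x), mean(y)
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    icpt = my - slope * mx
    return [b - icpt - slope * a for a, b in zip(x, y)]

def slope_no_intercept(y, x):
    """Slope from regressing y on x through the origin."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)

def joint_ols(y, x1, x2):
    """Slopes (b1, b2) from regressing y on x1 and x2 with an intercept."""
    c1 = [a - mean(x1) for a in x1]
    c2 = [a - mean(x2) for a in x2]
    cy = [a - mean(y) for a in y]
    s11 = sum(a * a for a in c1)
    s22 = sum(a * a for a in c2)
    s12 = sum(a * b for a, b in zip(c1, c2))
    s1y = sum(a * b for a, b in zip(c1, cy))
    s2y = sum(a * b for a, b in zip(c2, cy))
    det = s11 * s22 - s12 ** 2
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det

random.seed(2)  # fixed seed so the sketch is repeatable
n = 500
x1 = [random.gauss(0, 1) for _ in range(n)]
x2 = [0.7 * a + random.gauss(0, 1) for a in x1]  # correlated with X1 (assumed)
y = [1 + 2 * a + 3 * b + random.gauss(0, 1) for a, b in zip(x1, x2)]

e_y = residuals(y, x1)    # Y with the X1 effect removed
e_x2 = residuals(x2, x1)  # X2 with the X1 effect removed

b2_two_step = slope_no_intercept(e_y, x2)   # residuals on raw X2
b2_adjusted = slope_no_intercept(e_y, e_x2) # residuals on adjusted X2
b1_joint, b2_joint = joint_ols(y, x1, x2)

print(f"two-step b2: {b2_two_step:.4f}")
print(f"adjusted b2: {b2_adjusted:.4f}")
print(f"joint    b2: {b2_joint:.4f}")
```

With complete data, the adjusted estimate reproduces the joint coefficient for X2 (this is the Frisch-Waugh-Lovell result). The two-step estimate has the same numerator but divides by the raw sum of squares of X2 rather than that of the adjusted X2, so it is pulled toward zero whenever X1 and X2 are correlated. Note that the step-1 coefficient for X1 is not repaired by any of this: it still absorbs the part of the X2 effect that is correlated with X1.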

What if both X1 and X2 are dummy variables? Dummy variables can have a meaningful correlation, too, right?
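
As a small illustration of the dummy-variable case, the sketch below computes the ordinary Pearson correlation between two 0/1 variables and compares it with the phi coefficient from their 2x2 table. The two vectors and all names are made up for the example.

```python
# Two hypothetical dummy variables observed on the same 8 cases.
d1 = [1, 1, 1, 1, 0, 0, 0, 0]
d2 = [1, 1, 1, 0, 1, 0, 0, 0]

def pearson(u, v):
    """Pearson correlation computed from centered sums of products."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / (suu * svv) ** 0.5

# Cell counts of the 2x2 cross-tabulation of d1 by d2.
n11 = sum(1 for a, b in zip(d1, d2) if a == 1 and b == 1)
n10 = sum(1 for a, b in zip(d1, d2) if a == 1 and b == 0)
n01 = sum(1 for a, b in zip(d1, d2) if a == 0 and b == 1)
n00 = sum(1 for a, b in zip(d1, d2) if a == 0 and b == 0)

# Phi coefficient from the 2x2 table.
phi = (n11 * n00 - n10 * n01) / (
    (n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00)
) ** 0.5

r = pearson(d1, d2)
print(r, phi)  # both are 0.5 for these vectors
```

For 0/1 variables the Pearson correlation and the phi coefficient are the same quantity, so two dummies can be correlated in the same sense as two continuous regressors.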

Thanks a lot!

Received on Tue 08 Jul 2008 - 15:38:28 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.