# Re: [R] Can I do regression by steps?

From: rlearner309 <unixunix99_at_gmail.com>
Date: Tue, 08 Jul 2008 15:25:47 -0700 (PDT)

John Sorkin wrote:
>
> Be very careful!
> When a regression is performed in steps, you often will not get the same
> results as you would get from a single multivariable regression. The
> explanation is not simple, but a simplified version is that when you do
> your first regression,
> y=f(x1)
> x1 soaks up all the variance it can account for, including any variance
> it shares with x2, leaving little variance to be accounted for by your
> second regression, residuals=f(x2). In contrast, when you perform a
> multivariable regression, y=f(x1,x2), the total variance is apportioned
> between x1 and x2.
> John
>
> John David Sorkin M.D., Ph.D.
> Chief, Biostatistics and Informatics
> University of Maryland School of Medicine Division of Gerontology
> Baltimore VA Medical Center
> 10 North Greene Street
> GRECC (BT/18/GR)
> Baltimore, MD 21201-1524
> (Phone) 410-605-7119
> (Fax) 410-605-7913 (Please call phone number above prior to faxing)
>
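
The point above can be checked numerically. What follows is only a minimal sketch in plain Python on made-up data (none of these numbers come from the thread); the helper names `fit_simple` and `fit_two` are hypothetical. y is built exactly as 1 + 2*x1 + 3*x2 with x1 and x2 correlated, so the joint fit should recover 2 and 3, while the two-step fit should not.

```python
# Sketch only: made-up data, plain-Python least squares (no packages needed).

def mean(v):
    return sum(v) / len(v)

def fit_simple(x, y, intercept=True):
    """Least-squares fit of y on one predictor; returns (intercept, slope)."""
    if intercept:
        mx, my = mean(x), mean(y)
        sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
        sxx = sum((a - mx) ** 2 for a in x)
        slope = sxy / sxx
        return my - slope * mx, slope
    # Regression through the origin (no constant term).
    return 0.0, sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)

def fit_two(x1, x2, y):
    """Least-squares fit of y on x1 and x2 with intercept; returns (b0, b1, b2)."""
    m1, m2, my = mean(x1), mean(x2), mean(y)
    c1 = [a - m1 for a in x1]
    c2 = [a - m2 for a in x2]
    cy = [a - my for a in y]
    s11 = sum(a * a for a in c1)
    s22 = sum(a * a for a in c2)
    s12 = sum(a * b for a, b in zip(c1, c2))
    s1y = sum(a * b for a, b in zip(c1, cy))
    s2y = sum(a * b for a, b in zip(c2, cy))
    det = s11 * s22 - s12 ** 2  # zero only if x1 and x2 are perfectly collinear
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return my - b1 * m1 - b2 * m2, b1, b2

# Assumed toy data: x2 is strongly correlated with x1.
x1 = [1, 2, 3, 4, 5, 6, 7, 8]
x2 = [2, 1, 4, 3, 6, 5, 8, 7]
y = [1 + 2 * a + 3 * b for a, b in zip(x1, x2)]

# Single multivariable regression: y = f(x1, x2).
b0, b1_joint, b2_joint = fit_two(x1, x2, y)

# Regression by steps: y = f(x1) with intercept, then residuals = f(x2)
# without a constant, as described in the original question.
a0, b1_step = fit_simple(x1, y)
resid = [yi - a0 - b1_step * xi for xi, yi in zip(x1, y)]
_, b2_step = fit_simple(x2, resid, intercept=False)

print("joint:  b1 = %.3f, b2 = %.3f" % (b1_joint, b2_joint))
print("steps:  b1 = %.3f, b2 = %.3f" % (b1_step, b2_step))
```

In this toy case the step-1 slope for x1 is inflated because it also picks up the part of x2 that moves with x1, and the step-2 slope for x2 shrinks accordingly. In R the same comparison would presumably be `lm(y ~ x1 + x2)` against `lm(y ~ x1)` followed by `lm(resid ~ x2 - 1)`.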

>>>> rlearner309 <unixunix99_at_gmail.com> 7/8/2008 8:53 AM >>>

>
> I saw this type of model in some of my company's projects.
>
> To simplify:
> Y is regressed on X1 and X2, but the regression is done in two steps:
> first Y is regressed on X1 with an intercept, and then the residuals from
> the first step are regressed on X2, without the constant. The reason for
> doing so is that some observations have X1 data but do not have X2, so I
> guess the person wants to use as much information as he can to get the
> coefficient for X1, and then use the subset of residuals that have X2 data
> to catch what is left to be explained by X2.
>
> But my concern is, should we consider the correlation between X1 and X2?
> If residuals from the first step are used, then the X1 effect has been
> removed from Y. What does it then really mean to regress those residuals
> on X2, which is itself correlated with X1? Should X2 be adjusted for X1,
> too (regress X2 on X1 and use the residuals)?
>
> What if both X1 and X2 are dummy variables? Dummy variables can have a
> meaningful correlation, too, right?
>
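
On the question of adjusting X2 for X1 as well: that is the Frisch-Waugh-Lovell result. If both Y and X2 are residualized on X1 (with intercept) over the same observations, the slope of the Y residuals on the X2 residuals equals the X2 coefficient from the joint regression. The identity is algebraic, so it holds for dummy variables too. A minimal sketch with made-up 0/1 data follows; the helper names are hypothetical, and it assumes complete cases, i.e. the same rows in every step, which is not the situation described above where some rows lack X2.

```python
# Sketch only: made-up dummy-variable data, plain-Python least squares.

def mean(v):
    return sum(v) / len(v)

def residuals(x, y):
    """Residuals from a least-squares fit of y on x with an intercept."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return [b - my - slope * (a - mx) for a, b in zip(x, y)]

def slope_no_intercept(x, y):
    """Slope of a least-squares fit of y on x through the origin."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)

def joint_slopes(x1, x2, y):
    """Slopes (b1, b2) from a least-squares fit of y on x1 and x2 with intercept."""
    m1, m2, my = mean(x1), mean(x2), mean(y)
    c1 = [a - m1 for a in x1]
    c2 = [a - m2 for a in x2]
    cy = [a - my for a in y]
    s11 = sum(a * a for a in c1)
    s22 = sum(a * a for a in c2)
    s12 = sum(a * b for a, b in zip(c1, c2))
    s1y = sum(a * b for a, b in zip(c1, cy))
    s2y = sum(a * b for a, b in zip(c2, cy))
    det = s11 * s22 - s12 ** 2
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det

# Assumed toy data: two correlated dummy variables and an arbitrary response.
x1 = [0, 0, 0, 0, 1, 1, 1, 1]
x2 = [0, 0, 1, 1, 0, 1, 1, 1]
y = [1.0, 1.4, 3.9, 4.2, 3.1, 6.0, 5.8, 6.3]

_, b2_joint = joint_slopes(x1, x2, y)

ry = residuals(x1, y)    # Y with the X1 effect removed
r2 = residuals(x1, x2)   # X2 with the X1 effect removed

b2_unadjusted = slope_no_intercept(x2, ry)  # two-step approach from the question
b2_adjusted = slope_no_intercept(r2, ry)    # X2 adjusted for X1 as well

print("joint b2:            %.4f" % b2_joint)
print("steps, raw X2:       %.4f" % b2_unadjusted)
print("steps, adjusted X2:  %.4f" % b2_adjusted)
```

Only the adjusted version reproduces the joint coefficient here. Once the two steps use different sets of rows, the equality is no longer guaranteed.
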
> Thanks a lot!
> --
> View this message in context:
> http://www.nabble.com/Can-I-do-regression-by-steps--tp18338562p18338562.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help_at_r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
--
View this message in context: http://www.nabble.com/Can-I-do-regression-by-steps--tp18338562p18350475.html
Sent from the R help mailing list archive at Nabble.com.

______________________________________________
R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help