From: <Bill.Venables_at_csiro.au>

Date: Wed 31 Jan 2007 - 01:36:47 GMT


R-help@stat.math.ethz.ch mailing list

https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide

http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.

Received on Wed Jan 31 12:42:58 2007


-----Original Message-----

From: r-help-bounces@stat.math.ethz.ch

[mailto:r-help-bounces@stat.math.ethz.ch] On Behalf Of Jonathon Kopecky

Sent: Tuesday, 30 January 2007 5:52 AM

To: r-help@stat.math.ethz.ch

Subject: [R] Need to fit a regression line using orthogonal residuals

I'm trying to fit a simple linear regression of just Y ~ X, but both X
and Y are noisy. Thus instead of fitting a standard linear model
minimizing vertical residuals, I would like to minimize
orthogonal/perpendicular residuals. I have tried searching the
R-packages, but have not found anything that seems suitable. I'm not
sure what these types of residuals are typically called (they seem to
have many different names), so that may be my trouble. I do not want to
use Principal Components Analysis (as was answered to a previous
questioner a few years ago), I just want to minimize the combined noise
of my two variables. Is there a way for me to do this in R?

[WNV] There's always a way if you are prepared to program it. Your
question is a bit like asking "Is there a way to do this in Fortran?"
The most direct way to do it is to define a function that gives you the
sum of the perpendicular distances and minimise it using, say, optim().
E.g.

    ppdis <- function(b, x, y) sum((y - b[1] - b[2]*x)^2/(1+b[2]^2))
    b0 <- lsfit(x, y)$coef  # initial value
    op <- optim(b0, ppdis, method = "BFGS", x=x, y=y)
    op

    # now to check the results
    plot(x, y, asp = 1)  # why 'asp = 1'?? exercise
    abline(b0, col = "red")
    abline(op$par, col = "blue")

There are a couple of things about this you should be aware of, though. First, this is just a fiddly way of finding the first principal component, so your desire not to use Principal Component Analysis is somewhat thwarted, as it must be.
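
The equivalence with the first principal component can be checked numerically. The following is a minimal sketch, assuming simulated data (the seed, sample size and noise levels are invented for illustration and are not from the original post). It compares the optim() fit with the line through the data means along the first component returned by prcomp().

```r
# Simulated data: an assumption for illustration, not from the original post
set.seed(1)
n <- 200
truth <- rnorm(n)
x <- truth + rnorm(n, sd = 0.3)
y <- 1 + 2 * truth + rnorm(n, sd = 0.3)

# Sum of squared perpendicular distances to the line y = b[1] + b[2] * x
ppdis <- function(b, x, y) sum((y - b[1] - b[2] * x)^2 / (1 + b[2]^2))

b0 <- lsfit(x, y)$coef                      # least squares start
op <- optim(b0, ppdis, method = "BFGS", x = x, y = y)

# First principal component; prcomp() centres by default and does not rescale
pc <- prcomp(cbind(x, y))
slope_pc <- pc$rotation[2, 1] / pc$rotation[1, 1]
intercept_pc <- mean(y) - slope_pc * mean(x)  # the line passes through the means

# The two rows should agree up to the optimiser's tolerance
rbind(optim = op$par, prcomp = c(intercept_pc, slope_pc))
```

The ratio of the two loadings is used for the slope so that the arbitrary sign of the component does not matter.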

Second, the result is sensitive to scale - if you change the scales of either x or y, e.g. from lbs to kilograms, the answer is different. This also means that unless your measurement units for x and y are comparable it's hard to see how the result can make much sense. A related issue is that you have to take some care when plotting the result or orthogonal distances will not appear to be orthogonal.

Third, the resulting line is not optimal for either predicting y for a new x or x from a new y. It's hard to see why it is ever of much interest.
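
The scale sensitivity in the second point can be seen in a small experiment. This is a sketch under stated assumptions: the data are simulated, and the helper name orth_fit is hypothetical, introduced here only to wrap the optim() call from the example above. The response is expressed first in lbs and then in kilograms. The least squares slope simply rescales by the conversion factor, while the orthogonal slope does not.

```r
# Simulated data: an assumption for illustration, not from the original post
set.seed(2)
n <- 200
k <- 0.45359237                        # kilograms per lb
x <- rnorm(n, sd = 2)
y_lb <- 5 + 3 * x + rnorm(n, sd = 4)   # hypothetical response in lbs
y_kg <- k * y_lb                       # the same response in kilograms

# Sum of squared perpendicular distances to the line y = b[1] + b[2] * x
ppdis <- function(b, x, y) sum((y - b[1] - b[2] * x)^2 / (1 + b[2]^2))

# Hypothetical helper: orthogonal fit started from the least squares fit
orth_fit <- function(x, y) {
  optim(lsfit(x, y)$coef, ppdis, method = "BFGS", x = x, y = y)$par
}

ols_lb  <- lsfit(x, y_lb)$coef
ols_kg  <- lsfit(x, y_kg)$coef
orth_lb <- orth_fit(x, y_lb)
orth_kg <- orth_fit(x, y_kg)

# Least squares: refitting in kg equals converting the lbs slope
c(refit = unname(ols_kg[2]), converted = k * unname(ols_lb[2]))

# Orthogonal fit: refitting in kg gives a different line
c(refit = unname(orth_kg[2]), converted = k * unname(orth_lb[2]))
```

Standardising both variables before fitting removes the dependence on units, but it replaces it with a different arbitrary choice of scale, so it does not answer the objection.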

Bill Venables.

Jonathon Kopecky

University of Michigan


Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.

Archive generated by hypermail 2.1.8, at Wed 31 Jan 2007 - 08:30:29 GMT.
