From: Henrik Bengtsson <hb_at_stat.berkeley.edu>

Date: Wed 31 Jan 2007 - 02:18:56 GMT

R-help@stat.math.ethz.ch mailing list

https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Received on Wed Jan 31 13:23:47 2007

Date: Wed 31 Jan 2007 - 02:18:56 GMT

The iwpca() in Bioconductor package aroma.light takes a matrix of column vectors and "fits an R-dimensional hyperplane using iterative re-weighted PCA". From ?iwpca.matrix:

"Arguments:

X N-times-K matrix where N is the number of observations and K is the
number of dimensions.

Details:

This method uses weighted principal component analysis (WPCA) to fit a
R-dimensional hyperplane through the data with initial internal
weights all equal. At each iteration the internal weights are
recalculated based on the "residuals". If method=="L1", the internal
weights are 1 / sum(abs(r) + reps). This is the same as
method=function(r) 1/sum(abs(r)+reps). The "residuals" are orthogonal
Euclidean distance of the principal components R,R+1,...,K. In each
iteration before doing WPCA, the internal weighted are multiplied by
the weights given by argument w, if specified."

Thus, in your case you want to do:

X <- cbind(x,y)

fit <- iwpca(X)

There are different ways of robustifying the estimate, cf. argument 'method'. For heteroscedastic noise, fitting in L1 is convenient since there is no bandwidth parameter.

Hope this helps

Henrik

On 1/30/07, Bill.Venables@csiro.au <Bill.Venables@csiro.au> wrote:

> Jonathon Kopecky asks:

*>
**> -----Original Message-----
**> From: r-help-bounces@stat.math.ethz.ch
**> [mailto:r-help-bounces@stat.math.ethz.ch] On Behalf Of Jonathon Kopecky
**> Sent: Tuesday, 30 January 2007 5:52 AM
**> To: r-help@stat.math.ethz.ch
**> Subject: [R] Need to fit a regression line using orthogonal residuals
**>
**> I'm trying to fit a simple linear regression of just Y ~ X, but both X
**> and Y are noisy. Thus instead of fitting a standard linear model
**> minimizing vertical residuals, I would like to minimize
**> orthogonal/perpendicular residuals. I have tried searching the
**> R-packages, but have not found anything that seems suitable. I'm not
**> sure what these types of residuals are typically called (they seem to
**> have many different names), so that may be my trouble. I do not want to
**> use Principal Components Analysis (as was answered to a previous
**> questioner a few years ago), I just want to minimize the combined noise
**> of my two variables. Is there a way for me to do this in R?
**> [WNV] There's always a way if you are prepared to program it. Your
**> question is a bit like asking "Is there a way to do this in Fortran?"
**> The most direct way to do it is to define a function that gives you the
**> sum of the perpendicular distances and minimise it using, say, optim().
**> E.g.
**> ppdis <- function(b, x, y) sum((y - b[1] - b[2]*x)^2/(1+b[2]^2))
**> b0 <- lsfit(x, y)$coef # initial value
**> op <- optim(b0, ppdis, method = "BFGS", x=x, y=y)
**> op # now to check the results
**> plot(x, y, asp = 1) # why 'asp = 1'?? exercise
**> abline(b0, col = "red")
**> abline(op$par, col = "blue")
**> There are a couple of things about this you should be aware of, though
**> First, this is just a fiddly way of finding the first principal
**> component, so your desire not to use Principal Component Analysis is
**> somewhat thwarted, as it must be.
**> Second, the result is sensitive to scale - if you change the scales of
**> either x or y, e.g. from lbs to kilograms, the answer is different.
**> This also means that unless your measurement units for x and y are
**> comparable it's hard to see how the result can make much sense. A
**> related issue is that you have to take some care when plotting the
**> result or orthogonal distances will not appear to be orthogonal.
**> Third, the resulting line is not optimal for either predicting y for a
**> new x or x from a new y. It's hard to see why it is ever of much
**> interest.
**> Bill Venables.
**>
**>
**> Jonathon Kopecky
**> University of Michigan
**>
**> ______________________________________________
**> R-help@stat.math.ethz.ch mailing list
**> https://stat.ethz.ch/mailman/listinfo/r-help
**> PLEASE do read the posting guide
**> http://www.R-project.org/posting-guide.html
**> and provide commented, minimal, self-contained, reproducible code.
**>
**> ______________________________________________
**> R-help@stat.math.ethz.ch mailing list
**> https://stat.ethz.ch/mailman/listinfo/r-help
**> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
**> and provide commented, minimal, self-contained, reproducible code.
**>
*

R-help@stat.math.ethz.ch mailing list

https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Received on Wed Jan 31 13:23:47 2007

Archive maintained by Robert King, hosted by
the discipline of
statistics at the
University of Newcastle,
Australia.

Archive generated by hypermail 2.1.8, at Wed 31 Jan 2007 - 03:30:27 GMT.

*
Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-help.
Please read the posting
guide before posting to the list.
*