Re: [R] efficient code - yet another question

From: <Richard.Cotton_at_hsl.gov.uk>
Date: Thu, 01 May 2008 09:57:30 +0100


> I've been trying to find a way to improve (if
> possible) the efficiency of the code, especially when the number of
> components to calculate is higher than 100.

> pca.nipals <- function(X, ncomp, iter = 50, toler =
sqrt(.Machine$double.eps))
> # X...data matrix, ncomp...number of components,
> # iter...maximal number of iterations per component,
> # toler...precision tolerance for calculation of components
> {
>
> Xh <- scale(X, center = TRUE, scale = FALSE)
> nr <- 0
> T <- NULL; P <- NULL # matrix of scores and loadings
> for (h in 1:ncomp)
> {
> th <- Xh[, 1]
> ende <- FALSE
> while (!ende)
> {
> nr <- nr + 1
> ph <- t(crossprod(th, Xh) * as.vector(1 /
> crossprod(th)))
> ph <- ph * as.vector(1/sqrt(crossprod(ph)))
> thnew <- t(tcrossprod(t(ph), Xh) *
> as.vector(1/(crossprod(ph))))
> prec <- crossprod(th-thnew)
> th <- thnew
> if (prec <= (tol^2)) ende <- TRUE
> if (it <= nr) ende <- TRUE # didn't converge
> }
>
> Xh <- Xh - tcrossprod(th)
> T <- cbind(T, th); P <- cbind(P, ph)
> nr <- 0
> }
> list(T = T, P = P)
> }

First a disclaimer: I've just had a quick eyeball of your code; I haven't run it, so this is speculative.

You've named the tolerance variable as 'toler', but in the line where you come to check it, it is called tol:
if (prec <= (tol^2)) ende <- TRUE

I suspect that this means you have 'tol' as a global variable somewhere, which may or may not be the number you want. Even if the correct variable is being found, I suspect that you want prec <= tol, rather than the square of it (you are probably wasting time on iterations calculating excessive decimal places).

Also, it may be slightly faster to use a for loop instead of the while loop, as so:

for(j in 1:iter)
{
  #your calculations
  if(prec <=tol) break
}

Regards,
Richie.

Mathematical Sciences Unit
HSL



ATTENTION: This message contains privileged and confidential inform...{{dropped:20}}

R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Received on Thu 01 May 2008 - 09:07:09 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Thu 01 May 2008 - 09:30:33 GMT.

Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-help. Please read the posting guide before posting to the list.

list of date sections of archive