[R] On the speed of apply and alternatives?

From: Monty B. <montezumasrevenge_at_gmail.com>
Date: Tue 09 May 2006 - 07:45:14 EST

Dear all,

I have to handle a large matrix (1000 x 10001) in which the last column holds a value that all the preceding values in the same row have to be compared to.

I have written the following code:

# generate a (1000 x 10001) matrix, testm
# generate statistics matrix 1000 x 4:

qnt <- c(0.01, 0.05)
cmp_fun <- function(x) {
  LAST <- length(x)
  smpls <- x[1:(LAST-1)]                  # all values before the last column
  real <- x[LAST]                         # the value to compare against

  ret <- vector(length=length(qnt)*2)
  for (i in 1:length(qnt)) {
    q_i <- quantile(smpls, qnt[i])        # the quantile i
    m_i <- mean(smpls[smpls < q_i])       # mean of obs less than q_i
    ret[i] <- ifelse(real < q_i, 1, 0)
    ret[length(qnt)+i] <- ifelse(real < q_i, real - m_i, 0)
  }
  ret                                     # statistics for this row
}
# note: apply() over rows returns a 4 x 1000 matrix; t(hcvx) gives 1000 x 4
hcvx <- apply(testm, 1, cmp_fun)

The code works, but it seems to take forever to calculate the statistics matrix. As I have to repeat this snippet 2000 times, this is a problem. Can anyone advise how I can reduce the runtime? Should I drop the apply function altogether and just loop through the rows with a for loop? Does anyone know of matrix functions that perform the same operations as cmp_fun, so that I can avoid this looping?

All suggestions are welcome! I have little experience optimizing code in R, so I am quite stumped at the moment.
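
For reference, here is one possible direction, as a minimal sketch and not a measured solution. It keeps apply() but calls quantile() once per row for all levels in qnt, replaces ifelse() with plain logical arithmetic, and computes the tail mean only when the last value actually falls below the quantile. The name cmp_fun_fast and the small random stand-in for testm are hypothetical, introduced only for illustration.

```r
qnt <- c(0.01, 0.05)

# Hypothetical variant of cmp_fun (name is an assumption, not from the post).
cmp_fun_fast <- function(x, probs = qnt) {
  n     <- length(x)
  smpls <- x[-n]                                  # all values before the last one
  real  <- x[n]                                   # the value to compare against
  q     <- quantile(smpls, probs, names = FALSE)  # all quantiles in one call
  below <- real < q                               # TRUE where real is under the quantile

  # The tail mean is only needed for levels where 'below' is TRUE.
  excess <- numeric(length(probs))
  for (i in which(below)) {
    excess[i] <- real - mean(smpls[smpls < q[i]])
  }
  c(as.numeric(below), excess)
}

# Small random stand-in for testm (assumption: the real matrix is 1000 x 10001).
set.seed(1)
testm <- matrix(rnorm(200 * 1001), nrow = 200)

# apply() over rows returns one column per row, so transpose to rows x 4.
hcvx_fast <- t(apply(testm, 1, cmp_fun_fast))
```

Whether this helps enough depends on the data and the R version; wrapping both versions in system.time() on the real matrix would show the difference. Replacing apply() with an explicit for loop over rows is unlikely to change much by itself, since most of the time is probably spent inside quantile().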



R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Received on Tue May 09 09:00:02 2006

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.1.8, at Tue 09 May 2006 - 22:10:01 EST.
