From: François Pinard <pinard_at_iro.umontreal.ca>

Date: Tue 09 May 2006 - 21:51:00 EST

}

Date: Tue 09 May 2006 - 21:51:00 EST

[Monty B. ]

>I have to handle a large matrix (1000 x 10001) where in the

*>last column i have a value that all the preceding values in the same row
**>has to be compared to. I have made the following code :
*

># generate a (1000 x 10001) matrix, testm

*># generate statistics matrix 1000 x 4:
*

>qnt <- c(0.01, 0.05)

*>cmp_fun <- function(x)
**>{
**> LAST <- length(x)
**> smpls <- x[1:(LAST-1)]
**> real <- x[LAST]
*

> ret <- vector(length=length(qnt)*2)

*> for (i in 1:length(qnt))
**> {
**> q_i <- quantile(smpls, qnt[i]) # the quantile i
**> m_i <- mean(smpls[smpls<q_i ] ) # mean of obs less than q_i
**> ret[i] <- ifelse(real < q_i, 1, 0)
**> ret[length(qnt)+i] <- ifelse(real < q_i, real - m_i, 0)
**> }
**> ret
**>}
**>hcvx <- apply(testm, 1, cmp_fun)
*

>Can anyone advise as to how I can optimize the runtime of this problem?

*>All suggestions are welcome!
*

You may speed it up a bit, not so much, with the following:

stats.testm <- function (testm, qnt=c(0.01, 0.05)) {

quants <- apply(testm[, 1:(ncol(testm)-1)], 1, quantile, qnt)

smpls <- testm[rep(1:nrow(testm), each=length(qnt)), 1:(ncol(testm)-1)] reals <- testm[rep(1:nrow(testm), each=length(qnt)), ncol(testm)] keeps <- smpls < rep(quants, ncol(smpls)) means <- rowSums(smpls * keeps) / rowSums(keeps) matrix(rbind((reals < quants) + 0, (reals < quants) * (reals - means)), length(qnt) * 2)

}

Try it with something like:

gen.testm <- function (n, m) {

matrix(sample(0:99, n * (m + 1), TRUE), n) }

testm <- gen.testm(100, 100)

stats.testm(testm)

Without checking, I would suspect that quantile is the big consumer. If you could make it without quantile interpolation, maybe some more vectorisation could be possible, but in any case, I do not think you can avoid sorting each row separately, in one way or another (currently done within quantile).

-- François Pinard http://pinard.progiciels-bpi.ca ______________________________________________ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.htmlReceived on Tue May 09 21:55:52 2006

Archive maintained by Robert King, hosted by
the discipline of
statistics at the
University of Newcastle,
Australia.

Archive generated by hypermail 2.1.8, at Tue 09 May 2006 - 22:10:01 EST.

*
Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-help.
Please read the posting
guide before posting to the list.
*