[R] vectorizing sapply() code (Modified by Aaron J. Mackey)

From: Aaron J. Mackey <amackey_at_pcbi.upenn.edu>
Date: Tue 06 Jul 2004 - 22:17:55 EST


[ Not sure why, but the first time I sent this it never seemed to go through; apologies if you're seeing this twice ... ]

I have some fully functional code that I'm guessing can be done better/quicker with some savvy R vector tricks; any help to make this run a bit faster would be greatly appreciated; I'm particularly stuck on how to calculate using "row-wise" vectors without iterating explicitly over the dataframe or table ...

library(stats4);
d <- data.frame( ix=c(0,1,2,3,4,5,6,7),

                  ct=c(253987,  9596, 18680,  2630,  8224,  3590,  5534, 
18937),
                  A=c(      0,     1,     0,     1,     0,     1,     0, 
     1),
                  B=c(      0,     0,     1,     1,     0,     0,     1, 
     1),
                  C=c(      0,     0,     0,     0,     1,     1,     1, 
     1)
                );

ct <- round(logb(length(d$ix), 2))
ll <- function( th=0.5,
                 a1=log(0.5), a2=log(0.5), a3=log(0.5),
                 b1=log(0.5), b2=log(0.5), b3=log(0.5)
               ) {

   a <- exp(sapply(1:ct, function (x) { get(paste("a", x, sep="")) }));    b <- exp(sapply(1:ct, function (x) { get(paste("b", x, sep="")) }));    -sum( d$ct * log( sapply( d$ix,
                             function (ix, th, a, b) {
                               x <- d[ix+1,3:(ct+2)]
                               (th     * prod((b ^ (1-x)) * ((1-b) ^ x   
  ))) +
                               ((1-th) * prod((a ^ x    ) * ((1-a) ^ 

(1-x))))
}, th, a, b ) )

   );
}

ml <- mle(ll,

           lower=c(0+1e-5, rep(log(0+1e-8), 2*ct)),
           upper=c(1-1e-5, rep(log(1-1e-8), 2*ct)),
           method="L-BFGS-B"
          );

For those interested in the math, this is the MLE procedure to estimate the false positive/false negative rates (a and b) of three diagnostic
(A, B and C) tests that have the observed performance recapitulated in
dataframe "d", but no "gold standard" (sometimes called "latent class analysis", or LCA).

Thanks for any help,

-Aaron



R-help@stat.math.ethz.ch mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html Received on Wed Jul 07 21:58:40 2004

This archive was generated by hypermail 2.1.8 : Fri 18 Mar 2005 - 02:34:51 EST