Re: [R] Slow function

From: Marc <marc.moragues_at_gmail.com>
Date: Tue, 10 Jun 2008 17:11:24 +0100

Hi Jim,

This is genotype data for 170 samples. I selected subsets of SNPs optimized for different types of germplasm, so each data set is a matrix with 170 rows and 1536, 384 or 96 columns of binary data (0, 1). I have 14 such matrices in a list.

x <- list()
for (i in 1:14) {
 set.seed(i)
 x[[i]] <- matrix(sample(rep(c(1,0), 1000000), 1536*170), nrow = 170, ncol = 1536)
}
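
One thing to watch with the loop above: the simulated matrices carry no column names, while the quoted DRes() picks its columns through dimnames(x)[[2]], so it may not run on them as posted. A minimal sketch of the same kind of data with names attached follows. The helper name make_genotypes and the "sample"/"snp" labels are made up for illustration, and the values are drawn with replacement, which is a simplification of the sampling above.

```r
# Hypothetical helper: simulate one binary genotype matrix
# (rows = samples, columns = SNPs), with dimnames attached.
make_genotypes <- function(seed, n_samples = 170, n_snp = 1536) {
  set.seed(seed)
  m <- matrix(sample(c(0, 1), n_samples * n_snp, replace = TRUE),
              nrow = n_samples, ncol = n_snp)
  # Illustrative labels only; the real data would have its own names.
  dimnames(m) <- list(paste0("sample", seq_len(n_samples)),
                      paste0("snp", seq_len(n_snp)))
  m
}

# A list of 14 matrices, one per seed, as in the loop above.
x <- lapply(1:14, make_genotypes)
```

With names in place, dimnames(x[[1]])[[2]] returns the 1536 SNP labels that the column split relies on.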

Thanks,
Marc.

jim holtman wrote:
> I have no idea of what your data looks like, so using random numbers
> and only going for nr=1, after about a minute I stopped it. Here is
> what Rprof showed:
>
> /cygdrive/c/perf: perl c:/perf/bin/readRprof.pl Rprof.out 1
> 0 75.8 root
> 1. 75.7 sapply
> 2. . 75.7 lapply
> 3. . . 75.7 FUN
> 4. . . . 75.6 as.dist
> 5. . . . . 75.6 distance
> 6. . . . . . 75.6 distance.default
> 7. . . . . . . 75.4 apply
> 8. . . . . . . . 73.8 FUN
> 9. . . . . . . . . 73.8 switch
> 10. . . . . . . . . . 73.8 apply
> 11. . . . . . . . . . . 63.4 FUN
> 12. . . . . . . . . . . . 6.6 !
> 12. . . . . . . . . . . . 2.8 -
> 12. . . . . . . . . . . . 2.5 any
> 12. . . . . . . . . . . . 2.2 /
> 12. . . . . . . . . . . . 1.7 sum
> 12. . . . . . . . . . . . 1.6 *
> 11. . . . . . . . . . . 2.3 aperm
> 11. . . . . . . . . . . 1.0 unlist
> 8. . . . . . . . 1.5 join
>
> This says almost all the time is in the 'distance' function. Try
> running your data with 'nr' very small and see what happens.
>
> On Tue, Jun 10, 2008 at 4:49 AM, Marc <marc.moragues_at_gmail.com> wrote:
>
> Hi,
>
> I have the following function that I want to apply to a list of 14
> matrices (1536 x 170) of binary data:
>
> DRes <- function(x, nr = 10000, metric = "mixed", ...) {
>   require(analogue)  # distance()
>   require(ade4)      # mantel.rtest()
>   m <- numeric(nr)   # preallocate instead of growing the vector
>   for (i in 1:nr) {
>     set.seed(i)
>     # split the columns at random into two halves
>     x1 <- x[, sample(dimnames(x)[[2]], length(x[1,])/2)]
>     x2 <- x[, !dimnames(x)[[2]] %in% dimnames(x1)[[2]]]
>     d1 <- as.dist(distance(as.data.frame(x1), method = metric))
>     d2 <- as.dist(distance(as.data.frame(x2), method = metric))
>     m[i] <- mantel.rtest(d1, d2, ...)$obs
>   }
>   # summarise once, after the loop, rather than on every iteration
>   res <- list(mean = mean(m), std = sd(m))
>   return(res)
> }
> bias.dres <- sapply(bias, DRes)
>
> I ran this code and after 3 hours it is still running. I am on
> Windows XP and this is my sessionInfo():
> > sessionInfo()
> R version 2.7.0 Patched (2008-05-02 r45580)
> i386-pc-mingw32
>
> locale:
> LC_COLLATE=English_United Kingdom.1252;LC_CTYPE=English_United
> Kingdom.1252;LC_MONETARY=English_United
> Kingdom.1252;LC_NUMERIC=C;LC_TIME=English_United Kingdom.1252
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> other attached packages:
> [1] analogue_0.5-1 vegan_1.11-4 ade4_1.4-7
>
> Any help will be very much appreciated.
> Marc.
>
> ______________________________________________
> R-help_at_r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
> --
> Jim Holtman
> Cincinnati, OH
> +1 513 646 9390
>
> What is the problem you are trying to solve?
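
The profile quoted above puts nearly all the time in nested apply() calls inside distance(), and DRes() keeps only the observed statistic ($obs) from mantel.rtest(). Under those observations, one possible direction is a split-half loop built on base R's compiled dist() plus a plain correlation between the two distance vectors. This is only a sketch under assumptions: split_half_cor is a made-up name, the scaled Manhattan distance (proportion of mismatching 0/1 calls) is swapped in and is not claimed to equal analogue's "mixed" coefficient, and cor() on the distance vectors is assumed to correspond to the Mantel observed correlation.

```r
# Hypothetical sketch: split-half correlation of distance matrices.
# x: binary matrix, rows = samples, columns = SNPs.
split_half_cor <- function(x, nr = 100) {
  p <- ncol(x)
  r <- numeric(nr)                      # preallocated results
  for (i in seq_len(nr)) {
    set.seed(i)
    idx <- sample(p, p %/% 2)           # random half of the columns
    # Proportion of mismatches between samples in each half;
    # dist() does the pairwise work in compiled code.
    d1 <- dist(x[, idx, drop = FALSE], method = "manhattan") / length(idx)
    d2 <- dist(x[, -idx, drop = FALSE], method = "manhattan") / (p - length(idx))
    # Pearson correlation between the two sets of pairwise distances
    r[i] <- cor(as.vector(d1), as.vector(d2))
  }
  list(mean = mean(r), std = sd(r))
}

# Small illustrative run on simulated data
set.seed(1)
toy <- matrix(sample(c(0, 1), 20 * 40, replace = TRUE), nrow = 20)
res <- split_half_cor(toy, nr = 5)
```

Following the advice to start with a very small nr, wrapping a call such as split_half_cor(x[[1]], nr = 10) in system.time() would give a rough per-iteration cost before committing to 10000 iterations on all 14 matrices.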



Received on Tue 10 Jun 2008 - 17:42:11 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Tue 10 Jun 2008 - 18:30:40 GMT.
