# [R] replacing a for-loop with lapply

From: Daniel Berg <daniel_at_nr.no>
Date: Tue 10 May 2005 - 02:31:50 EST

Dear All,

I am trying to compute a goodness-of-fit statistic for a copula, based on an empirical density estimate of this copula. To do this I can use the following code:

> n <- dim(data)[1]
> d <- dim(data)[2]
> Chat <- rep(0,n)
> for(i in 1:n)
+ Chat[i] <- sum(apply(t(data)<=data[i,],2,prod))/(n+1)
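
Written out, the loop above evaluates the following quantity for each observation i. The symbol x_{jk} stands for data[j,k]; this notation is introduced here only for illustration and does not appear in the original code.

```latex
\hat{C}_i \;=\; \frac{1}{n+1} \sum_{j=1}^{n} \prod_{k=1}^{d}
  \mathbf{1}\{\, x_{jk} \le x_{ik} \,\},
\qquad i = 1, \dots, n
```

The product equals 1 only when row j is componentwise no larger than row i, so the sum counts such rows (including j = i itself).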

However, I have a feeling this can be done more efficiently than with a for-loop. I have also tried the following:

> tmp1 <- lapply(1:n,function(i) t(data)<=data[i,])
> tmp2 <- lapply(1:n,function(i) apply(tmp1[[i]],2,prod))
> Chat <- as.numeric(lapply(1:n, function(i) sum(tmp2[[i]])))

but there is no improvement. I ran the following timing test:

```
> a <- rep(0,M)
> for(m in 1:30){
+ a[m] <- system.time({
+ tmp1 <- lapply(1:n,function(i) t(data)<=data[i,])
+ tmp2 <- lapply(1:n,function(i) apply(tmp1[[i]],2,prod))
+ Chat <- as.numeric(lapply(1:n, function(i) sum(tmp2[[i]])))})[3]}
> b <- rep(0,M)
> for(m in 1:30){
+ b[m] <- system.time(
+ for (i in 1:n)
+ Chat[i] = sum(apply(t(data)<=data[i,],2,prod))/(n+1))[3]}
```

> summary(a)
> summary(b)

and the output was:

> summary(a)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
 0.8500  0.8700  0.8900  0.9013  0.9300  0.9800
> summary(b)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
 0.8400  0.8600  0.8800  0.8883  0.9075  0.9900

Is there any way I can code this more efficiently in R or will I have to turn to C? The data sets, on which I am actually going to run this code, will be of sizes up to (5000x100) and I need hundreds of realizations...
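
As a minimal sketch of one possible direction within R (not a tested solution for the 5000x100 case), the per-column `apply(..., 2, prod)` call can be swapped for `colSums()`, or the loop over observations can be traded for one `outer()` comparison per column. The simulated matrix and the function names `chat_loop`, `chat_colsums`, and `chat_outer` are assumptions made for illustration; only the loop inside `chat_loop` comes from the code above.

```r
# Simulated stand-in for the real data (assumption: numeric matrix,
# one observation per row).
set.seed(1)
n <- 200
d <- 5
data <- matrix(runif(n * d), nrow = n, ncol = d)

# Reference: the for-loop from the post, wrapped in a function.
chat_loop <- function(data) {
  n <- nrow(data)
  Chat <- numeric(n)
  for (i in 1:n)
    Chat[i] <- sum(apply(t(data) <= data[i, ], 2, prod)) / (n + 1)
  Chat
}

# Variant 1: same loop, but colSums() replaces apply(..., 2, prod).
# A column of the d x n comparison matrix is all TRUE exactly when
# its column sum equals d.
chat_colsums <- function(data) {
  n <- nrow(data)
  d <- ncol(data)
  tdata <- t(data)              # transpose once, outside the loop
  Chat <- numeric(n)
  for (i in 1:n)
    Chat[i] <- sum(colSums(tdata <= data[i, ]) == d) / (n + 1)
  Chat
}

# Variant 2: no loop over observations; one n x n comparison per column.
# below[j, i] stays TRUE while row j <= row i in every column seen so far.
chat_outer <- function(data) {
  n <- nrow(data)
  below <- matrix(TRUE, n, n)
  for (k in seq_len(ncol(data)))
    below <- below & outer(data[, k], data[, k], "<=")
  colSums(below) / (n + 1)
}

# Check that both variants agree with the original loop.
ref <- chat_loop(data)
print(all.equal(ref, chat_colsums(data)))
print(all.equal(ref, chat_outer(data)))
```

Whether either variant is fast enough would have to be measured with `system.time()` on data of the real size. Variant 2 keeps n x n logical matrices in memory, which for n = 5000 is on the order of 100 MB per matrix, so Variant 1 may be the safer starting point.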