Re: [R] Replacing for loop with tapply!?

From: Sander Oom <slist_at_oomvanlieshout.net>
Date: Sun 12 Jun 2005 - 22:37:12 EST

Your solution (the second function) is definitely the most elegant and generic of all the replies in this discussion: robust to missing values and flexible enough to allow as many calculations as desired! It is so clear that I even managed to hack it (thanks also to the new insight from all the other posts)!

As the data consist of weather stations in rows and days in columns, I have adapted the function to work on rows instead of columns. I did not manage to get the results directly into the right row/column layout, so a transpose (t) is still required. However, the transpose seems instant, so it does not cost any speed! Calculating proportions is now a snip!!

Thanks for your help,

Sander.

### simulate data
set.seed(1) # for reproducibility
mat <- matrix(sample(-15:50, 15 * 10, TRUE), 15, 10)
mat[ mat > 45 ] <- NA # create some missing values
mat[ 9, ] <- NA       # station 9's data is completely missing
mat

find.stats <- function( data, threshold ){

   n      <- length(threshold)
   excess <- numeric( n )
   out    <- matrix( ncol=nrow(data), nrow=(n + 2) ) # initialise
   good   <- which( apply( data, 1, function(x) !all(is.na(x)) ) )
   # rows that are not completely missing

   # drop=FALSE keeps a matrix even if only one row has data
   out[ ,good ] <- apply( data[ good, , drop=FALSE ], 1, function(x){

      m <- max( x, na.rm=TRUE )
      # determine maximum value per row
      cnt <- sum( !is.na(x) )
      # determine number of non-missing values
      for(i in seq_len(n)){ excess[i] <- sum( x > threshold[i], na.rm=TRUE )/cnt }
      # calc proportion of non-missing values over multiple thresholds
      return( c(m, cnt, excess) )

   } )

   rownames(out) <- c( "TmpMax", "Count", paste("Over", threshold, sep="") )
   colnames(out) <- rownames(data) # name of the stations
   return( t(out) )
}

lstTemps <- c(37, 39, 41, 43)
tmp <- find.stats( mat, lstTemps )
tmp
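
On the transpose: apply() over rows returns one column per row, which is why t() is needed above. If only the proportions are wanted, a possible way around it is rowMeans() on a logical matrix. This is just a sketch; prop.over and the toy matrix are made-up names for illustration, not part of the function above. Note that a station with no data at all comes out as NaN here rather than NA.

```r
# Hypothetical helper (not from the original thread): proportion of
# non-missing values above each threshold, one row per station.
prop.over <- function( data, threshold ){
   # rowMeans() on a logical matrix with na.rm=TRUE divides by the number
   # of non-missing values in each row
   out <- sapply( threshold, function(th) rowMeans( data > th, na.rm=TRUE ) )
   # force matrix shape even for a single station or a single threshold
   out <- matrix( out, nrow=nrow(data) )
   dimnames(out) <- list( rownames(data), paste("Over", threshold, sep="") )
   out
}

# toy data: three stations, three days
toy <- matrix( c(40, 38, NA,
                 44, 36, 42,
                 NA, NA, NA), nrow=3, byrow=TRUE,
               dimnames=list( c("st1", "st2", "st3"), NULL ) )
res <- prop.over( toy, c(37, 41) )
res
```

The result already has stations in rows and thresholds in columns, so no t() is required; the maximum and count would still have to be added separately, e.g. with cbind().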
