# Re: [R] Replacing for loop with tapply!?

From: Kjetil Brinchmann Halvorsen <kjetil_at_acelerate.com>
Date: Sat 11 Jun 2005 - 01:55:53 EST

Sander Oom wrote:

>Dear all,
>
>We have a large data set with temperature data for weather stations
>across the globe (15000 stations).
>
>For each station, we need to calculate the number of days a certain
>temperature is exceeded.
>
>So far we used the following S code, where mat88 is a matrix containing
>rows of 365 daily temperatures for each of 15000 weather stations:
>
> m <- 37
> n <- 2
> outmat88 <- matrix(0, ncol = 4, nrow = nrow(mat88))
> for(i in 1:nrow(mat88)) {
> # i <- 3
> row1 <- as.data.frame(df88[i, ])
> temprow37 <- select.rows(row1, row1 > m)
> temprow39 <- select.rows(row1, row1 > m + n)
> temprow41 <- select.rows(row1, row1 > m + 2 * n)
> outmat88[i, 1] <- max(row1, na.rm = T)
> outmat88[i, 2] <- count.rows(temprow37)
> outmat88[i, 3] <- count.rows(temprow39)
> outmat88[i, 4] <- count.rows(temprow41)
> }
> outmat88
>
>
>
What you need is not tapply but apply. Something like

apply(mat88, 1, function(x) sum(x > 30))

where your threshold should replace 30 and the second argument, 1, means the function is applied to each row. For multiple thresholds:

apply(mat88, 1, function(x) c(sum(x > 20), sum(x > 25), sum(x > 30)))
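
As a rough sketch of how the four output columns of the original loop (row maximum plus counts above 37, 39 and 41) might be reproduced this way: the small random mat88 below is a hypothetical stand-in for the real data, and the column names are made up for illustration. Note that apply() over rows returns one column per row of the input, so the result is transposed, and that na.rm = TRUE is assumed to be wanted in the counts because the original loop used it for the maximum.

```r
# Hypothetical stand-in for mat88: 5 stations x 365 days, one missing value.
set.seed(1)
mat88 <- matrix(rnorm(5 * 365, mean = 30, sd = 6), nrow = 5)
mat88[2, 10] <- NA

m <- 37
n <- 2
thresholds <- c(m, m + n, m + 2 * n)   # 37, 39, 41 as in the original loop

# One pass per row: maximum plus the number of days above each threshold.
# apply() returns a 4 x nrow matrix here, so transpose to one row per station.
counts <- t(apply(mat88, 1, function(x) {
  c(max(x, na.rm = TRUE),
    sapply(thresholds, function(th) sum(x > th, na.rm = TRUE)))
}))
colnames(counts) <- c("max", "gt37", "gt39", "gt41")   # illustrative names

# Alternative for the counts only: rowSums() on a logical matrix,
# which avoids calling an R function once per row.
counts2 <- sapply(thresholds, function(th) rowSums(mat88 > th, na.rm = TRUE))
```

With 15000 rows, the rowSums() variant is likely to be the faster of the two, since the comparison and the summing are both done on the whole matrix at once; timing both on the real data would settle it.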

Kjetil

>We have transferred the data to a more potent Linux box running R, but
>still hope to speed up the code.
>
>I know a for loop should be avoided when looking for speed. I also know
>the answer is in something like tapply, but my understanding of these
>commands is still too limited to see the solution. Could someone show me
>the way!?
>
>
>Sander.
>
>

--

Kjetil Halvorsen.

Peace is the most effective weapon of mass construction.
--  Mahdi Elmandjra

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help