Re: [R] Replacing for loop with tapply!?

From: Dimitris Rizopoulos <dimitris.rizopoulos_at_med.kuleuven.be>
Date: Sat 11 Jun 2005 - 03:10:32 EST

for the maximum you could use something like:

ind[, 1] <- apply(mat, 2, max)
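
The line above assumes ind has already been rotated so that stations are rows and the maximum sits in column 1; before rotation the same values would go into ind[4, ]. A minimal sketch of the whole calculation follows, using a much smaller matrix than the 365 x 15000 one in the thread (the size, set.seed, the name `out`, and the column labels are assumptions for illustration only), with base R's t() standing in for rotate270:

```r
# Small stand-in for the 365 x 15000 matrix in the thread: days in rows,
# stations in columns. The reduced size is an assumption to keep it quick.
set.seed(1)
mat <- matrix(sample(-15:50, 365 * 20, TRUE), 365, 20)
temps <- c(37, 39, 41)

# One row per threshold, plus a fourth row for the maximum.
ind <- matrix(0, length(temps) + 1, ncol(mat))
for (i in seq_along(temps)) ind[i, ] <- colSums(mat > temps[i])

# max(mat) collapses the whole matrix to a single number, which is why
# colMeans(max(mat)) fails; apply() over columns keeps one value per station.
ind[4, ] <- apply(mat, 2, max)

# Transpose so that stations are rows, and move the maximum to column 1.
out <- t(ind)[, c(4, 1, 2, 3)]
colnames(out) <- c("max", paste0("gt", temps))
head(out)
```

With real data that contains missing values, apply(mat, 2, max, na.rm = TRUE) and colSums(mat > temps[i], na.rm = TRUE) would probably be needed, as in the na.rm = T of the original loop.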

I hope it helps.

Best,
Dimitris

----- Original Message -----
From: "Sander Oom" <slist@oomvanlieshout.net>
To: "Dimitris Rizopoulos" <dimitris.rizopoulos@med.kuleuven.be>
Cc: <r-help@stat.math.ethz.ch>
Sent: Friday, June 10, 2005 12:10 PM
Subject: Re: [R] Replacing for loop with tapply!?

> Thanks Dimitris,
>
> Very impressive! Much faster than before.
>
> Thanks to the newly found R.basic package, I can simply rotate the
> result with rotate270 {R.basic}:
>
> > mat <- matrix(sample(-15:50, 365 * 15000, TRUE), 365, 15000)
> > temps <- c(37, 39, 41)
> > #################
> > #ind <- matrix(0, length(temps), ncol(mat))
> > ind <- matrix(0, 4, ncol(mat))
> > (startDate <- date())
> [1] "Fri Jun 10 12:08:01 2005"
> > for(i in seq(along = temps)) ind[i, ] <- colSums(mat > temps[i])
> > ind[4, ] <- colMeans(max(mat))
> Error in colMeans(max(mat)) : 'x' must be an array of at least two
> dimensions
> > (endDate <- date())
> [1] "Fri Jun 10 12:08:02 2005"
> > ind <- rotate270(ind)
> > ind[1:10,]
> V4 V3 V2 V1
> 1 0 56 75 80
> 2 0 46 53 60
> 3 0 50 58 67
> 4 0 60 72 80
> 5 0 59 68 76
> 6 0 55 67 74
> 7 0 62 77 93
> 8 0 45 57 67
> 9 0 57 68 75
> 10 0 61 66 76
>
> However, I have not managed to get the row maximum using your method.
> It should be 50 for most rows, but my first-guess code gives an error!
>
> Any suggestions?
>
> Sander
>
>
>
> Dimitris Rizopoulos wrote:
>> maybe you are looking for something along these lines:
>>
>> mat <- matrix(sample(-15:50, 365 * 15000, TRUE), 365, 15000)
>> temps <- c(37, 39, 41)
>> #################
>> ind <- matrix(0, length(temps), ncol(mat))
>> for(i in seq(along = temps)) ind[i, ] <- colSums(mat > temps[i])
>> ind
>>
>>
>> I hope it helps.
>>
>> Best,
>> Dimitris
>>
>> ----
>>
>>
>> ----- Original Message -----
>> From: "Sander Oom" <slist@oomvanlieshout.net>
>> To: <r-help@stat.math.ethz.ch>
>> Sent: Friday, June 10, 2005 10:50 AM
>> Subject: [R] Replacing for loop with tapply!?
>>
>>
>>>Dear all,
>>>
>>>We have a large data set with temperature data for weather stations
>>>across the globe (15000 stations).
>>>
>>>For each station, we need to calculate the number of days a certain
>>>temperature is exceeded.
>>>
>>>So far we used the following S code, where mat88 is a matrix containing
>>>rows of 365 daily temperatures for each of 15000 weather stations:
>>>
>>>m <- 37
>>>n <- 2
>>>outmat88 <- matrix(0, ncol = 4, nrow = nrow(mat88))
>>>for(i in 1:nrow(mat88)) {
>>># i <- 3
>>>row1 <- as.data.frame(df88[i, ])
>>>temprow37 <- select.rows(row1, row1 > m)
>>>temprow39 <- select.rows(row1, row1 > m + n)
>>>temprow41 <- select.rows(row1, row1 > m + 2 * n)
>>>outmat88[i, 1] <- max(row1, na.rm = T)
>>>outmat88[i, 2] <- count.rows(temprow37)
>>>outmat88[i, 3] <- count.rows(temprow39)
>>>outmat88[i, 4] <- count.rows(temprow41)
>>>}
>>>outmat88
>>>
>>>We have transferred the data to a more potent Linux box running R, but
>>>still hope to speed up the code.
>>>
>>>I know a for loop should be avoided when looking for speed. I also know
>>>the answer is in something like tapply, but my understanding of these
>>>commands is still too limited to see the solution. Could someone show me
>>>the way!?
>>>
>>>
>>>Sander.
