Re: [R] Position in a vector of the last value > n - *SOLVED*

From: Thaden, John J <ThadenJohnJ_at_uams.edu>
Date: Sat, 12 Jul 2008 16:00:05 -0500

Yes, your version (func2) is quick, and the quickest of the three for longer vectors:
> m <- matrix(rexp(6e6,rate=0.05), nrow=50000) # 120 cols
> m[m<20] <- 20
> func1 <- function(v,cut=20) max(which(v>cut))
> func2 <- function(v,cut=20) {
+    x <- which(v>cut)
+    x[length(x)]
+ }
> func3 <- function(v,cut=20) tail(which(v>cut), 1)
> system.time(apply(m, 2, func1))

   user system elapsed
   0.58 0.01 0.59
> system.time(apply(m, 2, func2))

   user system elapsed
   0.48 0.04 0.53
> system.time(apply(m, 2, func3))

   user system elapsed
   0.55 0.00 0.56
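
One caveat applies to all three versions: if no element exceeds the cutoff, which() returns integer(0), so func1 gives -Inf with a warning, while func2 and func3 give integer(0), which in turn makes apply() return a list rather than a vector. A minimal sketch of a guarded variant follows, assuming NA is an acceptable "not found" value; the name last_above is hypothetical and not from this thread.

```r
# Hypothetical helper (not from the thread): index of the last element of v
# that is greater than cut, or NA_integer_ when no element qualifies.
last_above <- function(v, cut = 20) {
  x <- which(v > cut)          # indices in increasing order
  if (length(x) == 0L) NA_integer_ else x[length(x)]
}

v <- c(20, 134, 45, 20, 24, 500, 20, 20, 20)
last_above(v)              # 6
last_above(v, cut = 1000)  # NA
```

Because every call then returns exactly one integer, apply(m, 2, last_above) should always come back as a plain vector.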
-John Thaden

-----Original Message-----
From: jim holtman [mailto:jholtman_at_gmail.com]
Sent: Saturday, July 12, 2008 6:56 AM
To: Thaden, John J
Cc: r-help_at_r-project.org
Subject: Re: [R] Position in a vector of the last value > n - *SOLVED*

A slight modification gives equivalent results without using 'tail':

> m <- matrix(rexp(6e6,rate=0.05), nrow=600) # 10,000 cols
> m[m<20] <- 20
> func1 <- function(v,cut=20) max(which(v>cut))
> func2 <- function(v,cut=20) {
+     x <- which(v>cut)
+     x[length(x)]
+ }

> system.time(apply(m, 2, func1))

   user system elapsed
   1.33 0.05 1.47
> # user system elapsed
> # 0.40 0.02 0.42
> system.time(apply(m, 2, func2))

   user system elapsed
   1.31 0.08 1.44
> # user system elapsed
> # 0.70 0.05 0.75
>
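
The two forms agree because which() returns indices in increasing order, so its last element is also its maximum. A small sketch of that check, using the example vector from the original question; the matrix size and seed here are arbitrary choices, not values from the thread.

```r
v <- c(20, 134, 45, 20, 24, 500, 20, 20, 20)
cut <- 20

by_max  <- max(which(v > cut))
by_len  <- { x <- which(v > cut); x[length(x)] }
by_tail <- tail(which(v > cut), 1)
c(by_max, by_len, by_tail)   # 6 6 6

# Same comparison across every column of a small random matrix
set.seed(42)
m <- matrix(rexp(6e4, rate = 0.05), nrow = 600)   # 100 cols
m[m < cut] <- cut
r1 <- apply(m, 2, function(v) max(which(v > cut)))
r2 <- apply(m, 2, function(v) { x <- which(v > cut); x[length(x)] })
identical(r1, r2)
```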

Here is another view, using Rprof on the first version. You can see that 'tail' takes a fair amount of time, which accounts for the difference in timing:

/cygdrive/c: perl perf/bin/readrprof.pl tempxx.txt
  0 2.7 root
  1. 1.8 system.time
  2. . 1.7 eval
  3. . . 1.7 eval
  4. . . . 1.7 apply
  5. . . . | 1.5 FUN
  6. . . . | . 0.8 tail
  7. . . . | . . 0.5 which
  8. . . . | . . . 0.1 &
  8. . . . | . . . 0.0 >
  8. . . . | . . . 0.0 !
  7. . . . | . . 0.3 tail.default
  8. . . . | . . . 0.2 stopifnot
  9. . . . | . . . . 0.1 eval
  9. . . . | . . . . 0.0 match.call
  9. . . . | . . . . 0.0 any
  6. . . . | . 0.5 which
  7. . . . | . . 0.1 &
  7. . . . | . . 0.1 >
  7. . . . | . . 0.0 names<-
  7. . . . | . . 0.0 is.na
  5. . . . | 0.1 aperm
  5. . . . | 0.0 unlist
  6. . . . | . 0.0 lapply
  5. . . . | 0.0 is.null
  2. . 0.1 gc
  1. 0.8 matrix
  2. . 0.7 as.vector
  3. . . 0.6 rexp
  1. 0.1 <
/cygdrive/c:
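
The tree above comes from a Perl script on Jim's machine. A comparable breakdown can probably be had with base R alone, using Rprof() and summaryRprof() from the utils package. This is a rough sketch, assuming R was built with profiling support; the matrix size, repeat count, and the name func_tail are arbitrary choices for illustration, and the numbers will differ by machine and R version.

```r
set.seed(1)
m <- matrix(rexp(3e6, rate = 0.05), nrow = 600)   # 5,000 cols
m[m < 20] <- 20
func_tail <- function(v, cut = 20) tail(which(v > cut), 1)

prof_file <- tempfile(fileext = ".out")
Rprof(prof_file, interval = 0.01)    # start sampling the call stack
for (i in 1:20) {                    # repeat so enough samples are collected
  res <- apply(m, 2, func_tail)
}
Rprof(NULL)                          # stop profiling

prof <- summaryRprof(prof_file)
head(prof$by.total, 10)              # time per function, including callees
```

The by.total table lists time spent in each function including its callees, which is the closest analogue to the cumulative figures in the tree.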

On Fri, Jul 11, 2008 at 12:23 PM, Thaden, John J <ThadenJohnJ_at_uams.edu> wrote:
> I had written asking for a simple way to extract the
> index of the last value in a vector greater than some
> cutoff, e.g., the index, 6, for a cutoff of 20 and this
> example vector:
>
> v <- c(20, 134, 45, 20, 24, 500, 20, 20, 20)
>
> Thank you, Alain Guillet, for this simple solution sent
> to me offlist:
>
> max(which(v > 20))
>
> Also, thank you Lisa Readdy for a lengthier solution.
>
> Other offerings yielded the value instead of the index
> (the phrasing of my question apparently was misleading):
>
> v[max(which(v > 20))] (Henrique Dallazuanna)
>
> tail(v[v>20],1) (Jim Holtman)
>
> Jim's use of tail() suggests a variant to Alain's
> solution
>
> tail(which(v > 20), 1)
>
> This is faster than the max() version with long vectors,
> but, to my surprise, slower (on my WinXP Lenovo T61 laptop)
> in a rough mockup of my column-wise apply() usage:
>
> m <- matrix(rexp(3e6,rate=0.05), nrow=600) # 5,000 cols
> m[m<20] <- 20
> func1 <- function(v,cut=20) max(which(v>cut))
> func2 <- function(v,cut=20) tail(which(v>cut),1)
> system.time(apply(m, 2, func1))
> # user system elapsed
> # 0.40 0.02 0.42
> system.time(apply(m, 2, func2))
> # user system elapsed
> # 0.70 0.05 0.75
>
> Thank you again, Alain and others.
> John
>
> ----------------
>
> On Thu, Jul 10, 2008 at 9:41 AM, John Thaden wrote:
>> This shouldn't be hard, but it's just not
>> coming to me:
>> Given a vector, e.g.,
>> v <- c(20, 134, 45, 20, 24, 500, 20, 20, 20)
>> how can I get the index of the last value in
>> the vector that has a value greater than n (in
>> the example, n = 20)? I'm looking for
>> an efficient function I can use on very large
>> matrices, as the FUN argument in the apply()
>> command.
>
> Confidentiality Notice: This e-mail message, including a...{{dropped:8}}
>
> ______________________________________________
> R-help_at_r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?

Received on Sun 13 Jul 2008 - 09:41:49 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Sun 13 Jul 2008 - 11:31:54 GMT.