Re: [R] Second largest element from each matrix row

From: William Dunlap <wdunlap_at_tibco.com>
Date: Tue, 26 Apr 2011 09:11:17 -0700

A different approach is to use order() to sort first by row number and then break the ties by value. It is quick when there are lots of short rows.
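
To see what the order() call does, here is a small hand-made illustration. The matrix values and the names idx, sorted, and second are assumptions made up for this sketch; they are not part of the timings reported in this message.

```r
# Small 2 x 3 matrix with hand-picked values (illustrative only).
x <- matrix(c(5, 1, 9,
              4, 8, 2), nrow = 2, byrow = TRUE)

y <- t(x)                         # each column of y holds one row of x
idx <- order(col(y), y)           # primary key: column index; ties broken by value
sorted <- array(y[idx], dim(y))   # every column is now in ascending order

# The second-to-last row of 'sorted' holds the second largest value of each row of x.
second <- sorted[nrow(sorted) - 1, ]
second   # 5 4
```

Because a single order() call handles all rows at once, the per-row function call overhead of apply() is avoided, which is probably why this works well when there are many short rows.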

> f1 <- function (x)
+ apply(x, 1, function(row) sort(row, decreasing = TRUE)[2])
> f2 <- function (x)
+ -apply(-x, 1, function(row) sort.int(row, partial = 2)[2])
> f3 <- function (x)
+ {
+     # order by row number then by value
+     y <- t(x)
+     array(y[order(col(y), y)], dim(y))[nrow(y) - 1, ]
+ }

> f4 <- function (x)
+ apply(x, 1, function(row) max(row[-which.max(row)]))
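
A side note on why f2 negates its input and its result: as far as I know, sort.int() does not support decreasing = TRUE together with partial, so the second largest value is obtained as minus the second smallest of the negated row. The sketch below checks both points on one hand-made row; the names row, err, and second are only for illustration.

```r
row <- c(3, 7, 5, 1)   # illustrative values

# Combining partial sorting with decreasing = TRUE is expected to signal an error;
# tryCatch() captures the message instead of stopping the script.
err <- tryCatch(sort.int(row, decreasing = TRUE, partial = 2)[2],
                error = function(e) conditionMessage(e))

# Negate, take the second smallest via a partial sort, and negate back.
second <- -sort.int(-row, partial = 2)[2]
second   # 5
```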

> x <- matrix(runif(1e5*6), nrow=1e5)
> library(rbenchmark)
> benchmark(r1 <- f1(x), r2 <- f2(x), r3 <- f3(x), r4 <- f4(x),
+ replications=5, columns=c("test","replications","elapsed"), order="elapsed")
         test replications elapsed
3 r3 <- f3(x)            5    1.08
4 r4 <- f4(x)            5   12.59
2 r2 <- f2(x)            5   23.19
1 r1 <- f1(x)            5   59.54

> identical(r1,r2) && identical(r1, r3) && identical(r1, r4)
[1] TRUE

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

> -----Original Message-----
> From: r-help-bounces_at_r-project.org
> [mailto:r-help-bounces_at_r-project.org] On Behalf Of peter dalgaard
> Sent: Tuesday, April 26, 2011 8:13 AM
> To: David Winsemius
> Cc: r-help_at_r-project.org
> Subject: Re: [R] Second largest element from each matrix row
>
>
> On Apr 26, 2011, at 14:36 , David Winsemius wrote:
>
> >
> > On Apr 26, 2011, at 8:01 AM, Lars Bishop wrote:
> >
> >> Hi,
> >>
> >> I need to extract the second largest element from each row of a
> >> matrix. Below is my solution, but I think there should be a more
> >> efficient way to accomplish the same, or not?
> >>
> >>
> >> set.seed(1)
> >> a <- matrix(rnorm(9), 3 ,3)
> >> sec.large <- as.vector(apply(a, 1, order, decreasing=T)[2,])
> >> ans <- sapply(1:length(sec.large), function(i) a[i, sec.large[i]])
> >> ans
> >
> > There are probably many but this one is reasonably compact,
> > one-step, and readable:
> >
> > > ans2 <- apply(a, 1, function(i) sort(i)[ dim(a)[2]-1])
> > > ans2
> >
> > Refreshing my mail client proves I was right about many
> > solutions, but this is the first (so far) to use the dim attribute.
>
> Anything with sort() or order() will have complexity
> O(n*log(n)) or worse (n is the number of columns), whereas
> finding the k-th largest element has complexity O(k*n).
>
> For moderate n, this may be unimportant, but you could
> potentially find a speedup using
>
> -sort.int(-i, partial=2)[2]
>
> or
>
> max(i[-which.max(i)])
>
> --
> Peter Dalgaard
> Center for Statistics, Copenhagen Business School
> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
> Phone: (+45)38153501
> Email: pd.mes_at_cbs.dk Priv: PDalgd_at_gmail.com
>
> ______________________________________________
> R-help_at_r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



Received on Tue 26 Apr 2011 - 16:15:59 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Tue 26 Apr 2011 - 16:50:33 GMT.

