# RE: [R] Tagging identical rows of a matrix

From: Waichler, Scott R (Scott.Waichler@pnl.gov)
Date: Sat 15 May 2004 - 06:12:08 EST

Message-id: <62AE0CF1D4875C4BBDEC29DB9924ACE87F21DB@pnlmse25.pnl.gov>

Thanks to all of you who responded to my help request.

> mat2 <- apply(mat, 1, paste, collapse=":")
> vec <- match(mat2, unique(mat2))
> vec
[1] 1 2 1 1 2 3
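
For anyone who wants to try this directly, here is a minimal self-contained sketch of the same approach, using the example matrix from my original question. The wrapper name tag_rows and its sep argument are hypothetical, introduced only for illustration. Note that building keys with paste assumes the separator cannot occur inside the pasted values, which holds for numeric matrices like this one.

```r
# Hypothetical helper (the name is not from the original thread):
# tag each row with the index of the first row identical to it.
tag_rows <- function(m, sep = ":") {
  # Collapse every row into a single string key, e.g. "1:2"
  keys <- apply(m, 1, paste, collapse = sep)
  # Position of each key among the unique keys, in order of first appearance
  match(keys, unique(keys))
}

# Example matrix from the original question
mat <- rbind(a = c(1, 2), b = c(1, 3), c = c(1, 2),
             d = c(1, 2), e = c(1, 3), f = c(2, 1))

vec <- tag_rows(mat)
print(vec)
```

Because unique() keeps values in order of first appearance, the group numbers come out in the order the distinct rows first occur in the matrix.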

P.S. I found that Andy Liaw's method didn't preserve the
index order that I wanted; it yields

2 3 2 2 3 1

To get the order of integers I was looking for required an
invocation of unique:

as.numeric(factor(apply(mat, 1, paste, collapse=":"),
levels=unique(apply(mat, 1, paste, collapse=":"))))

But the first method above is obviously cleaner and is twice
as fast: only 9 seconds for a 100000-row matrix on an ordinary PC.
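
As a rough check, the sketch below confirms that the match() approach and the factor() approach with explicit levels give the same grouping on the example matrix. It builds the row keys once and reuses them; the factor() call as I wrote it above runs apply() twice, which may account for part of the speed difference, though I have not measured that separately. The variable names are illustrative only.

```r
# Example matrix from the original question
mat <- rbind(c(1, 2), c(1, 3), c(1, 2), c(1, 2), c(1, 3), c(2, 1))

# Build the row keys once and reuse them for both approaches
keys <- apply(mat, 1, paste, collapse = ":")

# Approach 1: match against the unique keys (integer result)
by_match <- match(keys, unique(keys))

# Approach 2: factor with levels in order of first appearance (numeric result)
by_factor <- as.numeric(factor(keys, levels = unique(keys)))

# Both should label the rows identically
stopifnot(all(by_match == by_factor))

# Timings depend on the machine and the data; wrap either call in
# system.time() on your own matrix to compare them.
```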

Regards,
Scott Waichler

> > I would like to generate a vector having the same length
> > as the number of rows in a matrix. The vector should contain an
> > integer indicating the "group" of the row, where identical matrix rows
> > are in a group, and a unique row has a unique integer. Thus, for
> >
> > a <- c(1,2)
> > b <- c(1,3)
> > c <- c(1,2)
> > d <- c(1,2)
> > e <- c(1,3)
> > f <- c(2,1)
> > mat <- rbind(a,b,c,d,e,f)
> >
> > I would like to get the vector c(1,2,1,1,2,3). I know dist() gives
> > part of the answer, but I can't figure out how to use it for this
> > purpose without doing a lot of looping. I need to apply this to
> > matrices up to ~100000 rows.

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help