# Re: [R] Tagging identical rows of a matrix

From: Prof Brian Ripley (ripley@stats.ox.ac.uk)
Date: Sat 15 May 2004 - 05:23:43 EST

```
Message-id: <Pine.LNX.4.44.0405142020120.22985-100000@gannet.stats>
```

The trick is to collapse each row into a single string, as Andy Liaw
pointed out and as unique.matrix (and unique.data.frame) do. Once you have
the collapsed rows as a character vector, unique and match will do a fast
job (via internal hashing). (Andy's solution via factor() is the same thing
with a bit of extra baggage.)
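
A minimal sketch of that idea, using the example matrix from the original question. The separator "\r" and the names keys, grp and grp_factor are choices made here for illustration, not taken from the thread; any separator that cannot occur in the formatted values should do.

```r
# Example matrix from the original question
mat <- rbind(a = c(1, 2), b = c(1, 3), c = c(1, 2),
             d = c(1, 2), e = c(1, 3), f = c(2, 1))

# Collapse each row into one string (assumed separator: "\r")
keys <- apply(mat, 1, paste, collapse = "\r")

# For each row, the position of its key among the unique keys is its group
grp <- match(keys, unique(keys))
grp
# [1] 1 2 1 1 2 3

# The factor() variant: same result, provided the levels are given in order
# of first appearance rather than left to the default sorted order
grp_factor <- as.integer(factor(keys, levels = unique(keys)))
```

For matrices with many rows, apply() may itself become the slow step; building the keys column-wise with do.call(paste, c(as.data.frame(mat), sep = "\r")) is a common alternative, though it has not been timed here.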

On 14 May 2004, Douglas Bates wrote:

> Scott Waichler <scott.waichler@pnl.gov> writes:
>
> > I would like to generate a vector having the same length
> > as the number of rows in a matrix. The vector should contain
> > an integer indicating the "group" of the row, where identical
> > matrix rows are in a group, and a unique row has a unique integer.
> > Thus, for
> >
> > a <- c(1,2)
> > b <- c(1,3)
> > c <- c(1,2)
> > d <- c(1,2)
> > e <- c(1,3)
> > f <- c(2,1)
> > mat <- rbind(a,b,c,d,e,f)
> >
> > I would like to get the vector c(1,2,1,1,2,3). I know dist() gives
> > part of the answer, but I can't figure out how to use it for
> > this purpose without doing a lot of looping. I need to apply this
> > to matrices up to ~100000 rows.
>
> I believe you want to start with unique which, when applied to a
> matrix, provides the unique rows.
>
> > unique(mat)
>   [,1] [,2]
> a    1    2
> b    1    3
> f    2    1
>
> I'm sure others will be able to provide clever ways of doing the
> matching against the unique rows.

```--
Brian D. Ripley,                  ripley@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
```