Date: Wed 25 Jan 2006 - 08:32:32 EST

There's an even faster one, which nobody seems to have mentioned yet:

rep(l <- rle(ids)$lengths, l)

Timing on my 2.8GHz NetBSD system shows:

> length(ids)

[1] 45150

> # Gabor:

> system.time(for (i in 1:100) ave(as.numeric(factor(ids)), ids, FUN = length))

[1] 3.45 0.06 3.54 0.00 0.00

> # Barry (and others I think):

> system.time(for (i in 1:100) table(ids)[ids])

[1] 2.13 0.05 2.20 0.00 0.00

> # Me:
> system.time(for (i in 1:100) rep(l <- rle(ids)$lengths, l))

[1] 1.60 0.00 1.62 0.00 0.00

Of course the difference between 21 milliseconds and 16 milliseconds per call is not great, unless you are doing this a lot.
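
For anyone following along, here is a small sketch of what the rle() one-liner does, step by step. The example vector is an assumption, reconstructed from the table() output quoted further down in this thread; the variable names are only for illustration. Note that rle() counts consecutive runs, so this approach presumably relies on equal IDs being adjacent, as they are in the example.

```r
# Example IDs, reconstructed from the table() output quoted in this thread
# (an assumption, not the poster's actual 45150-element data).
ids <- c("ID1", "ID2", "ID2", "ID3", "ID3", "ID3", "ID5")

# rle() collapses consecutive repeats into runs; $lengths holds each run's size.
runs <- rle(ids)
runs$lengths                      # 1 2 3 1

# Repeating each run length by itself gives one count per original element.
counts_rle <- rep(runs$lengths, runs$lengths)

# The table() lookup gives the same counts when equal IDs are adjacent.
counts_tab <- as.vector(table(ids)[ids])
stopifnot(all(counts_rle == counts_tab))

# With interleaved IDs the two differ: rle() counts runs, not totals.
mixed <- c("a", "b", "a")
rep(l <- rle(mixed)$lengths, l)   # 1 1 1
as.vector(table(mixed)[mixed])    # 2 1 2
```

If the IDs are not already grouped, the table() or ave() versions timed above are the safer choice.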

Ray Brownrigg

From: Gabor Grothendieck <ggrothendieck@gmail.com>
> Nice. I timed it and it's much faster than mine too.

On 1/24/06, Barry Rowlingson <B.Rowlingson@lancaster.ac.uk> wrote:
> > Laetitia Marisa wrote:
> > > Hello,
> > >
> > > of replications for each object of a vector ?
> > > For example :
> > > I have a vector of IDs :
> > > I want the function returns the following vector where each term is the
> > > number of replicates for the given id :
> > > c( 1, 2, 2, 3,3,3,1 )
> >
> > One-liner:
> >
> > > table(ids)[ids]
> > ids
> > ID1 ID2 ID2 ID3 ID3 ID3 ID5
> >   1   2   2   3   3   3   1
> >
> > 'table(ids)' computes the counts, then the subscripting [ids] looks it
> > all up.
> >
> > Now try it on your 40,000-long vector!
> >
> > Barry
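
The two steps Barry describes (count, then look up) can be sketched separately as below. The example vector is an assumption based on the output shown in his message, and `counts`, `per_element`, `num_ids` and `safe` are names introduced here only for illustration. The lookup works by name, so it behaves as intended when the IDs are character strings; numeric IDs would be read as positions unless converted first.

```r
# Example IDs, reconstructed from the output quoted above (an assumption).
ids <- c("ID1", "ID2", "ID2", "ID3", "ID3", "ID3", "ID5")

# Step 1: one count per distinct ID, named by the ID.
counts <- table(ids)
as.vector(counts["ID3"])          # 3

# Step 2: index the counts by name, once per element of ids.
per_element <- counts[ids]
as.vector(per_element)            # 1 2 2 3 3 3 1

# Numeric IDs would be treated as positions, not names, so convert first.
num_ids <- c(10, 20, 20)
safe <- table(num_ids)[as.character(num_ids)]
as.vector(safe)                   # 1 2 2
```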
