From: Gabor Grothendieck <ggrothendieck_at_gmail.com>

Date: Wed 25 Jan 2006 - 09:30:58 EST

R-help@stat.math.ethz.ch mailing list

https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Received on Wed Jan 25 10:54:55 2006


Note that the rle solution assumes that all occurrences of a value are contiguous.
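To illustrate the contiguity point, here is a minimal sketch. The `shuffled` vector and the helper names `run_counts`, `table_counts` and `sorted_run_counts` are made up for this example and do not appear in the thread; the sort-then-restore workaround is just one possible way around the restriction, not something anyone posted.

```r
# Example IDs from the thread, plus a hypothetical reordering in which
# equal values are no longer adjacent.
ids      <- c("ID1", "ID2", "ID2", "ID3", "ID3", "ID3", "ID5")
shuffled <- c("ID3", "ID1", "ID3", "ID2", "ID5", "ID2", "ID3")

# rle() measures runs of adjacent equal values, not total occurrences.
run_counts <- function(x) {
  l <- rle(x)$lengths
  rep(l, l)
}

# table() counts every occurrence; indexing by name looks each element up.
table_counts <- function(x) {
  as.vector(table(x)[x])
}

# One possible workaround: sort, count runs, then restore the original order.
sorted_run_counts <- function(x) {
  o <- order(x)
  out <- integer(length(x))
  out[o] <- run_counts(x[o])
  out
}

run_counts(ids)             # agrees with table_counts(ids): values are contiguous
run_counts(shuffled)        # every run has length 1, so this is all 1s
table_counts(shuffled)      # 3 1 3 2 1 2 3
sorted_run_counts(shuffled) # 3 1 3 2 1 2 3
```

The extra `order()` call costs time, so whether the sorted variant still beats `table(ids)[ids]` would need to be measured on the data at hand.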

On 1/24/06, Ray Brownrigg <ray@mcs.vuw.ac.nz> wrote:

> There's an even faster one, which nobody seems to have mentioned yet:
>
> rep(l <- rle(ids)$lengths, l)
>
> Timing on my 2.8GHz NetBSD system shows:
>
> > length(ids)
> [1] 45150
> > # Gabor:
> > system.time(for (i in 1:100) ave(as.numeric(factor(ids)), ids, FUN = length))
> [1] 3.45 0.06 3.54 0.00 0.00
> > # Barry (and others I think):
> > system.time(for (i in 1:100) table(ids)[ids])
> [1] 2.13 0.05 2.20 0.00 0.00
> > # Me:
> > system.time(for (i in 1:100) rep(l <- rle(ids)$lengths, l))
> [1] 1.60 0.00 1.62 0.00 0.00
>
> Of course the difference between 21 milliseconds and 16 milliseconds is
> not great, unless you are doing this a lot.
>
> Ray Brownrigg
>
> > From: Gabor Grothendieck <ggrothendieck@gmail.com>
> >
> > Nice. I timed it and it's much faster than mine too.
> >
> > On 1/24/06, Barry Rowlingson <B.Rowlingson@lancaster.ac.uk> wrote:
> > > Laetitia Marisa wrote:
> > > > Hello,
> > > >
> > > > Is there a simple and fast function that returns a vector of the number
> > > > of replications for each object of a vector?
> > > > For example:
> > > > I have a vector of IDs:
> > > > ids <- c("ID1", "ID2", "ID2", "ID3", "ID3", "ID3", "ID5")
> > > >
> > > > I want the function to return the following vector, where each term is the
> > > > number of replicates for the given id:
> > > > c(1, 2, 2, 3, 3, 3, 1)
> > >
> > > One-liner:
> > >
> > > > table(ids)[ids]
> > > ids
> > > ID1 ID2 ID2 ID3 ID3 ID3 ID5
> > >   1   2   2   3   3   3   1
> > >
> > > 'table(ids)' computes the counts, then the subscripting [ids] looks it
> > > all up.
> > >
> > > Now try it on your 40,000-long vector!
> > >
> > > Barry
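For anyone wanting to repeat the comparison quoted above, a rough sketch follows. The data are an assumption: the thread never shows the actual 45150-element vector, so this builds a synthetic, already-contiguous one of the same length. The wrapper names `by_ave`, `by_table` and `by_rle` and the `reps` count are also invented for this example; timings will differ by machine and R version.

```r
# Synthetic IDs (assumption, not the original data): "ID1" once, "ID2" twice,
# and so on up to "ID300" 300 times, giving sum(1:300) = 45150 elements.
ids <- rep(paste0("ID", 1:300), times = 1:300)
length(ids)

# The three approaches from the thread, wrapped as functions.
by_ave   <- function(x) ave(as.numeric(factor(x)), x, FUN = length)
by_table <- function(x) as.vector(table(x)[x])
by_rle   <- function(x) { l <- rle(x)$lengths; rep(l, l) }

# Check that all three agree on contiguous input before timing anything.
stopifnot(all(by_ave(ids) == by_table(ids)),
          all(by_table(ids) == by_rle(ids)))

# Time each approach; reps is kept small here so the sketch runs quickly.
reps <- 10
system.time(for (i in seq_len(reps)) by_ave(ids))
system.time(for (i in seq_len(reps)) by_table(ids))
system.time(for (i in seq_len(reps)) by_rle(ids))
```

The agreement check only holds because the synthetic vector keeps equal values adjacent; on unsorted input `by_rle` would return run lengths rather than total counts.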


This archive was generated by hypermail 2.1.8: Wed 25 Jan 2006 - 14:11:08 EST