Re: [R] how to efficiently compute set unique?

From: Duncan Murdoch
Date: Mon, 21 Jun 2010 21:18:00 -0400

On 21/06/2010 9:06 PM, G FANG wrote:
> Hi,
> I want to get the unique set from a large numeric k by 1 vector, k is
> in tens of millions
> when I used the matlab function unique, it takes less than 10 secs
> but when I tried to use the unique in R with similar CPU and memory,
> it is not done in minutes
> I am wondering, am I using the function in the right way?
> dim(cntxtn)
> [1] 13584763 1
> uniqueCntxt = unique(cntxtn); # this is taking really long

What type is cntxtn? If I do that sort of thing on a numeric vector, it's quite fast:

 > x <- sample(100000, size=13584763, replace=T)
 > system.time(unique(x))
    user  system elapsed
    3.61    0.14    3.75

Received on Tue 22 Jun 2010 - 01:21:08 GMT
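
One possible explanation, not confirmed in this thread: the dim() output in the question suggests cntxtn is a one-column matrix rather than a plain vector. For a matrix, unique() dispatches to unique.matrix(), which removes duplicate rows rather than duplicate elements and generally does more work per row than the vector method. A minimal sketch of the check and a workaround, assuming cntxtn is a numeric one-column matrix (the object m below is a hypothetical, much smaller stand-in for it):

```r
# Hypothetical stand-in for cntxtn: a k-by-1 matrix, kept small for illustration.
set.seed(1)
m <- matrix(sample(1000, size = 1e5, replace = TRUE), ncol = 1)

# Inspect what kind of object this is before calling unique().
print(class(m))
print(dim(m))

# Drop the dim attribute so that unique() uses the plain vector method.
v <- as.vector(m)            # m[, 1] would work as well

# Compare timings; the numbers depend on the machine and R version.
print(system.time(u_vec <- unique(v)))
print(system.time(u_mat <- unique(m)))

# Both approaches find the same set of values.
stopifnot(identical(sort(u_vec), sort(as.vector(u_mat))))
```

If class() reports a data frame instead, extracting the column first (for example with cntxtn[[1]]) gives a plain vector in the same way.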

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
