# Re: [Rd] duplicates() function

From: Joshua Ulrich <josh.m.ulrich_at_gmail.com>
Date: Fri, 08 Apr 2011 10:39:01 -0500

On Fri, Apr 8, 2011 at 10:15 AM, Duncan Murdoch <murdoch.duncan_at_gmail.com> wrote:
> On 08/04/2011 11:08 AM, Joshua Ulrich wrote:
>>
>>
>> y<- rep(NA,length(x))
>> y[duplicated(x)]<- match(x[duplicated(x)] ,x)
>
> That's a nice solution for vectors.  Unfortunately for me, I have a matrix
> (which duplicated() handles by checking whole rows).  So a better example
> that I should have posted would be
>
> x <-  cbind(1, c(9,7,9,3,7) )
>
> and I'd still like the same output
>
For a matrix, could you apply the same strategy used in duplicated()?

y <- rep(NA,NROW(x))
temp <- apply(x, 1, function(x) paste(x, collapse="\r"))
y[duplicated(temp)] <- match(temp[duplicated(temp)], temp)
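
The vector and matrix cases can be folded into one function. What follows is only a minimal sketch: the name duplicates() is taken from the request and is not a base R function, and the variable names key and dup are made up for illustration. It assumes that pasting each matrix row into a single string gives an adequate row key.

```r
# Hypothetical helper: duplicates() is the name from the request, not base R.
duplicates <- function(x) {
  # For a matrix, collapse each row into one string key;
  # for a plain vector, compare the values themselves.
  key <- if (is.matrix(x)) {
    apply(x, 1, function(row) paste(row, collapse = "\r"))
  } else {
    x
  }
  y <- rep(NA_integer_, length(key))
  dup <- duplicated(key)
  # match() gives the index of the first occurrence of each duplicated key
  y[dup] <- match(key[dup], key)
  y
}

duplicates(c(9, 7, 9, 3, 7))
# [1] NA NA  1 NA  2

duplicates(cbind(1, c(9, 7, 9, 3, 7)))
# [1] NA NA  1 NA  2
```

Because the rows are compared as pasted strings, two rows count as duplicates only when their character representations agree, so this may not suit every kind of matrix content.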

>>  duplicated(x)
>
> [1] FALSE FALSE  TRUE FALSE TRUE
>
>>  duplicates(x)
>
> [1] NA NA  1 NA  2
>
>
> Duncan Murdoch
>
>> --
>>
>>
>>
>> On Fri, Apr 8, 2011 at 9:59 AM, Duncan Murdoch<murdoch.duncan_at_gmail.com>
>>  wrote:
>> >  I need a function which is similar to duplicated(), but instead of
>> > returning
>> >  TRUE/FALSE, returns indices of which element was duplicated.  That is,
>> >
>> >>  x<- c(9,7,9,3,7)
>> >>  duplicated(x)
>> >  [1] FALSE FALSE  TRUE FALSE TRUE
>> >
>> >>  duplicates(x)
>> >  [1] NA NA  1 NA  2
>> >
>> >  (so that I know that element 3 is a duplicate of element 1, and element
>> > 5 is
>> >  a duplicate of element 2, whereas the others were not duplicated
>> > according
>> >  to our definition.)
>> >
>> >  Is there a simple way to write this function?  I have  an ugly
>> >  implementation in R that loops over all the values; it would make more
>> > sense
>> >  to redo it in C, if there isn't a simple implementation I missed.
>> >
>> >  Duncan Murdoch
>> >
>> >  ______________________________________________
>> >  R-devel_at_r-project.org mailing list
>> >  https://stat.ethz.ch/mailman/listinfo/r-devel
>> >
>
>

R-devel_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Received on Fri 08 Apr 2011 - 15:42:12 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Mon 11 Apr 2011 - 18:20:44 GMT.