[Rd] suggesting a new feature for unique()

From: Liaw, Andy <andy_liaw_at_merck.com>
Date: Fri 20 Aug 2004 - 05:00:40 EST


Dear R-devel,

May I suggest that a new feature be added to a couple of unique() methods? Sometimes it's useful to have the indices of the original data that the unique elements come from, so that the original data can be recreated from the unique()ed data. I suggest that an `index' argument be added to unique(): when it is TRUE, the result carries an "index" attribute giving, for each element of the original data, its position among the unique values. Below is a suggested patch against R/src/library/base/R/duplicated.R:

  ## NB unique.default is used by factor to avoid unique.matrix,
  ## so it needs to handle some other cases.
! unique.default <- function(x, incomparables = FALSE, ...)
  {
      if(!is.logical(incomparables) || incomparables)
  	.NotYetUsed("incomparables != FALSE")
      z <- .Internal(unique(x))
      if(is.factor(x))
! 	factor(z, levels = seq(len=nlevels(x)), labels = levels(x),
!                ordered = is.ordered(x))
      else if(inherits(x, "POSIXct") || inherits(x, "Date"))
!         structure(z, class=class(x))
!     else z

  }

  unique.data.frame <- function(x, incomparables = FALSE, ...)
--- 34,51 ----

  ## NB unique.default is used by factor to avoid unique.matrix,
  ## so it needs to handle some other cases.
! unique.default <- function(x, incomparables = FALSE, index=FALSE, ...)
  {
      if(!is.logical(incomparables) || incomparables)
  	.NotYetUsed("incomparables != FALSE")
      z <- .Internal(unique(x))
      if(is.factor(x))
! 	z <- factor(z, levels = seq(len=nlevels(x)), labels = levels(x),
!                     ordered = is.ordered(x))
      else if(inherits(x, "POSIXct") || inherits(x, "Date"))
!         z <- structure(z, class=class(x))
!     if (index) attr(z, "index") <- match(x, z)
!     z

  }   

  unique.data.frame <- function(x, incomparables = FALSE, ...)


  unique.matrix <- unique.array <-
! function(x, incomparables = FALSE , MARGIN = 1, ...)
  {
      if(!is.logical(incomparables) || incomparables)
  	.NotYetUsed("incomparables != FALSE")
--- 56,62 ----
  }   

  unique.matrix <- unique.array <-
! function(x, incomparables = FALSE , MARGIN = 1, index=FALSE, ...)
  {
      if(!is.logical(incomparables) || incomparables)
  	.NotYetUsed("incomparables != FALSE")

An example usage:

> x <- sample(5, 10, rep=T)
> x
 [1] 4 2 5 3 2 3 4 2 2 3
> z <- unique(x, index=TRUE)
> z[attr(z, "index")] == x
 [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
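
For anyone who wants to try the idea without patching base R, here is a minimal sketch. The helper name unique_with_index is hypothetical (it is not part of R); it simply wraps the existing unique() and match() to attach the same "index" attribute the patch proposes.

```r
## Hypothetical helper, not part of base R: mimics the proposed
## unique(x, index = TRUE) without modifying unique.default.
unique_with_index <- function(x) {
    z <- unique(x)
    ## match() gives, for each element of x, its position in z
    attr(z, "index") <- match(x, z)
    z
}

x <- c(4, 2, 5, 3, 2, 3, 4, 2, 2, 3)
z <- unique_with_index(x)
## Rebuild the original data from the unique values plus the index
stopifnot(all(z[attr(z, "index")] == x))

## The same round trip should also work for a factor
f <- factor(x)
zf <- unique_with_index(f)
stopifnot(all(zf[attr(zf, "index")] == f))
```

Keeping the index as an attribute, as in the patch, leaves the return value usable wherever plain unique() output is expected.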

> x <- factor(x)
> z <- unique(x, index=TRUE)
> z
[1] 4 2 5 3
Levels: 2 3 4 5
> z[attr(z, "index")] == x
 [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE

[I have not tried adding the same thing for the unique.data.frame method, but that shouldn't be too hard...]
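
One way the data frame case might look is sketched below. This is only an illustration under stated assumptions, not the patch itself: the helper name unique_rows_with_index is hypothetical, rows are compared by pasting their columns into a string key, and the "\r" separator is an assumed choice that would misbehave if the data contained that character. Because the comparison goes through character conversion, it may not agree with unique.data.frame in every edge case.

```r
## Hypothetical helper, not part of base R: row-wise analogue of the
## proposed index attribute for data frames.
unique_rows_with_index <- function(df) {
    ## Collapse each row into one string key (assumed separator "\r");
    ## unname() keeps column names from clashing with paste() arguments.
    key <- do.call(paste, c(unname(as.list(df)), sep = "\r"))
    keep <- !duplicated(key)
    u <- df[keep, , drop = FALSE]
    ## For each original row, its position among the unique rows
    attr(u, "index") <- match(key, key[keep])
    u
}

df <- data.frame(a = c(1, 2, 1, 2), b = c("x", "y", "x", "z"))
u <- unique_rows_with_index(df)
## Rebuild the original rows from the unique rows plus the index
rebuilt <- u[attr(u, "index"), , drop = FALSE]
stopifnot(all(rebuilt$a == df$a), all(rebuilt$b == df$b))
```

Note that the rebuilt data frame gets new row names, so only the column contents are compared here.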

Best,
Andy

Andy Liaw, PhD

Biometrics Research      PO Box 2000, RY33-300     
Merck Research Labs           Rahway, NJ 07065
mailto:andy_liaw@merck.com        732-594-0820

______________________________________________
R-devel@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Received on Fri Aug 20 05:03:56 2004

This archive was generated by hypermail 2.1.8 : Wed 03 Nov 2004 - 22:45:07 EST