Re: [Rd] sweep sanity checking?

From: Petr Savicky <savicky_at_cs.cas.cz>
Date: Fri, 13 Jul 2007 22:37:36 +0200

I would like to suggest a replacement for the curent function sweep based on the two previous suggestions posted at   https://stat.ethz.ch/pipermail/r-help/2005-June/073989.html and
  http://wiki.r-project.org/rwiki/doku.php?id=rdoc:base:sweep with some extensions.

My suggestion is to use one of the following two variants. They differ in whether length(STATS) == 1 is allowed without a warning in the stricter (default) check or not. I don't know, what is better.

  sweep1 <- function (x, MARGIN, STATS, FUN = "-", check.margin=TRUE, ...)   {

      FUN <- match.fun(FUN)
      dims <- dim(x)
      dimstat <- if (is.null(dim(STATS))) length(STATS) else dim(STATS)
      if (length(MARGIN) < length(dimstat)) {
          STATS <- drop(STATS)
          dimstat <- if (is.null(dim(STATS))) length(STATS) else dim(STATS)
      }
      if (check.margin && length(STATS) > 1 &&
          (length(dimstat)!=length(MARGIN) || any(dims[MARGIN]!=dimstat))) {
          warning("length(STATS) or dim(STAT) do not match dim(x)[MARGIN]")
      } else if (prod(dims[MARGIN]) %% length(STATS)!=0)
          warning("prod(dim(x)[MARGIN]) is not a multiple of length(STATS)")
      perm <- c(MARGIN, (1:length(dims))[-MARGIN])
      FUN(x, aperm(array(STATS, dims[perm]), order(perm)), ...)
  }

  sweep2 <- function (x, MARGIN, STATS, FUN = "-", check.margin=TRUE, ...)   {

      FUN <- match.fun(FUN)
      dims <- dim(x)
      dimstat <- if (is.null(dim(STATS))) length(STATS) else dim(STATS)
      if (length(MARGIN) < length(dimstat)) {
          STATS <- drop(STATS)
          dimstat <- if (is.null(dim(STATS))) length(STATS) else dim(STATS)
      }
      if (check.margin &&
          (length(dimstat)!=length(MARGIN) || any(dims[MARGIN]!=dimstat))) {
          warning("length(STATS) or dim(STAT) do not match dim(x)[MARGIN]")
      } else if (prod(dims[MARGIN]) %% length(STATS)!=0)
          warning("prod(dim(x)[MARGIN]) is not a multiple of length(STATS)")
      perm <- c(MARGIN, (1:length(dims))[-MARGIN])
      FUN(x, aperm(array(STATS, dims[perm]), order(perm)), ...)
  }

The functions are tested on the following examples.

  a <- array(1:64,dim=c(4,4,4))
  M <- c(1,3);

  b sweep1(a,M,b) sweep2(a,M,b) sweep1(a,M,b,c=F) sweep2(a,M,b,c=F)

  a[,2,]     -             -             -                 - 
  a[1:2,2,]  warning       warning       -                 - 
  1          -             warning       -                 -
  1:3        warning       warning       warning           warning
  1:16       warning       warning       -                 -

  a <- matrix(1:8,nrow=4,ncol=2);
  M <- 1;

  b sweep1(a,M,b) sweep2(a,M,b) sweep1(a,M,b,c=F) sweep2(a,M,b,c=F)

  1          -             warning       -                 -
  1:2        warning       warning       -                 -
  1:3        warning       warning       warning           warning
  1:4        -             -             -                 -
  matrix(1:4,nrow=1) # (A)

- - - -
matrix(1:4,ncol=1) # (B)
- - - -

  a <- matrix(1:8,nrow=4,ncol=2);
  M <- 2;

  b sweep1(a,M,b) sweep2(a,M,b) sweep1(a,M,b,c=F) sweep2(a,M,b,c=F)

  1          -             warning       -                 -
  1:2        -             -             -                 -
  1:3        warning       warning       warning           warning
  1:4        warning       warning       warning           warning
  matrix(1:2,ncol=1) # (A)

- - - -
matrix(1:2,nrow=1) # (B)
- - - -

Note that cases marked (A) do not generate a warning, although they should. This is the cost of using drop(STATS), which allows cases marked (B) without a warning. In my opinion, cases (B) make sense.

Reproducing the tests is possible using the script   http://www.cs.cas.cz/~savicky/R-devel/verify_sweep.R

Petr.



R-devel_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel Received on Fri 13 Jul 2007 - 20:52:04 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Wed 25 Jul 2007 - 08:36:54 GMT.

Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-devel. Please read the posting guide before posting to the list.