From: Petr Savicky <savicky_at_cs.cas.cz>

Date: Wed, 25 Jul 2007 09:03:21 +0200

are accepted without warning. On the other hand, if k1 != k2, then, for example, dim(STATS)= k2, dim(STATS) = c(k2,k1) generate a warning, although the simple divisibility condition

R-devel_at_r-project.org mailing list

https://stat.ethz.ch/mailman/listinfo/r-devel Received on Wed 25 Jul 2007 - 07:09:37 GMT

Date: Wed, 25 Jul 2007 09:03:21 +0200

I would like to suggest a patch against R-devel-2007-07-24, which modifies function sweep by including a warning, if dim(STATS) is not consistent with dim(x)[MARGIN]. If check.margin=FALSE, the simple test whether prod(dim(x)[MARGIN]) is a multiple of length(STATS) is performed. If check.margin=TRUE, then a more restrictive test is used, but a limited recycling is still allowed without warning. Besides generating a warning in some situations, there is no other change in the behavior of sweep. The patch is:

- R-devel_2007-07-24/src/library/base/R/sweep.R 2007-01-04 17:51:53.000000000 +0100

+++ R-devel_2007-07-24-sweep/src/library/base/R/sweep.R 2007-07-24 10:56:18.000000000 +0200

@@ -1,7 +1,21 @@ -sweep <- function(x, MARGIN, STATS, FUN = "-", ...)

+sweep <- function(x, MARGIN, STATS, FUN = "-", check.margin=FALSE, ...)

{ FUN <- match.fun(FUN) dims <- dim(x)

*+ if (check.margin) {**+ dimmargin <- dims[MARGIN]**+ dimmargin <- dimmargin[dimmargin > 1]*

+ dimstats <- if (is.null(dim(STATS))) length(STATS) else dim(STATS)

*+ dimstats <- dimstats[dimstats > 1]**+ if (length(dimstats) > length(dimmargin) ||**+ any(dimstats != dimmargin[seq(along.with=dimstats)]))**+ warning("length(STATS) or dim(STATS) do not match dim(x)[MARGIN]")**+ } else if (prod(dims[MARGIN]) %% length(STATS) != 0) {**+ if (length(MARGIN) == 1)**+ warning("dim(x)[MARGIN] is not a multiple of length(STATS)")**+ else**+ warning("prod(dim(x)[MARGIN]) is not a multiple of length(STATS)")**+ }*

perm <- c(MARGIN, (1:length(dims))[ - MARGIN]) FUN(x, aperm(array(STATS, dims[perm]), order(perm)), ...) }

The patch uses the default check.margin=FALSE, since this is more backward compatible. Changing the default to check.margin=TRUE would also be fine with me and also with Ben Bolker, who told me this in a separate email.

Let me include more comments on the stricter test. If check.margin=TRUE,
then the patch tests whether (after deleting possible dimensions with
only one level) dim(STATS) is a prefix of dim(x)[MARGIN]. Hence, for example,
if dim(x)[MARGIN] = c(k1,k2), the cases

length(STATS) = 1,

dim(STATS) = k1, dim(STATS) = NULL and length(STATS) = k1, dim(STATS) = c(k1,k2)

are accepted without warning. On the other hand, if k1 != k2, then, for example, dim(STATS)= k2, dim(STATS) = c(k2,k1) generate a warning, although the simple divisibility condition

length(STATS) divides prod(dim(x)[MARGIN]) is satisfied. The warning is generated, since in the last two cases, recycling produces incorrect or at least suspicious result.

In the simplest case, when length(MARGIN)=1 and STATS is a vector, the cases accepted by the stricter test without warning are exactly the following two: length(STATS) = 1, length(STATS) = dim(x)[MARGIN].

I tested the patch using the script

http://www.cs.cas.cz/~savicky/R-devel/verify_sweep1.R
Ben Bolker also tested the patch in his environment.

I appreciate to know the opinion of R core developers on this patch. Thank you in advance.

Petr.

R-devel_at_r-project.org mailing list

https://stat.ethz.ch/mailman/listinfo/r-devel Received on Wed 25 Jul 2007 - 07:09:37 GMT

Archive maintained by Robert King, hosted by
the discipline of
statistics at the
University of Newcastle,
Australia.

Archive generated by hypermail 2.2.0, at Fri 27 Jul 2007 - 07:38:16 GMT.

*
Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-devel.
Please read the posting
guide before posting to the list.
*