Re: [R] sweep() and recycling

From: Robin Hankin <r.hankin_at_noc.soton.ac.uk>
Date: Tue 21 Jun 2005 - 23:47:35 EST

Hi

On Jun 21, 2005, at 02:33 pm, Heather Turner wrote:

> I think the warning condition in Robin's patch is too harsh - the
> following examples seem reasonable to me, but all produce warnings
>
> sweep(array(1:24, dim = c(4,3,2)), 1, 1:2, give.warning = TRUE)
> sweep(array(1:24, dim = c(4,3,2)), 1, 1:12, give.warning = TRUE)
> sweep(array(1:24, dim = c(4,3,2)), 1, 1:24, give.warning = TRUE)
>

The examples above do give warnings (as intended), but I think all three
cases are inimical to the spirit of sweep(): nothing is being "swept" out.

So a warning is appropriate, IMO.
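
For what it's worth, here is a rough sketch of how the two proposed checks treat these calls. The helper names patch_would_warn and alt_would_warn are made up for illustration; the conditions themselves are lifted from my patch and from Heather's alternative (both quoted below), rewritten to return TRUE/FALSE instead of raising a warning. STATS = 1:4 is added as the ordinary one-value-per-row case.

```r
x <- array(1:24, dim = c(4, 3, 2))

# Condition from the give.warning patch, pulled out into a helper
# (hypothetical name, not part of any package).
patch_would_warn <- function(x, MARGIN, STATS) {
  length(STATS) > 1 && any(dim(x)[MARGIN] != dim(as.array(STATS)))
}

# Recycling check from Heather's alternative, returning a logical
# instead of calling warning() (hypothetical name as well).
alt_would_warn <- function(x, MARGIN, STATS) {
  dims <- dim(x)
  perm <- c(MARGIN, seq_along(dims)[-MARGIN])
  s <- length(STATS)
  cumDim <- c(1, cumprod(dims[perm]))
  if (s > max(cumDim)) return(TRUE)
  upper <- min(ifelse(cumDim > s, cumDim, max(cumDim)))
  lower <- max(ifelse(cumDim < s, cumDim, min(cumDim)))
  any(upper %% s != 0, s %% lower != 0)
}

# The three disputed STATS vectors, plus the "proper" length-4 case.
stats_list <- list(1:2, 1:12, 1:24, 1:4)
res <- data.frame(
  len   = lengths(stats_list),
  patch = sapply(stats_list, function(s) patch_would_warn(x, 1, s)),
  alt   = sapply(stats_list, function(s) alt_would_warn(x, 1, s))
)
print(res)

# The case Heather's version is meant to catch: MARGIN = 1:2, STATS = 1:3.
alt_would_warn(x, 1:2, 1:3)
```

Under these assumptions the patch condition flags the three disputed calls and leaves 1:4 alone, while the alternative flags none of the four and only objects to the MARGIN = 1:2, STATS = 1:3 call.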

In any case, one can always suppress (or ignore!) a warning if one knows
what one is doing. YMMV, but if I wanted to do operations like the above,
I would replace

  sweep(array(0, dim = c(4,3,2)), c(1,3), 1:12, "+", give.warning = FALSE)

with

  aperm(array(1:12, c(4,2,3)), c(1,3,2))
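
A quick check that the two expressions agree, as a sketch only. It assumes a reasonably recent R, where sweep() has a check.margin argument; that argument is used here in place of the give.warning argument from the patch, which is not in the released sweep().

```r
a <- array(0, dim = c(4, 3, 2))

# check.margin = FALSE switches off base R's recycling check; it plays the
# role that give.warning = FALSE plays in the patch.
via_sweep <- sweep(a, c(1, 3), 1:12, "+", check.margin = FALSE)

# Fill a 4 x 2 x 3 array with 1:12 (recycled twice), then swap the last
# two dimensions to get back to 4 x 3 x 2.
via_aperm <- aperm(array(1:12, c(4, 2, 3)), c(1, 3, 2))

# Same shape and same values (sweep returns doubles, aperm integers,
# so compare with == rather than identical()).
stopifnot(identical(dim(via_sweep), dim(via_aperm)),
          all(via_sweep == via_aperm))
```

The aperm() form makes the fill order explicit, which is the point: with 12 values and a swept extent of 4 x 2 = 8, it is the recycling that determines the result, not any per-margin summary.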

best wishes

rksh

> I have written an alternative (given below) which does not give
> warnings in the above cases, but does warn in the following case
>
>> sweep(array(1:24, dim = c(4,3,2)), 1:2, 1:3)
> , , 1
>
>      [,1] [,2] [,3]
> [1,]    0    3    6
> [2,]    0    3    9
> [3,]    0    6    9
> [4,]    3    6    9
>
> , , 2
>
>      [,1] [,2] [,3]
> [1,]   12   15   18
> [2,]   12   15   21
> [3,]   12   18   21
> [4,]   15   18   21
>
> Warning message:
> STATS does not recycle exactly across MARGIN
>
> The code could be easily modified to warn in other cases, e.g. when
> length of STATS is a divisor of the corresponding array extent (as in
> the first example above, with length(STATS) = 2).
>
> The code also includes Gabor's suggestion.
>
> Heather
>
> sweep <- function (x, MARGIN, STATS, FUN = "-",
>                    warn = getOption("warn"), ...)
> {
>     FUN <- match.fun(FUN)
>     dims <- dim(x)
>     # put the MARGIN dimensions first, then the remaining ones
>     perm <- c(MARGIN, (1:length(dims))[-MARGIN])
>     if (warn >= 0) {
>         s <- length(STATS)
>         # cumulative extents of the permuted array: 1, d1, d1*d2, ...
>         cumDim <- c(1, cumprod(dims[perm]))
>         if (s > max(cumDim))
>             warning("length of STATS greater than length of array",
>                     call. = FALSE)
>         else {
>             # smallest cumulative extent strictly above s (else the total)
>             upper <- min(ifelse(cumDim > s, cumDim, max(cumDim)))
>             # largest cumulative extent strictly below s (else 1)
>             lower <- max(ifelse(cumDim < s, cumDim, min(cumDim)))
>             if (any(upper %% s != 0, s %% lower != 0))
>                 warning("STATS does not recycle exactly across MARGIN",
>                         call. = FALSE)
>         }
>     }
>     FUN(x, aperm(array(STATS, dims[perm]), order(perm)), ...)
> }
>
>>>> Gabor Grothendieck <ggrothendieck@gmail.com> 06/21/05 01:25pm >>>
>
> Perhaps the signature should be:
>
> sweep(...other args go here..., warn=getOption("warn"))
>
> so that the name and value of the argument are consistent with
> the R warn option.
>
> On 6/21/05, Robin Hankin <r.hankin@noc.soton.ac.uk> wrote:
>>
>> On Jun 20, 2005, at 04:58 pm, Prof Brian Ripley wrote:
>>
>>> The issue here is that the equivalent command array(1:5, c(6,6)) (to
>>> matrix(1:5,6,6)) gives no warning, and sweep uses array().
>>>
>>> I am not sure either should: fractional recycling was normally
>>> allowed in S3 (S4 tightened up a bit).
>>>
>>> Perhaps someone who thinks sweep() should warn could contribute a
>>> tested patch?
>>>
>>
>>
>> OK, modified R code and Rd file below (is this the best way to do
>> this?)
>>
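>> "sweep" <-
>> function (x, MARGIN, STATS, FUN = "-", give.warning = FALSE, ...)
>> {
>>     FUN <- match.fun(FUN)
>>     dims <- dim(x)
>>     # warn when STATS (length > 1) does not match the swept extents;
>>     # && short-circuits, so nothing is compared if give.warning is FALSE
>>     if (give.warning && length(STATS) > 1 &&
>>         any(dims[MARGIN] != dim(as.array(STATS)))) {
>>         warning("array extents do not recycle exactly")
>>     }
>>     # move the MARGIN dimensions to the front, fill, then permute back
>>     perm <- c(MARGIN, (1:length(dims))[-MARGIN])
>>     FUN(x, aperm(array(STATS, dims[perm]), order(perm)), ...)
>> }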
>>
>> \name{sweep}
>> \alias{sweep}
>> \title{Sweep out Array Summaries}
>> \description{
>>   Return an array obtained from an input array by sweeping out a
>>   summary statistic.
>> }
>> \usage{
>> sweep(x, MARGIN, STATS, FUN="-", give.warning = FALSE, \dots)
>> }
>> \arguments{
>>   \item{x}{an array.}
>>   \item{MARGIN}{a vector of indices giving the extents of \code{x}
>>     which correspond to \code{STATS}.}
>>   \item{STATS}{the summary statistic which is to be swept out.}
>>   \item{FUN}{the function to be used to carry out the sweep. In the
>>     case of binary operators such as \code{"/"} etc., the function
>>     name must be quoted.}
>>   \item{give.warning}{logical, with default \code{FALSE} meaning to
>>     give no warning, even if array extents do not match. If
>>     \code{TRUE}, check for the correct dimensions and if a
>>     mismatch is detected, give a suitable warning.}
>>   \item{\dots}{optional arguments to \code{FUN}.}
>> }
>> \value{
>> An array with the same shape as \code{x}, but with the summary
>> statistics swept out.
>> }
>> \note{
>> If \code{STATS} is of length 1, recycling is carried out with no
>> warning irrespective of the value of \code{give.warning}.
>> }
>>
>> \references{
>> Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988)
>> \emph{The New S Language}.
>> Wadsworth \& Brooks/Cole.
>> }
>> \seealso{
>> \code{\link{apply}} on which \code{sweep} used to be based;
>> \code{\link{scale}} for centering and scaling.
>> }
>> \examples{
>> require(stats) # for median
>> med.att <- apply(attitude, 2, median)
>> sweep(data.matrix(attitude), 2, med.att)  # subtract the column medians
>>
>> a <- array(0, c(2, 3, 4))
>> b <- matrix(1:8, nrow = 2, ncol = 4)
>> # no warning: all(dim(a)[c(1,3)] == dim(b))
>> sweep(a, c(1, 3), b, "+", give.warning = TRUE)
>> # warning given: dim(a)[c(1,2)] is c(2,3) but dim(b) is c(2,4)
>> sweep(a, c(1, 2), b, "+", give.warning = TRUE)
>> }
>> \keyword{array}
>> \keyword{iteration}
>>
>>
>>
>>
>> --
>> Robin Hankin
>> Uncertainty Analyst
>> National Oceanography Centre, Southampton
>> European Way, Southampton SO14 3ZH, UK
>> tel 023-8059-7743
>>
>> ______________________________________________
>> R-help@stat.math.ethz.ch mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide!
>> http://www.R-project.org/posting-guide.html
>>
>
>

Received on Tue Jun 21 23:52:02 2005
