[Rd] Inconsistencies in subassignment with NA index. (PR#7210)

From: <ripley_at_stats.ox.ac.uk>
Date: Sat 04 Sep 2004 - 02:34:46 EST


Apart from the inconsistencies, there are two clear bugs here:

  1. miscalculating the number of values needed, in the matrix case. E.g.

> AA[idx, 1] <- B[1:4]

Error in "[<-"(`*tmp*`, idx, 1, value = B[1:4]) :
        number of items to replace is not a multiple of replacement length

although only 4 values are replaced by AA[idx, 1] <- B.

  2. the behaviour of the 3D case.

[I will copy a version of this to R-bugs. When you reply, please copy to R-bugs only a version with a PR number in the subject.]

On Fri, 3 Sep 2004, Yao, Minghua wrote:

> I found a difference between the indexing of an array and that of a
> matrix when there are NA's in the index array. The screen copy is as
> follows.
>
> > A <- array(NA, dim=6)
> > A
> [1] NA NA NA NA NA NA
>
> > idx <- c(1,NA,NA,4,5,6)
> > B <- c(10,20,30,40,50,60)
> > A[idx] <- B
> > A
> [1] 10 NA NA 40 50 60
> > AA <- matrix(NA,6,1)
> > AA
>      [,1]
> [1,]   NA
> [2,]   NA
> [3,]   NA
> [4,]   NA
> [5,]   NA
> [6,]   NA
> > AA[idx,1] <- B
> > AA
>      [,1]
> [1,]   10
> [2,]   NA
> [3,]   NA
> [4,]   20
> [5,]   30
> [6,]   40
> >
> In the case of an array, we miss the elements (20 and 30) in B
> corresponding to the NA's in the index array. In the case of a matrix,
> 20 and 30 are assigned to the elements indexed by the indexes following
> the NA's. Is this reasonable behavior? Thanks in advance for any
> explanation.

A is a 1D array but it behaves just like a vector. Weirder things happen with multi-dimensional arrays:

> A <- array(NA, dim=c(6,1,1))
> A[idx,1,1] <- B
> A

, , 1

     [,1]
[1,]   10
[2,]   NA
[3,]   NA
[4,]   NA
[5,]   NA
[6,]   NA

One problem with what happens for matrices is that

> idx <- c(1,4,5,6)
> AA <- matrix(NA,6,1)
> AA[idx,1] <- B

Error in "[<-"(`*tmp*`, idx, 1, value = B) :
        number of items to replace is not a multiple of replacement length

is an error, so it is not counting the values consistently.

The only discussion I could find (Blue Book p.103, which is also discussing LHS subscripting) just says

        If a subscript is NA, an NA is returned.

S normally does not use up values when encountering an NA in an index set, although it does for logical matrix indexing of data frames.

I can see two possible interpretations.

  1. The NA indicates the value was lost after assignment. We don't know what index the first NA was, so 20 got assigned somewhere. And as we don't know where, all the elements had better be NA. However, that is unless the NA was 0, in which case no assignment took place and no value was used.
  2. The NA indicates the value was lost before assignment, so no assignment took place and no value was used.
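
As a rough illustration only, the two interpretations could be sketched for the plain vector case as below. The helper names `assign_unknown` and `assign_skip_na` are hypothetical and are not part of R; the sketch assumes `idx` and `value` have the same length, and the first helper ignores the "unless the NA was 0" caveat by simply treating the whole result as unknown.

```r
# Hypothetical helpers (not part of R) sketching the two interpretations
# for a plain vector x; idx and value are assumed to have equal length.

# Interpretation 1: the value was lost AFTER assignment. An NA index could
# have hit any element, so every element of the result is set to NA.
# (The "NA might have been 0" caveat is deliberately ignored here.)
assign_unknown <- function(x, idx, value) {
  stopifnot(length(idx) == length(value))
  if (any(is.na(idx))) {
    x[] <- NA
    return(x)
  }
  x[idx] <- value
  x
}

# Interpretation 2: the value was lost BEFORE assignment. Each NA index is
# dropped together with the value in the same position.
assign_skip_na <- function(x, idx, value) {
  stopifnot(length(idx) == length(value))
  keep <- !is.na(idx)
  x[idx[keep]] <- value[keep]
  x
}

idx <- c(1, NA, NA, 4, 5, 6)
B <- c(10, 20, 30, 40, 50, 60)
A <- rep(NA_real_, 6)

res1 <- assign_unknown(A, idx, B)   # NA NA NA NA NA NA
res2 <- assign_skip_na(A, idx, B)   # 10 NA NA 40 50 60
```

Under these assumptions the second helper reproduces what the vector case in the report already does, while the first gives the more conservative all-NA result.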

R does neither of those. I suspect the correct course of action is to ban NAs in subscripted assignments.

-- 
Brian D. Ripley,                  ripley@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

______________________________________________
R-devel@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Received on Sat Sep 04 03:18:15 2004
