Re: [R] NA and logical indexes

From: Ted Harding <Ted.Harding_at_manchester.ac.uk>
Date: Fri, 28 Nov 2008 22:01:15 +0000 (GMT)


On 28-Nov-08 21:25:36, Sebastian P. Luque wrote:
> Hi,
> I vaguely remember this issue being discussed at some length in the
> past, but am having trouble relocating the proper thread (defining an
> adequate search string to do so):
>
> ---<---------------cut here---------------start-------------->---
> R> foo <- data.frame(A=gl(2, 5, labels=letters[1:2]), X=runif(10))
> R> foo$A[1] <- NA
> R> foo$A == "b"
> [1] NA FALSE FALSE FALSE FALSE TRUE TRUE TRUE TRUE TRUE
> R> foo$A[foo$A == "b"]
> [1] <NA> b b b b b
> Levels: a b
> R> foo$X[foo$A == "b"]
> [1] NA 0.4425 0.7164 0.3171 0.1967 0.8300
> R> foo[foo$A == "b", ]
>       A      X
> NA <NA>     NA
> 6     b 0.4425
> 7     b 0.7164
> 8     b 0.3171
> 9     b 0.1967
> 10    b 0.8300
> ---<---------------cut here---------------end---------------->---
>
> Why is foo$X[1] set to NA in that last call?
>
> Cheers,
> Seb

It is not! In my repetition (which has different runifs):

  foo[foo$A == "b", ]

#       A         X
# NA <NA>        NA
# 6     b 0.2300618
# 7     b 0.5109791
# 8     b 0.7947862
# 9     b 0.3400228
# 10    b 0.5464989
  foo
#       A         X
# 1  <NA> 0.5013591
# 2     a 0.4475963
# 3     a 0.2600449
# 4     a 0.9240698
# 5     a 0.4205284
# 6     b 0.2300618
# 7     b 0.5109791
# 8     b 0.7947862
# 9     b 0.3400228
# 10    b 0.5464989

NA can seem to have a bewildering logic, but it all becomes clear if you interpret NA as "value unknown".
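A small sketch of that reading, using only base R (the variable names are made up for illustration): a comparison involving an unknown value is itself unknown, unless the answer would be the same whatever the unknown value turned out to be.

```r
# A comparison with an unknown value has an unknown result.
cmp <- NA == "b"            # NA

# Logical operators return a definite answer only when the unknown
# operand cannot change the outcome.
and_false <- NA & FALSE     # FALSE: FALSE whatever NA stands for
or_true   <- NA | TRUE      # TRUE:  TRUE whatever NA stands for
and_true  <- NA & TRUE      # NA:    depends on the unknown value

# NA == NA is also NA, so use is.na() to test for "unknown".
is_unknown <- is.na(cmp)    # TRUE
```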

You asked for foo[foo$A == "b", ]. What happens is that when the test foo$A == "b" encounters foo$A[1] it sees NA, so it does not know what the value is. Hence it does not know whether this row of foo satisfies the test. Hence the entire row is of unknown status. Hence a row is output all of whose elements (including the row label, i.e. the row number) are flagged "unknown", i.e. NA.
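The same steps can be retraced with fixed numbers in place of runif(), so that the result is reproducible. The X values here are arbitrary stand-ins, not the ones from either session above.

```r
# Same structure as the original example, with arbitrary fixed X values.
foo <- data.frame(A = gl(2, 5, labels = letters[1:2]),
                  X = c(0.50, 0.45, 0.26, 0.92, 0.42,
                        0.23, 0.51, 0.79, 0.34, 0.55))
foo$A[1] <- NA

idx <- foo$A == "b"   # NA FALSE FALSE FALSE FALSE TRUE TRUE TRUE TRUE TRUE
res <- foo[idx, ]     # an NA in the index yields one all-NA row

nrow(res)             # 6: one "unknown" row plus the five "b" rows
rownames(res)[1]      # "NA": even the row label is unknown
res$X[1]              # NA
foo$X[1]              # 0.5: the original data frame is untouched
```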

After all, if it gave the value of foo$X[1] = 0.5013591, and you subsequently accessed foo[foo$A == "b",][1,2] and got 0.5013591, you would presumably proceed as though this was a value corresponding to a case where foo$A == "b". But it is not -- since foo$A[1] = NA, you don't know whether that is the case. Hence you don't know the value of foo[foo$A == "b",][1,2].
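If you would rather decide for yourself what happens to the rows of unknown status, one possible approach (a sketch, not something from the original question) is to say so explicitly in the index. The X values below are again arbitrary stand-ins.

```r
# Rebuild the example with arbitrary fixed X values.
foo <- data.frame(A = gl(2, 5, labels = letters[1:2]),
                  X = c(0.50, 0.45, 0.26, 0.92, 0.42,
                        0.23, 0.51, 0.79, 0.34, 0.55))
foo$A[1] <- NA

# which() keeps only the positions known to be TRUE, dropping the NA.
known_b <- foo[which(foo$A == "b"), ]

# Equivalent: require the value to be both known and equal to "b".
known_b2 <- foo[!is.na(foo$A) & foo$A == "b", ]

# subset() likewise treats an NA condition as "do not select".
known_b3 <- subset(foo, A == "b")

# The rows of unknown status can be inspected separately.
unknown <- foo[is.na(foo$A), ]

nrow(known_b)   # 5
nrow(unknown)   # 1
```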

Clear? ( :))
Hoping this helps,
Ted.



E-Mail: (Ted Harding) <Ted.Harding_at_manchester.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 28-Nov-08                                       Time: 22:01:11
------------------------------ XFMail ------------------------------

______________________________________________
R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Received on Fri 28 Nov 2008 - 22:05:01 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Sat 29 Nov 2008 - 00:30:27 GMT.
