[R] Systematic treatment of missing values

From: David Soloveichik <dsolov_at_caltech.edu>
Date: Sun 28 May 2006 - 16:19:02 EST


I am wondering whether there is a well-accepted approach to handling missing values (NAs) in a programming language such as R. Most functions seem to propagate NA to the output whenever the value of the missing entry could have affected the result. In other words, most functions are not willing to "take a stand" on what the missing value was. Some functions, however, do not behave this way. For example,

> c(1,2,3,NA) %in% c(2,3)
[1] FALSE TRUE TRUE FALSE

rather than: FALSE TRUE TRUE NA
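
The contrast described above can be reproduced with a few base R calls. This is a minimal sketch, assuming a standard R session; the variable names are chosen only for illustration.

```r
x <- c(1, 2, 3, NA)

# Arithmetic and comparison propagate NA, because the result
# depends on what the missing value actually was.
plus_one <- x + 1    # 2 3 4 NA
is_two   <- x == 2   # FALSE TRUE FALSE NA

# Logical operators give a definite answer when the NA cannot matter.
and_res <- NA & FALSE  # FALSE
or_res  <- NA | TRUE   # TRUE

# %in% never returns NA: the missing entry is reported as "not found".
in_res <- x %in% c(2, 3)  # FALSE TRUE TRUE FALSE
```
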

Also, what is the logic of the following:

> c(1,2,3,NA) %in% c(2,3,NA)
[1] FALSE TRUE TRUE TRUE

Why is the last output value TRUE? Why does R claim that the NA on the left hand side of %in% is the same as the NA on the right hand side of %in%?
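
One way to look at this behaviour: the help page for match() describes %in% as equivalent to match(x, table, nomatch = 0) > 0, and states that NA matches NA. The sketch below shows that equivalence and then a variant that returns NA for missing entries on the left hand side. The helper in_na() is hypothetical, not part of R, and it deliberately ignores NAs in the table.

```r
x <- c(1, 2, 3, NA)
tbl <- c(2, 3, NA)

# match() gives the position of each element of x in tbl (NA if absent).
# The NA in x is found at position 3 because match() treats NA as matching NA.
pos <- match(x, tbl)                   # NA 1 2 3

# Equivalent of x %in% tbl, as described in ?match.
found <- match(x, tbl, nomatch = 0L) > 0L  # FALSE TRUE TRUE TRUE

# Hypothetical helper (an assumption, not an R function): like %in%,
# but a missing entry in x yields NA instead of TRUE or FALSE.
in_na <- function(x, table) {
  res <- x %in% table
  res[is.na(x)] <- NA
  res
}

strict <- in_na(x, c(2, 3))            # FALSE TRUE TRUE NA
```

A fully three-valued version would also have to decide what an NA in the table means for elements that are not found; the sketch leaves that question open.
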

Thanks a lot,
David



R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Received on Sun May 28 16:23:47 2006

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.1.8, at Sun 28 May 2006 - 20:10:22 EST.
