[Rd] question about differences in behavior with NA subscripts in matrix vs. data.frame

From: Mark Kimpel <mwkimpel_at_gmail.com>
Date: Thu, 03 Dec 2009 15:33:23 -0500


I ran into a problem today when using a conditional for sub-setting a data.frame and tracked it down to a difference in behavior between the treatment of NA when sub-setting matrices and data.frames. A self-contained example is below followed by sessionInfo(). I'm not questioning the documentation of the behavior, but the rationale for its existence.

Could someone explain to me why the difference is logical and useful? This seems more of a devel than a help issue; my apologies if I've posted to the incorrect list.

Mark
#

a.vec <- c("A", "", "B", "DEF", NA, "", NA, "Q")
a.vec[a.vec == ""] <- NA
a.vec

## [1] "A"   NA    "B"   "DEF" NA    NA    NA    "Q"

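For reference, a minimal sketch of what the logical index in these assignments looks like. The names x and idx are hypothetical and not part of the original example; the point is only that the comparison returns NA wherever the data are NA, so the subscript itself contains missing values.

```r
# Fresh copy of the vector used above (x is a hypothetical name)
x <- c("A", "", "B", "DEF", NA, "", NA, "Q")

# Comparing NA with "" is neither TRUE nor FALSE, so the result is NA
idx <- x == ""
print(idx)

# Positions where the index itself is missing
print(which(is.na(idx)))
```

The same kind of NA-containing index is what gets passed as the row subscript in the matrix and data.frame assignments that follow.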
a.mat <- matrix(rep(c("A", "", "B", "DEF", NA, "", NA, "Q"), 5), nrow = 5, ncol = 8)
a.mat[a.mat[,3] == "", 3] <- NA
a.mat
##      [,1]  [,2] [,3]  [,4]  [,5] [,6]  [,7] [,8]
## [1,] "A"   ""   "B"   "Q"   NA   ""    NA   "DEF"
## [2,] ""    NA   "DEF" "A"   ""   "B"   "Q"  NA
## [3,] "B"   "Q"  NA    ""    NA   "DEF" "A"  ""
## [4,] "DEF" "A"  NA    "B"   "Q"  NA    ""   NA
## [5,] NA    ""   NA    "DEF" "A"  ""    "B"  "Q"

a.df <- data.frame(matrix(rep(c("A", "", "B", "DEF", NA, "", NA, "Q"), 5), nrow = 5, ncol = 8))
a.df[a.df[,3] == "", 3] <- NA
a.df
## Error in `[<-.data.frame`(`*tmp*`, a.df[, 3] == "", 3, value = NA) :
## missing values are not allowed in subscripted assignments of data frames

## Enter a frame number, or 0 to exit

## 1: `[<-`(`*tmp*`, a.df[, 3] == "", 3, value = NA)
## 2: `[<-.data.frame`(`*tmp*`, a.df[, 3] == "", 3, value = NA)
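
A minimal sketch of one way to sidestep the error, assuming the intent is to change only the rows where the comparison is actually TRUE. It does not address the rationale for the difference; it only removes the NAs from the subscript before assigning. The name idx is hypothetical.

```r
# Same data frame as in the example above
a.df <- data.frame(matrix(rep(c("A", "", "B", "DEF", NA, "", NA, "Q"), 5),
                          nrow = 5, ncol = 8))

# The comparison yields NA wherever column 3 is already NA
idx <- a.df[, 3] == ""

# which() returns only the positions that are TRUE and drops the NAs,
# so the data.frame assignment method no longer sees a missing subscript
a.df[which(idx), 3] <- NA

# An equivalent NA-free logical index would be:
# a.df[!is.na(idx) & idx, 3] <- NA
```

Either form should also work unchanged on the matrix and vector cases, which makes the three assignments behave consistently.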
## remove plain text non-codes from codes.df
sessionInfo()
## R version 2.10.0 Patched (2009-10-27 r50222)
## x86_64-unknown-linux-gnu

## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
## [5] LC_MONETARY=C LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

## attached base packages:
## [1] stats graphics grDevices datasets utils methods base

## loaded via a namespace (and not attached):
## [1] tools_2.10.0
Mark W. Kimpel MD ** Neuroinformatics ** Dept. of Psychiatry Indiana University School of Medicine

15032 Hunter Court, Westfield, IN 46074

(317) 490-5129 Work, & Mobile & VoiceMail (317) 399-1219 Skype No Voicemail please




R-devel_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel Received on Thu 03 Dec 2009 - 20:36:08 GMT
