Re: [R] data-management: Rowwise NA

From: Marc Schwartz <marc_schwartz_at_me.com>
Date: Thu, 03 Jun 2010 14:45:58 -0500

On Jun 3, 2010, at 2:20 PM, moleps wrote:

> Dear R'ers..
>
> In this mock dataset how can I generate a logical variable based on whether either test or test3 is NA in each row?
>
> test<-sample(c("A",NA,"B"),100,replace=T)
> test2<-sample(c("A",NA,"B"),100,replace=T)
> test3<-sample(c("A",NA,"B"),100,replace=T)
>
> tes<-cbind(test,test2,test3)
>
> sam<-c("test","test3")
> apply(subset(tes,select=sam),1,FUN=function(x) is.na(x))
>
> However, this just tests whether each variable is missing or not per row. I'd like an -or- function in here that would provide one true/false per row based on whether test or test3 are NA. I guess it would be easy to do it by subsetting in the example, but I figure there is a more elegant way of doing it when -sam- contains 50 variables...

How about this:

set.seed(1)
test <- sample(c("A", NA, "B"), 100, replace = TRUE)
test2 <- sample(c("A", NA, "B"), 100, replace = TRUE)
test3 <- sample(c("A", NA, "B"), 100, replace = TRUE)

tes <- cbind(test, test2, test3)

> str(tes)
 chr [1:100, 1:3] "A" NA NA "B" "A" "B" "B" NA NA ...

> head(tes)
     test test2 test3
[1,] "A"  NA    "A"  
[2,] NA   NA    "A"  
[3,] NA   "A"   NA   
[4,] "B"  "B"   "A"  
[5,] "A"  NA    "A"  
[6,] "B"  "A"   NA   


sam <- c("test","test3")

> rowSums(is.na(subset(tes, select = sam))) > 0

  [1] FALSE  TRUE  TRUE FALSE FALSE  TRUE  TRUE  TRUE  TRUE FALSE FALSE
 [12] FALSE FALSE  TRUE FALSE  TRUE  TRUE FALSE  TRUE  TRUE FALSE FALSE
 [23]  TRUE  TRUE FALSE  TRUE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
 [34]  TRUE  TRUE FALSE FALSE  TRUE  TRUE  TRUE FALSE  TRUE FALSE  TRUE
 [45]  TRUE FALSE  TRUE  TRUE  TRUE FALSE  TRUE FALSE  TRUE  TRUE  TRUE
 [56] FALSE FALSE  TRUE  TRUE  TRUE  TRUE FALSE  TRUE FALSE  TRUE FALSE
 [67]  TRUE FALSE  TRUE FALSE  TRUE FALSE  TRUE  TRUE  TRUE FALSE FALSE
 [78]  TRUE  TRUE FALSE  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE
 [89] FALSE FALSE FALSE FALSE  TRUE FALSE  TRUE FALSE  TRUE  TRUE FALSE
[100]  TRUE


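Here is.na() returns a logical matrix, rowSums() counts the NA values in each row of the selected columns, and the comparison with 0 flags every row in which at least one of them is NA. This avoids the looping involved in calling apply().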
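
For comparison, here is a minimal sketch that puts the rowSums() approach next to a row-by-row apply()/any() version and checks that they agree. The names any_na_vec, any_na_loop and all_na_vec are made up for illustration, and the "all selected columns are NA" variant is an extra that was not asked for. Matrix indexing with drop = FALSE is used here in place of subset(); it should select the same columns. Note that sample() changed its default method in R 3.6.0, so the random draws may differ from the output shown above.

```r
set.seed(1)
test  <- sample(c("A", NA, "B"), 100, replace = TRUE)
test2 <- sample(c("A", NA, "B"), 100, replace = TRUE)
test3 <- sample(c("A", NA, "B"), 100, replace = TRUE)

tes <- cbind(test, test2, test3)
sam <- c("test", "test3")

# Keep only the columns of interest; drop = FALSE keeps a matrix
# even if sam names a single column
sel <- tes[, sam, drop = FALSE]

# Vectorised: count NAs per row, then flag rows with at least one
any_na_vec <- rowSums(is.na(sel)) > 0

# Row-by-row equivalent: any() collapses each row to one TRUE/FALSE
any_na_loop <- apply(sel, 1, function(x) any(is.na(x)))

# The two approaches should give the same answer
stopifnot(all(any_na_vec == any_na_loop))

# Illustrative variant: TRUE only when every selected column is NA
all_na_vec <- rowSums(is.na(sel)) == length(sam)
```

Since sam is only used to pick columns, the same lines should work unchanged when it holds 50 names rather than two.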

HTH,

Marc Schwartz



R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Received on Thu 03 Jun 2010 - 19:48:38 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Thu 03 Jun 2010 - 20:10:27 GMT.
