Re: [R] (no subject)

From: jim holtman <jholtman_at_gmail.com>
Date: Sat 29 Jul 2006 - 22:34:32 EST

Is this what you want?

> set.seed(1)
> x <- matrix(sample(c(1, NA), 100, TRUE), nrow=10) # create some data
> x

      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
 [1,]    1    1   NA    1   NA    1   NA    1    1     1
 [2,]    1    1    1   NA   NA   NA    1   NA   NA     1
 [3,]   NA   NA   NA    1   NA    1    1    1    1    NA
 [4,]   NA    1    1    1   NA    1    1    1    1    NA
 [5,]    1   NA    1   NA   NA    1   NA    1   NA    NA
 [6,]   NA    1    1   NA   NA    1    1   NA    1    NA
 [7,]   NA   NA    1   NA    1    1    1   NA   NA     1
 [8,]   NA   NA    1    1    1   NA   NA    1    1     1
 [9,]   NA    1   NA   NA   NA   NA    1   NA    1    NA
[10,]    1   NA    1    1   NA    1   NA   NA    1    NA
> # count number of NAs per row
> numNAs <- apply(x, 1, function(z) sum(is.na(z)))
> numNAs

 [1] 3 5 5 3 6 5 5 4 7 5
> # remove rows with more than 5 NAs
> x[!(numNAs > 5),]

     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,]    1    1   NA    1   NA    1   NA    1    1     1
[2,]    1    1    1   NA   NA   NA    1   NA   NA     1
[3,]   NA   NA   NA    1   NA    1    1    1    1    NA
[4,]   NA    1    1    1   NA    1    1    1    1    NA
[5,]   NA    1    1   NA   NA    1    1   NA    1    NA
[6,]   NA   NA    1   NA    1    1    1   NA   NA     1
[7,]   NA   NA    1    1    1   NA   NA    1    1     1
[8,]    1   NA    1    1   NA    1   NA   NA    1    NA
>
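
A possible variation on the steps above, offered as a sketch rather than the one right way: rowSums(is.na(x)) gives the same per-row NA counts as the apply() call without an explicit function, and drop = FALSE keeps the result a matrix even if only one row survives. The names max_na, num_na, keep and x_filtered are made up for illustration. Note that the values sample() returns for a given seed can differ between R versions, so your matrix may not match the printout above.

```r
set.seed(1)
x <- matrix(sample(c(1, NA), 100, TRUE), nrow = 10)  # same toy data as above

max_na <- 5                    # hypothetical threshold; adjust as needed
num_na <- rowSums(is.na(x))    # number of NAs in each row
keep   <- num_na <= max_na     # TRUE for rows with at most max_na NAs

# drop = FALSE keeps a matrix even when a single row is left
x_filtered <- x[keep, , drop = FALSE]
```

For a 9707 x 60 matrix either approach should be quick; rowSums() is simply the more compact spelling.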

On 7/28/06, John Morrow <john@emiliem.com> wrote:
>
> Dear R-Helpers,
>
> I have a large data matrix (9707 rows, 60 columns), which contains missing
> data. The matrix looks something like this:
>
> 1) X X X X X X NA X X X X X X X X X
>
> 2) NA NA NA NA X NA NA NA X NA NA
>
> 3) NA NA X NA NA NA NA NA NA NA
>
> 5) NA X NA X X X NA X X X X NA X
>
> ..
>
> 9708) X NA NA X NA NA X X NA NA X
>
> ...and so on. Notice that every row has a varying number of entries, all
> rows have at least one entry, but some rows have too much missing data. My
> goal is to filter out/remove rows that have ~5 (this number is yet to be
> determined, but let's say it's 5) missing entries before I run Pearson's to
> tell me the correlation between all of the rows. The order of the columns
> does not matter here.
> I think that I might need to test each row for a "data, at least one NA,
> data" pattern?
>
> Is there some kind of way of doing this? I am at a loss for an easy way to
> accomplish this. Any suggestions are most appreciated!
>
> John Morrow
>
>
>
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
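
For the Pearson step mentioned in the question, one way it might look once the rows are filtered is sketched below. cor() correlates columns, so the matrix is transposed to correlate rows, and use = "pairwise.complete.obs" computes each pair from the columns where both rows have data. The data here are random numbers invented purely for illustration (the 1/NA toy matrix above has no variance to correlate), and the names xf and row_cor are hypothetical.

```r
set.seed(1)
x <- matrix(rnorm(100), nrow = 10)   # made-up numeric data, 10 rows x 10 columns
x[sample(length(x), 30)] <- NA       # knock out 30 cells at random

# keep rows with at most 5 missing values
xf <- x[rowSums(is.na(x)) <= 5, , drop = FALSE]

# cor() works column-wise, so transpose to get row-by-row correlations
row_cor <- cor(t(xf), use = "pairwise.complete.obs", method = "pearson")
dim(row_cor)   # one row and one column per retained row of xf
```

With pairwise deletion each coefficient may rest on a different set of columns, so pairs of rows with little overlap deserve some caution.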

-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?

Received on Sat Jul 29 22:38:40 2006

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.1.8, at Sun 30 Jul 2006 - 02:16:58 EST.
