From: Douglas Bates <bates_at_stat.wisc.edu>

Date: Sun 30 Jul 2006 - 01:09:14 EST

R-help@stat.math.ethz.ch mailing list

https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Received on Sun Jul 30 01:17:40 2006


On 7/29/06, jim holtman <jholtman@gmail.com> wrote:

> Is this what you want?

>
> > set.seed(1)
> > x <- matrix(sample(c(1, NA), 100, TRUE), nrow=10)  # create some data
> > x
>       [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
>  [1,]    1    1   NA    1   NA    1   NA    1    1     1
>  [2,]    1    1    1   NA   NA   NA    1   NA   NA     1
>  [3,]   NA   NA   NA    1   NA    1    1    1    1    NA
>  [4,]   NA    1    1    1   NA    1    1    1    1    NA
>  [5,]    1   NA    1   NA   NA    1   NA    1   NA    NA
>  [6,]   NA    1    1   NA   NA    1    1   NA    1    NA
>  [7,]   NA   NA    1   NA    1    1    1   NA   NA     1
>  [8,]   NA   NA    1    1    1   NA   NA    1    1     1
>  [9,]   NA    1   NA   NA   NA   NA    1   NA    1    NA
> [10,]    1   NA    1    1   NA    1   NA   NA    1    NA
> > # count number of NAs per row
> > numNAs <- apply(x, 1, function(z) sum(is.na(z)))

It's a minor point, but on a large matrix it would be better to use

numNAs <- rowSums(is.na(x))

which avoids calling an R function once for every row.
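As a rough sketch of the comparison, the code below counts the NAs per row both ways and checks that the counts agree. The matrix x follows the quoted example; the matrix big and its random contents are made up here purely to match the dimensions mentioned in the original question, and the timings are left for the reader to run since they depend on the machine.

```r
# Sketch only: compare apply() and rowSums() for counting NAs per row.
set.seed(1)
x <- matrix(sample(c(1, NA), 100, TRUE), nrow = 10)

na_apply   <- apply(x, 1, function(z) sum(is.na(z)))  # one R function call per row
na_rowsums <- rowSums(is.na(x))                       # a single vectorised call

# Both approaches give the same counts
stopifnot(all(na_apply == na_rowsums))

# Assumed test data with the dimensions from the original question (9707 x 60);
# the values themselves are arbitrary.
big <- matrix(sample(c(1, NA), 9707 * 60, TRUE), nrow = 9707)
system.time(apply(big, 1, function(z) sum(is.na(z))))
system.time(rowSums(is.na(big)))
```

Note that apply() returns an integer vector here while rowSums() returns a double vector, so the check uses == rather than identical().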

> > numNAs

> [1] 3 5 5 3 6 5 5 4 7 5
> > # remove rows with more than 5 NAs
> > x[!(numNAs > 5),]
>      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
> [1,]    1    1   NA    1   NA    1   NA    1    1     1
> [2,]    1    1    1   NA   NA   NA    1   NA   NA     1
> [3,]   NA   NA   NA    1   NA    1    1    1    1    NA
> [4,]   NA    1    1    1   NA    1    1    1    1    NA
> [5,]   NA    1    1   NA   NA    1    1   NA    1    NA
> [6,]   NA   NA    1   NA    1    1    1   NA   NA     1
> [7,]   NA   NA    1    1    1   NA   NA    1    1     1
> [8,]    1   NA    1    1   NA    1   NA   NA    1    NA
> >
>
> On 7/28/06, John Morrow <john@emiliem.com> wrote:
> >
> > Dear R-Helpers,
> >
> > I have a large data matrix (9707 rows, 60 columns), which contains missing
> > data. The matrix looks something like this:
> >
> > 1) X X X X X X NA X X X X X X X X X
> > 2) NA NA NA NA X NA NA NA X NA NA
> > 3) NA NA X NA NA NA NA NA NA NA
> > 5) NA X NA X X X NA X X X X NA X
> > ..
> > 9708) X NA NA X NA NA X X NA NA X
> >
> > ...and so on. Notice that every row has a varying number of entries; all
> > rows have at least one entry, but some rows have too much missing data. My
> > goal is to filter out/remove rows that have ~5 (this number is yet to be
> > determined, but let's say it's 5) missing entries before I run Pearson's to
> > tell me the correlation between all of the rows. The order of the columns
> > does not matter here.
> > I think that I might need to test each row for a "data, at least one NA,
> > data" pattern?
> >
> > Is there some kind of way of doing this? I am at a loss for an easy way to
> > accomplish this. Any suggestions are most appreciated!
> >
> > John Morrow
>
> --
> Jim Holtman
> Cincinnati, OH
> +1 513 646 9390
>
> What is the problem you are trying to solve?
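Putting the pieces of this thread together, the following is a minimal sketch of the whole task from the original question: drop rows with too many NAs, then correlate the remaining rows. The helper name drop_sparse_rows, its max_na argument, and the simulated data are assumptions for illustration and do not come from the thread. The choice of use = "pairwise.complete.obs" is also an assumption about how the remaining NAs should be handled.

```r
# Hypothetical helper (not from the thread): keep rows with at most max_na NAs.
drop_sparse_rows <- function(m, max_na = 5) {
  m[rowSums(is.na(m)) <= max_na, , drop = FALSE]  # drop = FALSE keeps a matrix
}

# Simulated data standing in for the real matrix
set.seed(1)
x <- matrix(rnorm(100), nrow = 10)
x[sample(length(x), 40)] <- NA   # set 40 entries to NA at random

kept <- drop_sparse_rows(x, max_na = 5)

# cor() correlates columns, so transpose to correlate rows.
# "pairwise.complete.obs" uses, for each pair of rows, only the columns
# in which both rows are observed; pairs with too few shared columns give NA.
r <- cor(t(kept), use = "pairwise.complete.obs")
```

The threshold is a parameter because the original poster had not yet settled on a value; with pairwise deletion each correlation may be based on a different number of columns, which is worth keeping in mind when comparing them.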


Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.

Archive generated by hypermail 2.1.8, at Sun 30 Jul 2006 - 02:17:00 EST.
