[R] (no subject)

From: John Morrow <john_at_emiliem.com>
Date: Sat 29 Jul 2006 - 08:02:38 EST


Dear R-Helpers,

I have a large data matrix (9707 rows, 60 columns), which contains missing data. The matrix looks something like this:

  1. X X X X X X NA X X X X X X X X X
  2. NA NA NA NA X NA NA NA X NA NA
  3. NA NA X NA NA NA NA NA NA NA
  4. NA X NA X X X NA X X X X NA X

...

9708) X NA NA X NA NA X X NA NA X

...and so on. Notice that every row has a varying number of entries, and all rows have at least one entry, but some rows have too much missing data. My goal is to filter out/remove rows that have about 5 or more missing entries (this number is yet to be determined, but let's say it's 5) before I run Pearson's correlation between all of the rows. The order of the columns does not matter here.
I think that I might need to test each row for a "data, at least one NA, data" pattern?

Is there some kind of way of doing this? I am at a loss for an easy way to accomplish this. Any suggestions are most appreciated!

John Morrow  
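
One possible approach, sketched here under stated assumptions rather than offered as a definitive answer: because the order of the columns does not matter, a pattern test is probably unnecessary, and counting the NAs in each row with rowSums(is.na(...)) may be enough. The toy matrix `m`, the threshold name `max_na`, and the choice to drop rows with 5 or more missing entries are illustrative assumptions, not part of the original data. Since cor() correlates columns, the filtered matrix is transposed to get row-by-row Pearson correlations.

```r
# Hypothetical toy matrix standing in for the real 9707 x 60 data.
m <- matrix(c( 1,  2,  3,  4,  5,  6, NA,  8,
              NA, NA, NA, NA,  5, NA, NA, NA,
               2,  4, NA,  8, 10, 12, 14, 16,
              NA,  3, NA,  1, NA, NA, NA,  2),
            nrow = 4, byrow = TRUE)

# Assumed threshold: rows with this many (or more) NAs are removed.
max_na <- 5

# Count missing entries per row, then keep rows below the threshold.
na_per_row <- rowSums(is.na(m))
keep <- na_per_row < max_na
filtered <- m[keep, , drop = FALSE]

# cor() works column-wise, so transpose to correlate rows.
# "pairwise.complete.obs" uses, for each pair of rows, only the
# columns where both rows have data.
row_cor <- cor(t(filtered), use = "pairwise.complete.obs",
               method = "pearson")
```

With real data, `m` would be replaced by the actual matrix and `max_na` tuned as needed. Pairs of rows that share very few complete columns can still give unreliable correlations, so the threshold may need to be stricter than it first appears.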




R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Received on Sat Jul 29 18:08:18 2006

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.1.8, at Sun 30 Jul 2006 - 00:16:30 EST.
