Re: [R] Quick question: Omitting rows and cols with certain percents of missing values

From: David Winsemius <dwinsemius_at_comcast.net>
Date: Fri, 13 May 2011 10:12:12 -0400

On May 13, 2011, at 9:42 AM, Vickie S wrote:

>
> Hi
> naive question.
> It is possible to get R command for omitting rows or cols with
> missing values present.
>
> But
> if i want to omit rows or cols with i.e . >20% missing values, I
> couldn't find any package-based command, probably because it is too
> simple for anyone to do that manually, though not for me. Can anyone
> please help me ?

?is.na

 > str(fil)
'data.frame': 8 obs. of 5 variables:
  $ X1  : int  2 3 4 5 6 NA NA 6
  $ X5  : int  6 7 NA NA NA NA NA NA
  $ X8  : int  9 NA NA NA NA NA NA NA
  $ X   : logi  NA NA NA NA NA NA ...
  $ X1.1: Factor w/ 6 levels "","2","3","5",..: 2 3 1 4 5 6 1 1
 > is.na(fil)
         X1    X5    X8    X  X1.1
[1,] FALSE FALSE FALSE TRUE FALSE
[2,] FALSE FALSE  TRUE TRUE FALSE
[3,] FALSE  TRUE  TRUE TRUE FALSE
[4,] FALSE  TRUE  TRUE TRUE FALSE
[5,] FALSE  TRUE  TRUE TRUE FALSE
[6,]  TRUE  TRUE  TRUE TRUE FALSE
[7,]  TRUE  TRUE  TRUE TRUE FALSE
[8,] FALSE  TRUE  TRUE TRUE FALSE
 > str(is.na(fil))
  logi [1:8, 1:5] FALSE FALSE FALSE FALSE FALSE TRUE ...

So is.na() applied to a data frame returns a logical matrix. You can compute the proportion of missing values with apply(), using the appropriate margin argument (1 for rows, 2 for columns), and compare it with your threshold to get logical indices for selecting rows or columns.
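
A minimal sketch of that approach, assuming a made-up data frame `df` (not the `fil` object shown above) and a 20% cutoff stored in a hypothetical variable `threshold`. The mean of a logical vector is the proportion of TRUE values, so taking the mean of is.na() along each margin gives the fraction missing:

```r
# Hypothetical data frame for illustration only: 8 rows, 4 columns
df <- data.frame(
  a = c(2, 3, 4, 5, 6, NA, NA, 6),
  b = c(6, 7, NA, NA, NA, NA, NA, NA),
  c = c(9, NA, NA, NA, NA, NA, NA, NA),
  d = c("u", "v", "w", "x", "y", "z", NA, "u"),
  stringsAsFactors = FALSE
)

na_mat <- is.na(df)              # logical matrix with the same shape as df

# Proportion of missing values per row (margin 1) and per column (margin 2)
row_na <- apply(na_mat, 1, mean)
col_na <- apply(na_mat, 2, mean)

threshold <- 0.20                # assumed cutoff: drop anything with > 20% NA

# Logical indices marking what to keep
keep_rows <- row_na <= threshold
keep_cols <- col_na <= threshold

# drop = FALSE keeps the result a data frame even if one column survives
df_rows <- df[keep_rows, , drop = FALSE]
df_cols <- df[, keep_cols, drop = FALSE]
```

rowMeans(is.na(df)) and colMeans(is.na(df)) should give the same proportions without apply(). Note that filtering rows and columns in a single step is not the same as doing one after the other, since the proportions change once rows or columns are removed.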

-- 
David Winsemius, MD
West Hartford, CT

______________________________________________
R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Received on Fri 13 May 2011 - 14:20:49 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Fri 13 May 2011 - 14:30:06 GMT.