[R] Function for finding NA's

From: Tyler Rinker <tyler_rinker_at_hotmail.com>
Date: Sun, 03 Apr 2011 13:44:03 -0400

Quick question,  

I tried to find a function in available packages to find NA's for an entire data set (or single variables) and report the row of missing values (NA's for each column). I searched the typical routes through the blogs and the help manuals for 15 minutes. Rather than spend any more time searching I created my own function to do this (probably in less time than it would have taken me to find the function).  

Now I still have the same question: Is this function (NAhunter I call it) already in existence? If so please direct me (because I'm sure they've written better code more efficiently). I highly doubt I'm this first person to want to find all the missing values in a data set so I assume there is a function for it but I just didn't spend enough time looking. If there is no existing function (big if here), is this something people feel is worthwhile for me to put into a package of some sort?  

Tyler  

Here's the code:  

NAhunter<-function(dataset)
{
find.NA<-function(variable)
{
if(is.numeric(variable)){
n<-length(variable)
mean<-mean(variable, na.rm=T)
median<-median(variable, na.rm=T)
sd<-sd(variable, na.rm=T)
NAs<-is.na(variable)
total.NA<-sum(NAs)
percent.missing<-total.NA/n
descriptives<-data.frame(n,mean,median,sd,total.NA,percent.missing) rownames(descriptives)<-c(" ")
Case.Number<-1:n

Missing.Values<-ifelse(NAs>0,"Missing Value"," ")
missing.value<-data.frame(Case.Number,Missing.Values)
missing.values<-missing.value[ which(Missing.Values=='Missing Value'),]
list("NUMERIC DATA","DESCRIPTIVES"=t(descriptives),"CASE # OF MISSING VALUES"=missing.values[,1])
}

else{
n<-length(variable)
NAs<-is.na(variable)
total.NA<-sum(NAs)
percent.missing<-total.NA/n
descriptives<-data.frame(n,total.NA,percent.missing) rownames(descriptives)<-c(" ")
Case.Number<-1:n
Missing.Values<-ifelse(NAs>0,"Missing Value"," ")
missing.value<-data.frame(Case.Number,Missing.Values)
missing.values<-missing.value[ which(Missing.Values=='Missing Value'),]
list("CATEGORICAL DATA","DESCRIPTIVES"=t(descriptives),"CASE # OF MISSING VALUES"=missing.values[,1])
}
}

dataset<-data.frame(dataset)
options(scipen=100)
options(digits=2)
lapply(dataset,find.NA)
}

        [[alternative HTML version deleted]]



R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Received on Sun 03 Apr 2011 - 17:47:30 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Sun 03 Apr 2011 - 18:50:26 GMT.

Mailing list information is available at https://stat.ethz.ch/mailman/listinfo/r-help. Please read the posting guide before posting to the list.

list of date sections of archive