Re: [R] column-wise deletion in data-frames

From: Peter Dalgaard <p.dalgaard_at_biostat.ku.dk>
Date: Tue 19 Jul 2005 - 00:57:18 EST

Prof Brian Ripley <ripley@stats.ox.ac.uk> writes:

> On Mon, 18 Jul 2005, Peter Dalgaard wrote:
>
> > Chuck Cleland <ccleland@optonline.net> writes:
> >
> >>> data <- as.data.frame(cbind(X1,X2,X3,X4,X5))
> >>>
> >>> So only X1, X3 and X5 are vars without any NAs and there are some vars (X2 and
> >>> X4 stacked in between that have NAs). Now, how can I extract those former vars
> >>> in a new dataset or remove all those latter vars in between that have NAs
> >>> (without missing a single row)?
> >>> ...
> >>
> >> Someone else will probably suggest something more elegant, but how
> >> about this:
> >>
> >> newdata <- data[,-which(apply(data, 2, function(x){all(is.na(x))}))]
> >
> > (I think that's supposed to be any(), not all(), and which() is
> > crossing the creek to fetch water.)
> >
> > This should do it:
> >
> > data[,apply(!is.na(data),2,all)]
>
> If `data' is a data frame, apply will coerce it to a matrix.

So will is.na()...
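
A quick check of that point, on a made-up two-column data frame (the name `d` and its contents are only for illustration, not from the original post):

```r
# Hypothetical toy data frame: one numeric and one character column.
d <- data.frame(a = c(1, NA), b = c("x", "y"))

# is.na() on a data frame returns a logical matrix, not a data frame,
# so the whole result is held in memory at once.
m <- is.na(d)
is.matrix(m)   # TRUE
dim(m)         # 2 2
```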

> I would do
> something like
>
> keep <- sapply(data, function(x) all(!is.na(x)))
> data[keep]
>
> to use the list-like structure of a data frame and make the fewest
> possible copies.

I think the amount of copying is the same, but your version doesn't need to store the entire is.na(data) at once.

Nitpick: !any(is.na(x)) should be marginally faster than all(!is.na(x)).
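
For the record, a small sketch of both variants side by side. The data below are invented for illustration (the original X1..X5 were not posted); only X2 and X4 are given NAs, as in the question:

```r
# Hypothetical stand-in for the poster's data: X2 and X4 contain NAs.
data <- data.frame(X1 = 1:4,
                   X2 = c(1, NA, 3, 4),
                   X3 = c(10, 20, 30, 40),
                   X4 = c(NA, NA, 1, 2),
                   X5 = c(0.5, 1.5, 2.5, 3.5))

# List-wise version: sapply() visits one column at a time, and
# !any(is.na(x)) is TRUE only for columns with no missing values.
keep <- sapply(data, function(x) !any(is.na(x)))
newdata <- data[keep]

# Matrix-based version from earlier in the thread, for comparison.
newdata2 <- data[, apply(!is.na(data), 2, all)]

names(newdata)    # "X1" "X3" "X5"
names(newdata2)   # "X1" "X3" "X5"
```

Both keep all rows and drop only X2 and X4; the difference is in how much intermediate storage is needed, not in the result.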

-- 
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark          Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard@biostat.ku.dk)                  FAX: (+45) 35327907

Received on Tue Jul 19 01:00:56 2005
