From: Achim Zeileis

Date: Tue 14 Jun 2005 - 03:24:32 EST

On Mon, 13 Jun 2005 11:05:46 -0600 (MDT) Jim Robison-Cox wrote:

> Dear R-help folks,

>
> I am seeing unexpected behaviour from the function mean
> with option na.rm = TRUE (which is removing a whole column of a data
> frame or matrix).
>
> example:
>
> testcase <- data.frame( x = 1:3, y = rep(NA,3))

In addition to what Sundar already wrote: in the code above, x is numeric and y is logical, so as.matrix(), which apply() calls on a data frame, will not do what you want: it creates a "character" matrix. It is probably more appropriate to do

testcase <- data.frame( x = 1:3, y = as.numeric(rep(NA,3)))
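A quick way to check this is to look at the column classes and at the matrix that apply() will actually work on. The following is only a sketch: the names m and col_means are mine, and how as.matrix() treats a logical column next to a numeric one may differ between R versions, so the all-numeric version of testcase is used here.

```r
# Both columns are numeric, so as.matrix() yields a numeric matrix.
testcase <- data.frame(x = 1:3, y = as.numeric(rep(NA, 3)))

# Column classes: x is "integer", y is "numeric".
sapply(testcase, class)

# apply() coerces a data frame with as.matrix() before looping over columns,
# so it is worth checking the type of that matrix first.
m <- as.matrix(testcase)
is.numeric(m)   # TRUE

# With a numeric matrix, the all-NA column gives NaN and x is unaffected.
col_means <- apply(testcase, 2, mean, na.rm = TRUE)
col_means
```

If is.numeric(m) returns FALSE for your own data, apply() is handing mean() character columns, which would explain the warnings quoted below.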

hth,

Z

> mean(testcase[,1], na.rm=TRUE)
> [1] 2
> mean(testcase[,2], na.rm = TRUE)
> [1] NaN
>
> OK, so far that seems sensible. Now I'd like to compute both means
> at once:
>
> lapply(testcase, mean, na.rm=T) ## this works
> $x
> [1] 2
>
> $y
> [1] NaN
>
> But I thought that this would also work:
>
> apply(testcase, 2, mean, na.rm=T)
> x y
> NA NA
> Warning messages:
> 1: argument is not numeric or logical: returning NA in:
> mean.default(newX[, i], ...)
> 2: argument is not numeric or logical: returning NA in:
> mean.default(newX[, i], ...)
>
> Summary:
> If I have a data frame or a matrix where one entire column is NA's,
> mean(x, na.rm=T) works on that column, returning NaN, but fails using
> apply, in that apply returns NA for ALL columns.
> lapply works fine on the data frame.
>
> If you wonder why I'm building data frames with columns that could
> be all missing -- they arise as output of a simulation. The fact that
> the entire column is missing is informative in itself.
>
> I do wonder if this is a bug.
>
> Thanks,
> Jim
