From: Gavin Simpson <gavin.simpson_at_ucl.ac.uk>
Date: Tue, 05 Apr 2011 12:33:06 +0100

Dear List,

I'm not sure whether this is an issue, but ?rowMeans has:


     A numeric or complex array of suitable size, or a vector if the
     result is one-dimensional.  The ‘dimnames’ (or ‘names’ for a
     vector result) are taken from the original array.

     If there are no values in a range to be summed over (after
     removing missing values with ‘na.rm = TRUE’), that component of
     the output is set to ‘0’ (‘*Sums’) or ‘NA’ (‘*Means’), consistent
     with ‘sum’ and ‘mean’.

However, the outputs of mean() and rowMeans() are not exactly the same when all supplied values are missing:

> mean(NA, na.rm = TRUE)
[1] NaN
> mean(rep(NA, 5), na.rm = TRUE)
[1] NaN
> rowMeans(matrix(rep(NA, 5), ncol = 5), na.rm = TRUE)
[1] NA

So in one sense, the outputs are not consistent:

> is.nan(mean(rep(NA, 5), na.rm = TRUE))
[1] TRUE
> is.nan(rowMeans(matrix(rep(NA, 5), ncol = 5), na.rm = TRUE))
[1] FALSE

but in another they are:

> is.na(mean(rep(NA, 5), na.rm = TRUE))
[1] TRUE
> is.na(rowMeans(matrix(rep(NA, 5), ncol = 5), na.rm = TRUE))
[1] TRUE

I'm not familiar enough with the details to know whether this even matters, but I wonder if the documentation needs a tweak to clarify what is returned. As I say, in one sense the outputs are not consistent.
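
For reference, the two tests above differ because is.na() treats NaN as missing, whereas is.nan() does not treat NA as NaN. Below is a minimal sketch of that distinction, plus one way to get row means that agree with mean() by construction. The helper name row_means_via_mean is my own invention for illustration and is not part of base R; it simply calls mean() on each row via apply(), so it will be slower than rowMeans() on large matrices.

```r
# NaN counts as missing for is.na(), but NA does not count as NaN for is.nan()
is.na(NaN)        # TRUE
is.nan(NA_real_)  # FALSE
is.nan(NaN)       # TRUE

# Hypothetical helper (not part of base R): compute row means by calling
# mean() on each row, so the result matches mean() by construction
row_means_via_mean <- function(x, na.rm = FALSE) {
  apply(x, 1, mean, na.rm = na.rm)
}

m <- matrix(rep(NA, 5), ncol = 5)
res <- row_means_via_mean(m, na.rm = TRUE)
is.nan(res)       # TRUE, the same as is.nan(mean(rep(NA, 5), na.rm = TRUE))
```

Code that only needs to know whether a row had no usable values can test with is.na(), which is TRUE for both NA and NaN and so does not depend on which of the two rowMeans() returns.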

