From: Michael Friendly <friendly_at_yorku.ca>
Date: Mon, 03 Dec 2007 09:27:07 -0500

In using cor(data.frame), it is annoying that you have to explicitly filter out non-numeric columns, and when you don't, the error message is misleading:

> cor(iris)
Error in cor(iris) : missing observations in cov/cor
In addition: Warning message:
In cor(iris) : NAs introduced by coercion

It would be nicer if stats::cor() itself did the equivalent of the following for a data.frame:
> cor(iris[,sapply(iris, is.numeric)])
             Sepal.Length Sepal.Width Petal.Length Petal.Width
Sepal.Length    1.0000000  -0.1175698    0.8717538   0.8179411
Sepal.Width    -0.1175698   1.0000000   -0.4284401  -0.3661259
Petal.Length    0.8717538  -0.4284401    1.0000000   0.9628654
Petal.Width     0.8179411  -0.3661259    0.9628654   1.0000000


A change could be implemented here:

     if (is.data.frame(x))
         x <- as.matrix(x)
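
As a rough sketch of the idea (a user-level wrapper, not a patch to stats), something like the following would drop the non-numeric columns before calling cor(). The name cor_numeric and the wording of the warning are hypothetical, made up for illustration.

```r
# Hypothetical wrapper, not part of stats: keep only the numeric
# columns of a data frame before handing it to cor().
cor_numeric <- function(x, ...) {
    if (is.data.frame(x)) {
        keep <- vapply(x, is.numeric, logical(1))
        if (!all(keep))
            warning("dropping non-numeric columns: ",
                    paste(names(x)[!keep], collapse = ", "))
        # drop = FALSE keeps a data frame even if one column remains
        x <- as.matrix(x[, keep, drop = FALSE])
    }
    cor(x, ...)
}

# iris has one factor column (Species), which is dropped with a warning
r <- suppressWarnings(cor_numeric(iris))
```

Whether cor() should warn or stay silent when it drops columns is a design choice; the warning here is only one possibility.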

Second, the default, use="all.obs", throws an error if there are any NAs. It would be nicer if the default were use="complete.obs", which could generate a warning instead. Most other statistical software is more tolerant of missing data.

> library(corrgram)
> data(auto)
> cor(auto[,sapply(auto, is.numeric)])
Error in cor(auto[, sapply(auto, is.numeric)]) :
  missing observations in cov/cor
> cor(auto[,sapply(auto, is.numeric)],use="complete")
# works; output elided
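
The same behaviour can be reproduced without the corrgram package. The sketch below uses the airquality data set that ships with R, chosen here only because it contains NAs; it contrasts the values of the use argument and is not meant as a proposed change to cor().

```r
# airquality ships with R; Ozone and Solar.R contain NAs
aq <- airquality[, c("Ozone", "Solar.R", "Wind", "Temp")]

# use = "all.obs" stops with an error as soon as any NA is present
res <- try(cor(aq, use = "all.obs"), silent = TRUE)
failed <- inherits(res, "try-error")

# "complete.obs" drops every row that contains an NA (listwise deletion);
# "pairwise.complete.obs" drops rows separately for each pair of variables
r_complete <- cor(aq, use = "complete.obs")
r_pairwise <- cor(aq, use = "pairwise.complete.obs")

# number of rows that survive listwise deletion
n_used <- sum(complete.cases(aq))
```

Comparing n_used with nrow(aq) shows how many observations listwise deletion silently discards, which is the sort of information a warning could report.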


