Re: [R] cor(data.frame) infelicities

From: Gabor Grothendieck <ggrothendieck_at_gmail.com>
Date: Mon, 3 Dec 2007 09:31:45 -0500

Kendall's rank correlation can be computed on such a data frame, factor column included, so in that case you would not want factors to be dropped automatically:

> cor(iris, method = "kendall")
             Sepal.Length Sepal.Width Petal.Length Petal.Width    Species
Sepal.Length   1.00000000 -0.07699679    0.7185159   0.6553086  0.6704444
Sepal.Width   -0.07699679  1.00000000   -0.1859944  -0.1571257 -0.3376144
Petal.Length   0.71851593 -0.18599442    1.0000000   0.8068907  0.8229112
Petal.Width    0.65530856 -0.15712566    0.8068907   1.0000000  0.8396874
Species        0.67044444 -0.33761438    0.8229112   0.8396874  1.0000000
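
On R versions where cor() refuses non-numeric columns, the same calculation can be sketched by converting the factor to its integer codes first. This is only an illustration: it assumes that the level order of Species (setosa, versicolor, virginica) is a meaningful ranking, and the names iris_codes and tau are made up for the example.

```r
# Assumption: the factor's level order is a sensible ranking for Species.
# data.matrix() replaces each factor column by its integer codes and
# leaves the numeric columns unchanged.
iris_codes <- data.matrix(iris)

# Kendall's tau on all five columns, Species included.
tau <- cor(iris_codes, method = "kendall")

# Correlations of each column with Species, rounded for display.
print(round(tau["Species", ], 3))
```

Making the conversion explicit keeps the decision to treat a factor as ranks in the caller's hands rather than inside cor().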


On Dec 3, 2007 9:27 AM, Michael Friendly <friendly_at_yorku.ca> wrote:
> In using cor(data.frame), it is annoying that you have to explicitly
> filter out non-numeric columns, and when you don't, the error message
> is misleading:
>
> > cor(iris)
> Error in cor(iris) : missing observations in cov/cor
> In addition: Warning message:
> In cor(iris) : NAs introduced by coercion
>
> It would be nicer if stats::cor() *itself* did the equivalent of the
> following for a data.frame:
> > cor(iris[,sapply(iris, is.numeric)])
>              Sepal.Length Sepal.Width Petal.Length Petal.Width
> Sepal.Length    1.0000000  -0.1175698    0.8717538   0.8179411
> Sepal.Width    -0.1175698   1.0000000   -0.4284401  -0.3661259
> Petal.Length    0.8717538  -0.4284401    1.0000000   0.9628654
> Petal.Width     0.8179411  -0.3661259    0.9628654   1.0000000
>
> A change could be implemented here:
>     if (is.data.frame(x))
>         x <- as.matrix(x)
>
> Second, the default, use="all.obs", throws an error if there are any
> NAs. It would be nicer if the default was use="complete.obs",
> which would generate warnings instead. Most other statistical
> software is more tolerant of missing data.
>
> > library(corrgram)
> > data(auto)
> > cor(auto[,sapply(auto, is.numeric)])
> Error in cor(auto[, sapply(auto, is.numeric)]) :
> missing observations in cov/cor
> > cor(auto[,sapply(auto, is.numeric)],use="complete")
> # works; output elided
>
> -Michael
>
> --
> Michael Friendly Email: friendly AT yorku DOT ca
> Professor, Psychology Dept.
> York University Voice: 416 736-5115 x66249 Fax: 416 736-5814
> 4700 Keele Street http://www.math.yorku.ca/SCS/friendly.html
> Toronto, ONT M3J 1P3 CANADA
>
> ______________________________________________
> R-help_at_r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
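
The two requests in the quoted message can be approximated in user code without changing cor(). The sketch below is a hypothetical helper, not part of the stats package: the name cor_numeric and its warning messages are made up for illustration, and it assumes that silently restricting to complete rows is acceptable for the analysis at hand.

```r
# Hypothetical wrapper around stats::cor() for data frames.
# It keeps only the numeric columns and defaults to use = "complete.obs",
# warning about anything it drops instead of stopping with an error.
cor_numeric <- function(x, use = "complete.obs", ...) {
  stopifnot(is.data.frame(x))

  # Identify the numeric columns; everything else is dropped with a warning.
  keep <- vapply(x, is.numeric, logical(1))
  if (!all(keep)) {
    warning("dropping non-numeric columns: ",
            paste(names(x)[!keep], collapse = ", "))
  }
  num <- x[, keep, drop = FALSE]

  # Report how many rows contain at least one NA.
  n_incomplete <- sum(!complete.cases(num))
  if (n_incomplete > 0) {
    warning(n_incomplete, " rows contain missing values")
  }

  cor(num, use = use, ...)
}

# iris: Species is dropped with a warning, no missing values.
r_iris <- suppressWarnings(cor_numeric(iris))

# airquality (shipped with R) has NAs in Ozone and Solar.R.
r_air <- suppressWarnings(cor_numeric(airquality))
```

Passing use = "all.obs" (or any other value accepted by cor()) restores the stricter behaviour, and extra arguments such as method are forwarded unchanged.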



Received on Mon 03 Dec 2007 - 14:35:25 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Mon 03 Dec 2007 - 20:30:17 GMT.