From: Prof Brian Ripley

Date: Thu 16 Sep 2004 - 21:57:41 EST

On Thu, 16 Sep 2004, Mayeul KAUFFMANN claimed:

> ?cor says it accepts data.frame. In fact, it does iff they have no (or
> only: cor(x[,6:7]) works) logical columns.

It actually says

x: a numeric vector, matrix or data frame.
     ^^^^^^^

If you want to do the conversions as you say, you should be calling data.matrix.
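A minimal sketch of the data.matrix route follows. The data frame `df` and its column names are invented for illustration and are not the original data. data.matrix turns logical columns into integers (TRUE becomes 1, FALSE becomes 0), so the result should be a numeric matrix that cor can take:

```r
# Hypothetical toy data: two numeric columns and one logical dummy variable.
df <- data.frame(
  a    = c(1.5, 2.0, 3.5, 4.0),
  b    = c(0.2, 0.1, 0.4, 0.3),
  flag = c(TRUE, FALSE, TRUE, TRUE)
)

# data.matrix() converts every column to numeric; logicals become 0/1.
m <- data.matrix(df)

# The correlation matrix is then computed on an all-numeric matrix.
r <- cor(m)
print(r)
```

The entry for `a` and `flag` should agree with the pairwise call cor(df$a, as.numeric(df$flag)), which is the conversion the vector-by-vector workaround was relying on implicitly.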

On Thu, 16 Sep 2004, Mayeul KAUFFMANN wrote:

> Thanks all for your answers.

>
> #The difference between the 2 following commands might be a puzzle even
> for intermediate users. (I give explanation below)
> > cor(x[,4],x[,5])
> [1] -0.4352342
> > cor(x[,4:5])
> Error in cor(x[, 4:5]) : missing observations in cov/cor
> In addition: Warning message:
> NAs introduced by coercion
>
> From: "Martin Maechler" <maechler@stat.math.ethz.ch>
> To: "Mayeul KAUFFMANN" <mayeul.kauffmann@tiscali.fr>
> > Mayeul> #I found the obvious workaround:
> > Mayeul> COR <- matrix(rep(0, 81),9,9)
> > Mayeul> for (i in 1:9) for (j in 1:9) {if (i>j) COR[i,j] <- cor(x[,i],x[,j])}
> > Mayeul> #which works fine, with no warning
> > Mayeul> #looks like a "cor()" bug.
> Martin Maechler wrote:
> > quite improbably.
> If it is wrong, can you say what is wrong and then propose an alternative
> workaround? (or should I ask on r-help).
>
>
> > What does
> > sapply(x, function(u)all(is.finite(u)))
> > return ?
>
> sapply(x2, function(u)all(is.finite(u)))
>   jntdem smldepnp lrgdepnp contigkb logdstab  majdyds  alliesr  lncaprt     GATT
>     TRUE     TRUE     TRUE     TRUE     TRUE     TRUE     TRUE     TRUE     TRUE
>
> _______________________________________________
>
> But I have now got the explanation. It is not due to size.
> #Tony Plate wrote:
> #I would suspect that your dataframe has columns that result in NA's when it
> #is coerced to a matrix
>
> That's not yet the explanation, but you are close to it.
>
> All columns are numeric, except 3 that are logical (I thought they would
> be coerced to 0 and 1, which they are with cor(x[,4],x[,5]) but not with
> cor(x[,4:5]) ).
> They do not change to NA's or infinite values, they ALL change to TEXT
>
> ?as.matrix
> 'as.matrix' is a generic function. The method for data frames will
> convert any non-numeric/complex column into a character vector
> using 'format' and so return a character matrix, except that
> all-logical data frames will be coerced to a logical matrix.
>
> > as.matrix(x[1:3,1:9])
> jntdem smldepnp lrgdepnp contigkb logdstab majdyds alliesr
> 1 "400" "0.01420874" "0.2156945" "TRUE" "5.820108" "TRUE" "TRUE"
> 2 "400" "0.01534535" "0.2496879" "TRUE" "5.820108" "TRUE" "TRUE"
> 3 "400" "0.01585586" "0.2570493" "TRUE" "5.820108" "TRUE" "TRUE"
> lncaprt GATT
> 1 "2.883204" "1"
> 2 "2.906521" "1"
> 3 "2.833357" "1"
>
> ?cor says it accepts data.frame. In fact, it does iff they have no (or
> only: cor(x[,6:7]) works) logical columns.
> Doing cor with a logical (a dummy variable) and a numeric is maybe not as
> sensible as doing it with 2 numerics.
> But it may still be useful to explore data.
>
> Maybe one may want either to change the documentation of ?cor , or not
> rely on as.matrix to convert the data.frame if some columns are logical.
>
>
> Cheers,
> Mayeul
>
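A rough sketch of how one might check for this situation before calling cor is given below. The data frame `df` and its column names are hypothetical. What as.matrix returns for a data frame mixing numeric and logical columns has differed between R versions, so the sketch prints its mode rather than assuming it:

```r
# Hypothetical data frame mixing numeric and logical columns.
df <- data.frame(
  x1 = c(400, 410, 395),
  x2 = c(TRUE, FALSE, TRUE),
  x3 = c(5.8, 5.9, 6.1)
)

# Class of each column: shows which columns are not numeric.
col_classes <- sapply(df, class)
print(col_classes)

# Names of the logical columns only.
logical_cols <- names(df)[sapply(df, is.logical)]
print(logical_cols)

# Inspect what the two conversions produce (the as.matrix result is
# version-dependent for mixed data frames, so look rather than assume).
print(mode(as.matrix(df)))
print(mode(data.matrix(df)))
```

If the as.matrix line reports "character", the data frame is being converted to text in the way described above, and data.matrix is the conversion to use instead.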

-- 
Brian D. Ripley,                  ripley@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

