Re: [Rd] median and data frames

From: William Dunlap
Date: Fri, 29 Apr 2011 09:09:14 -0700

> On Behalf Of Martin Maechler
> Sent: Friday, April 29, 2011 7:25 AM
> To: Paul Johnson
> Cc: r-devel
> Subject: Re: [Rd] median and data frames
> [ ... lots of lines elided ... ]
> My vote is for deprecating

While R's data.frame method for mean(x) returns the same thing as colMeans(x), S+'s (since 2005) returns the same thing as mean(as.matrix(x)). (Really, it calls numerical.matrix(x), which turns non-numeric columns into columns of numeric NAs.) I usually favor making data.frames act more like matrices when possible (since users often conflate the two classes), and I like having all the methods of a generic function return the same sort of thing (a single value in this case).
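To make the two behaviors concrete, here is a minimal sketch in R. The data frame and its column names are made up for illustration, and the calls to colMeans() and mean(as.matrix()) are written out explicitly rather than relying on what mean() does when handed a data.frame. S+'s numerical.matrix() is not emulated here; the example sticks to all-numeric columns, where as.matrix() is assumed to be a fair stand-in.

```r
# Hypothetical example data: two numeric columns on different scales.
df <- data.frame(height_cm = c(150, 160, 170),
                 weight_kg = c(50, 60, 70))

# Column-wise result: one mean per column (a named numeric vector).
col_means <- colMeans(df)

# Matrix-style result: a single mean over every cell.
overall_mean <- mean(as.matrix(df))

print(col_means)     # height_cm 160, weight_kg 60
print(overall_mean)  # 110
```

The first result has one element per column; the second is a single number, which is what the other methods of mean() return.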

It is often nonsensical to ask for the mean of an entire data.frame, as the columns may have different units even when they are all numeric. It does make sense when you use a tool like read.table() or S+'s importData() to import a matrix and don't notice that it is stored as a data.frame. It also makes sense when you have a single-column data.frame or matrix, perhaps arising from the use of drop=FALSE when subscripting.
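A small sketch of the single-column case, again with made-up data: subscripting with drop=FALSE keeps the data.frame (or matrix) class, so the user can end up asking for the mean of a one-column object without realizing it. In that case the column-wise and whole-object answers agree in value.

```r
# Hypothetical example data.
df <- data.frame(x = c(1, 2, 3), y = c(10, 20, 30))

# drop = FALSE keeps a one-column data.frame instead of a plain vector.
one_col_df <- df[, "x", drop = FALSE]
print(is.data.frame(one_col_df))   # TRUE

# The same subscript on a matrix keeps a 3 x 1 matrix.
one_col_mat <- as.matrix(df)[, "x", drop = FALSE]
print(dim(one_col_mat))            # 3 1

# Both views of "the mean" give the value 2 here.
mean_as_matrix <- mean(as.matrix(one_col_df))
mean_by_column <- colMeans(one_col_df)
mean_of_matrix <- mean(one_col_mat)
```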

Bill Dunlap
Spotfire, TIBCO Software

> Martin
Received on Fri 29 Apr 2011 - 16:16:47 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Fri 29 Apr 2011 - 16:20:59 GMT.
