[Rd] median and data frames

From: Patrick Burns <pburns_at_pburns.seanet.com>
Date: Wed, 27 Apr 2011 18:44:55 +0100


Here are some data frames:

df3.2 <- data.frame(1:3, 7:9)
df4.2 <- data.frame(1:4, 7:10)
df3.3 <- data.frame(1:3, 7:9, 10:12)
df4.3 <- data.frame(1:4, 7:10, 10:13)
df3.4 <- data.frame(1:3, 7:9, 10:12, 15:17)
df4.4 <- data.frame(1:4, 7:10, 10:13, 15:18)

Now here are some commands and their answers:

 > median(df3.2)
[1] 2 8

 > median(df4.2)
[1] 2.5 8.5
 > median(df3.3)
  NA
1  7
2  8
3  9

 > median(df4.3)
  NA
1  7
2  8
3  9
4 10
 > median(df3.4)
[1] 8 11

 > median(df4.4)
[1] 8.5 11.5

 > median(df3.2[c(1,2,3),])
[1] 2 8

 > median(df3.2[c(1,3,2),])
[1] 2 NA

Warning message:
In mean.default(X[[2L]], ...) :
  argument is not numeric or logical: returning NA

The sessionInfo is below, but it looks to me like the present behavior started in 2.10.0.

Sometimes it gets the right answer. I'd be grateful to hear how it does that -- I can't figure it out.

Under the current regime we can get numbers that are correct, partially correct, or sort of random (given the intention).
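
For reference, here is a minimal sketch of the numbers one would expect if the intention is one median per column (an assumption, though the df3.2 and df4.2 results above are consistent with it). The helper name 'col_medians' is made up for illustration. Because each column is handed to 'median' separately, permuting the rows cannot change the result.

```r
# Column-wise medians computed explicitly.
# Assumption: the intended answer is one median per column.
df3.2 <- data.frame(1:3, 7:9)
df3.3 <- data.frame(1:3, 7:9, 10:12)

# Hypothetical helper (not part of R): apply median to each column.
col_medians <- function(df) sapply(df, median)

m32  <- col_medians(df3.2)
m32p <- col_medians(df3.2[c(1, 3, 2), ])  # same rows, permuted order
m33  <- col_medians(df3.3)
```

Here 'm32' and 'm32p' agree, which is the property that the 'median(df3.2[c(1,3,2),])' call above lacks.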

I claim that much better behavior would be to always get exactly one of the following:

I would think a method in analogy to 'mean.data.frame' would be a logical choice. But I'm presuming there might be an argument against that or 'median.data.frame' would already exist.
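
A sketch of what such a method might look like, purely as an illustration: 'median.data.frame' as written here is hypothetical, and it assumes every column is numeric. It simply takes the median of each column in turn.

```r
# Hypothetical S3 method; this is an illustrative sketch, not an
# existing definition. Assumes all columns are numeric.
median.data.frame <- function(x, na.rm = FALSE, ...) {
    # Each column is a plain vector, so the inner call reaches
    # median.default rather than this method.
    sapply(x, median, na.rm = na.rm)
}

df4.3 <- data.frame(1:4, 7:10, 10:13)
res <- median(df4.3)  # dispatches on class "data.frame"
```

With a method like this in place the answer would no longer depend on the number of columns or the order of the rows. Whether non-numeric columns should give NA or an error is a design choice the sketch leaves open.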

 > sessionInfo()
R version 2.13.0 (2011-04-13)
Platform: i386-pc-mingw32/i386 (32-bit)

locale:
[1] LC_COLLATE=English_United Kingdom.1252
[2] LC_CTYPE=English_United Kingdom.1252
[3] LC_MONETARY=English_United Kingdom.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United Kingdom.1252

attached base packages:
[1] graphics grDevices utils datasets stats methods base

other attached packages:
[1] xts_0.8-0 zoo_1.6-5

loaded via a namespace (and not attached):
[1] grid_2.13.0 lattice_0.19-23 tools_2.13.0

-- 
Patrick Burns
pburns_at_pburns.seanet.com
twitter: @portfolioprobe
http://www.portfolioprobe.com/blog
http://www.burns-stat.com
(home of 'Some hints for the R beginner'
and 'The R Inferno')

______________________________________________
R-devel_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Received on Wed 27 Apr 2011 - 17:51:02 GMT
