Re: [R] Why does "summary" show number of NAs as non-integer?

From: Gabor Grothendieck <ggrothendieck_at_gmail.com>
Date: Wed 01 Jun 2005 - 23:48:13 EST

On 6/1/05, Earl F. Glynn <efg@stowers-institute.org> wrote:
> "Berton Gunter" <gunter.berton@gene.com> wrote in message
> news:200505312240.j4VMepGX000203@hertz.gene.com...
> > summary() is an S3 generic that for your vector dispatches
> > summary.default(). The output of summary default has class "table" and so
> > calls print.table (print is another S3 generic). Look at the code of
> > print.table() to see how it formats the output.
>
> "Marc Schwartz" <MSchwartz@MedAnalytics.com> wrote in message
> news:1117582325.22595.175.camel@horizons.localdomain...
> > On Tue, 2005-05-31 at 17:14 -0500, Earl F. Glynn wrote:
>
> > > Why isn't the number of NA's just "2" instead of the "2.000" shown above?
>
> > "The same number of decimal places is used throughout a vector
>
> I'm talking about how this should be designed. The current implementation
> may be to print a vector using generic logic, but why use generic logic to
> produce a wrong solution? Shouldn't correctness be more important than using
> a generic solution?
>
> There is special logic to suppress NA's when they don't exist (see below),
> so why isn't there special logic to print the count of NAs, which MUST be an
> integer, correctly when they do exist?
>
> An integer should NOT be displayed with meaningless decimal places. Why
> would this ever be desirable? The generic solution should be dropped in
> favor of a correct solution.
>
> # Why not use special logic to show the number of NA's correctly as an integer?
> > set.seed(19)
> > summary( c(NA, runif(10,1,100), NaN) )
>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's
>   7.771  24.850  43.040  43.940  63.540  83.830   2.000
>
> # There is already special logic to suppress NA's
> > set.seed(19)
> > summary( runif(10,1,100) )
>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
>   7.771  24.850  43.040  43.940  63.540  83.830
>
> "2.000" and "2" do not have equivalent meaning.
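
For what it's worth, the trailing zeros appear to come from the whole summary vector being formatted in one go, so every element gets as many decimals as the "widest" value needs. A minimal sketch with made-up values (this only illustrates format() on a numeric vector; it is not the actual print.table code path):

```r
# Illustrative values only: one quantile-like number and an NA count
vals <- c(Min. = 7.771, "NA's" = 2)

# format() on a numeric vector applies a common number of decimals
together <- format(vals)

# Formatting each element on its own leaves the count without decimals
separately <- sapply(vals, format)

print(together)
print(separately)
```

Formatted together, the count picks up the decimals of its neighbour; formatted element by element, it does not.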

Try:

R> library(Hmisc)
R> describe( c(NA, runif(10,1,100), NaN) )
c(NA, runif(10, 1, 100), NaN)

      n missing  unique    Mean     .05     .10     .25     .50     .75     .90 
     10       2      10   50.99   15.24   16.82   21.14   52.70   76.35   83.52 
    .95
  90.79
          13.65 17.17 18.12 30.18 46.21 59.19 65.36 80.01 81.90 98.06
Frequency     1     1     1     1     1     1     1     1     1     1
%            10    10    10    10    10    10    10    10    10    10
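
If only the count is wanted in base R, another option is to compute it separately from the quantiles. This is just a sketch of a workaround, not a change to summary(); the variable names are hypothetical. sum() over a logical vector returns an integer, and is.na() is TRUE for NaN as well as NA:

```r
set.seed(19)
x <- c(NA, runif(10, 1, 100), NaN)

# Quantile summary of the non-missing values only
q <- summary(x[!is.na(x)])

# Count of missing values; integer by construction
n_missing <- sum(is.na(x))

print(q)
cat("NA's:", n_missing, "\n")
```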

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Received on Wed Jun 01 23:51:53 2005
