From: Gabor Grothendieck <ggrothendieck_at_gmail.com>

Date: Wed 01 Jun 2005 - 03:25:01 EST

R-help@stat.math.ethz.ch mailing list

https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Received on Wed Jun 01 03:29:38 2005

On 5/31/05, Marc Schwartz <MSchwartz@mn.rr.com> wrote:

> On Mon, 2005-05-30 at 23:53 -0400, Gabor Grothendieck wrote:

> > On 5/30/05, Duncan Murdoch <murdoch@stats.uwo.ca> wrote:
> > > Gabor Grothendieck wrote:
> > > > On 5/30/05, Duncan Murdoch <murdoch@stats.uwo.ca> wrote:
> > > >
> > > >>Henrik Andersson wrote:
> > > >>
> > > >>>I have tried to get signif, round and format to display numbers like
> > > >>>these consistently in a table, using e.g. signif(x,digits=3)
> > > >>>
> > > >>>17.01
> > > >>>18.15
> > > >>>
> > > >>>I want
> > > >>>
> > > >>>17.0
> > > >>>18.2
> > > >>>
> > > >>>Not
> > > >>>
> > > >>>17
> > > >>>18.2
> > > >>>
> > > >>>
> > > >>>Why is the last digit stripped off when it is zero?
> > > >>
> > > >>signif() changes the value; you don't want that, you want to affect how
> > > >>a number is displayed. Use format() or formatC() instead, for example
> > > >>
> > > >> > x <- c(17.01, 18.15)
> > > >> > format(x, digits=3)
> > > >>[1] "17.0" "18.1"
> > > >> > noquote(format(x, digits=3))
> > > >>[1] 17.0 18.1
> > > >>
> > > >
> > > >
> > > > That works in the above context but I don't think it works generally:
> > > >
> > > > R> f <- head(faithful)
> > > > R> f
> > > >   eruptions waiting
> > > > 1     3.600      79
> > > > 2     1.800      54
> > > > 3     3.333      74
> > > > 4     2.283      62
> > > > 5     4.533      85
> > > > 6     2.883      55
> > > >
> > > > R> format(f, digits = 3)
> > > >   eruptions waiting
> > > > 1      3.60      79
> > > > 2      1.80      54
> > > > 3      3.33      74
> > > > 4      2.28      62
> > > > 5      4.53      85
> > > > 6      2.88      55
> > > >
> > > > R> # this works in this case
> > > > R> noquote(prettyNum(round(f,1), nsmall = 1))
> > > >      eruptions waiting
> > > > [1,] 3.6       79.0
> > > > [2,] 1.8       54.0
> > > > [3,] 3.3       74.0
> > > > [4,] 2.3       62.0
> > > > [5,] 4.5       85.0
> > > > [6,] 2.9       55.0
> > > >
> > > > and even that does not work in the desired way (which presumably
> > > > means avoiding exponent format) if some of the numbers are large
> > > > enough, such as 1e6, which it displays in e notation rather than
> > > > ordinary notation.
> > >
> > > formatC with format="f" seems to work for me, though it assumes you're
> > > specifying decimal places rather than significant digits. It also wants
> > > a vector of numbers as input, not a dataframe. So the following gives
> > > pretty flexible control over what a table will look like:
> > >
> > > > data.frame(eruptions = formatC(f$eruptions, digits=2, format='f'),
> > > + waiting = formatC(f$waiting, digits=1, format='f'))
> > >    eruptions waiting
> > > 1 1000000.11    79.0
> > > 2       1.80    54.0
> > > 3       3.33    74.0
> > > 4       2.28    62.0
> > > 5       4.53    85.0
> > > 6       2.88    55.0
> > >
> > > >
> > > > I have struggled with this myself and have generally been able
> > > > to come up with something for specific instances but I have generally
> > > > found it a pain to do a simple thing like format a table exactly as I want
> > > > without undue effort. Maybe someone else has figured this out.
> > >
> > > I think that formatting tables properly requires some thought, and R is
> > > no good at thinking. You can easily recognize a badly formatted table,
> > > but it's very hard to write down rules that work in general
> > > circumstances. It's also a matter of taste, so if I managed to write a
> > > function that matched my taste, you would find you wanted to make changes.
> > >
> > > It's sort of like expecting plot(x, y) to always come up with the best
> > > possible plot of y versus x. It's just not a reasonable expectation.
> > > It's better to provide tools (like abline() for plots or formatC() for
> > > tables) that allow you to tailor a plot or table to your particular needs.
> > >
> >
> > Thanks. That seems to be the idiom I was missing. One thing that would
> > be nice would be if formatC could handle data frames.
>
>
> Guys, perhaps I am missing something here, but there seems to be some
> confusion as to how the numbers are stored internally, versus how the
> output is displayed and the meaning of "significant digits", which is
> what I believe Henrik's original query was about.
>
> By default, R's printed output uses the settings from options("digits")
> and options("scipen") to define output based upon the number of
> significant digits, which is of course not the same as the number of
> decimal places. Hence the variation in the output that Henrik gets and
> why the trailing zero is dropped.
>
> The use of signif() does not help here because it is still based upon
> the number of significant digits, where the trailing zero still gets
> dropped.
>
> The approaches above are "inexact" when it comes to creating formatted
> output for a table with a consistent number of decimal places to align
> columns of numbers.
>
> format() is still problematic here because it too uses the number of
> significant digits, defaulting to options("digits").

Good point. It would be nice if format had an argument that allowed one to specify the number of digits after the decimal point. I think this would reduce the frustration of quickly formatting data frames.
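
For reference, format() does accept an nsmall argument, but it sets only the minimum number of digits after the decimal point, so on its own it does not fix the count. Below is a minimal sketch of one way to get a fixed number of decimals across a whole data frame, assuming the formatC(..., format = "f") idiom from Duncan's message applied column by column. The name format_decimals is a hypothetical helper for illustration, not a base R function, and the large value in row 1 is set by hand to mirror the output quoted above.

```r
# Hypothetical helper (not in base R): give every numeric column of a data
# frame a fixed number of decimal places; non-numeric columns are untouched.
# `decimals` is recycled across columns, so one value or one per column works.
format_decimals <- function(df, decimals = 1) {
  decimals <- rep_len(decimals, ncol(df))
  for (i in seq_along(df)) {
    if (is.numeric(df[[i]])) {
      # With format = "f", digits is the number of decimal places
      df[[i]] <- formatC(df[[i]], digits = decimals[i], format = "f")
    }
  }
  df
}

f <- head(faithful)
f$eruptions[1] <- 1000000.11   # a large value, as in the quoted output
res <- format_decimals(f, decimals = c(2, 1))
print(res)

# On a plain vector: signif() changes the value itself, while formatC()
# and sprintf() only change how the value is displayed.
x <- 17.01
shown <- c(signif  = format(signif(x, 3)),
           formatC = formatC(x, digits = 1, format = "f"),
           sprintf = sprintf("%.1f", x))
print(noquote(shown))
```

The result holds character columns, so it is suitable for display only; keep the original numeric data frame for any further computation.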

This archive was generated by hypermail 2.1.8 : Fri 03 Mar 2006 - 03:32:17 EST