Re: [R] Tables with Graphical Representations

From: Ted Harding <Ted.Harding_at_nessie.mcc.ac.uk>
Date: Fri 01 Sep 2006 - 14:56:54 GMT


On 31-Aug-06 Sam Ferguson wrote:
> Hi useRs -
>
> I was wondering if anyone out there can tell me where to find
> R-code to do mixes of tables and graphics. I am thinking of
> something similar to this:
> http://yost.com/information-design/powerpoint-corrupts/
> or like the excel routines people are demonstrating:
> http://infosthetics.com/archives/2006/08/excel_in_cell_graphing.html
>
> My aim is to provide small graphics to illustrate numbers directly
> beside or behind their position in the table. Maybe there is a way
> to do it with lattice?
>
> Thanks for any help you may be able to provide.
> Sam Ferguson

I dare say there may be a way to do that kind of thing directly within R, and if so then the graphics experts will no doubt tell us how!

But your examples are just one kind of combined tabular/graphic layout (and somewhat similar to each other). In a more general context of combining tables of numerical results with graphic displays, it is perhaps better to think in terms of using R to produce the numerical results in the first instance, and then handing these over to software designed for general-purpose graphical/textual layout. You then have complete control, and full flexibility of design.

Indeed, in your second (Excel) example, the method of production is just a nasty kludge -- and it was a happy coincidence that the "REPT" function was available in Excel at all!

As Frank Harrell has just posted (just as I was completing this one!), you can do this sort of thing in LaTeX (his example shows little histograms of the data, above each different tabular section). LaTeX is an example of software which allows you to create precisely formatted graphics within precisely formatted text.

However, I'm no expert on LaTeX, preferring what I've been used to for too many years, namely Unix 'troff' and its more recent GNU implementation 'groff'.

As a preliminary, you will need to get R to output a suitable data file, or a suitably composed data file with 'groff' formatting tags interspersed. The latter should not be difficult, though my own approach would be to simply take a data file of the form (for your first example as taken from your URL):

"% survival / standard error" "5 year" "10 year" "15 year" "20 year"
"Prostate" 98.8 0.4 95.2 0.9 87.1 1.7 81.3 3.0
"Thyroid" 96.0 0.8 95.8 1.2 94.0 1.6 95.4 2.1
"Testis" 94.7 1.1 94.0 1.3 91.1 1.8 88.2 2.3
[...]

(which would be very straightforward in R) and then use say 'awk' to compute 'groff' data with embedded tags (see below).

The file which I would then submit to 'groff' would look like

.ds RED "\X'ps: exec 1 0 0 setrgbcolor'
.ds GREY "\X'ps: exec 0.5 0.5 0.5 setrgbcolor'
.ds BLACK "\X'ps: exec 0 0 0 setrgbcolor'
.ds bx \x'-0.2m'\x'-0.2m'\v'0.2m'\Z'\
\*[RED]\D'P \\$1p 0 0 -1m -\\$1p 0 0 1m'\
'\
\Z'\
\h'\\$1p'\
\*[GREY]\D'P 0.5i-\\$1p 0 0 -1m \\$1p-0.5i 0 0 1m'\
'\h'0.5i'\
\v'-0.2m'\*[BLACK]
.LP
.TS
box tab(#);
c3 s1 s1w(0.5i) s s1 s1w(0.5i) s s1 s1w(0.5i) s s1 s1w(0.5i) s.

\f[BMB]\s[15]Estimated survival rates by cancer site\s0\fP

.T&
l c s s s s s s s s s s s.
#\fB\s[12]% survival / standard error\s0\fP #\_
.T&
l c s s c s s c s s c s s.
#5 year#10 year#15 year#20 year
#\_#\_#\_#\_
.T&
l n l n n c n n c n n c n.
Prostate#98.8#\*[bx 35.6]#0.4#95.2#\*[bx 34.3]#0.9#87.1#\
\*[bx 31.4]#1.7#81.3#\*[bx 29.3]#3.0
Thyroid#96.0#\*[bx 34.6]#0.8#95.8#\*[bx 34.5]#1.2#94.0#\
\*[bx 33.8]#1.6#95.4#\*[bx 34.3]#2.1
Testis#94.7#\*[bx 34.1]#1.1#94.0#\*[bx 33.8]#1.3#91.1#\
\*[bx 32.8]#1.8#88.2#\*[bx 31.8]#2.3

[...]
Pancreas#4.0#\*[bx 1.4]#0.5#3.0#\*[bx 1.1]#1.5#2.7#\
\*[bx 1.0]#0.6#2.7#\*[bx 1.0]#0.8

.TE

The key here is to define a "parametrised string" which will be invoked as "\*[bx <number>]". This is the main "embedded tag".

Each box is 0.5 inch wide (36 points), and consists of a lefthand section in Red whose width is 36*percent/100 points, with a righthand section in Grey whose width is 36*(1 - percent/100) points. The height of the box is 1 em (which, in points, is the point-size of the current font), and the box has been shifted downwards slightly (0.2m) to align it nicely with the text. The parameter "<number>" in "\*[bx <number>]" is the value of 36*percent/100. So this can, for instance, be easily computed in an 'awk' run.
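
The arithmetic just described can be sketched in a few lines. Python is used here only as a stand-in for the 'awk' run; the function names bx_param and grey_width are hypothetical, and the rounding to one decimal place is an assumption chosen to match the values appearing in the table rows.

```python
BOX_WIDTH_PT = 36.0  # 0.5 inch at 72 points per inch


def bx_param(percent, box_width_pt=BOX_WIDTH_PT):
    """Width in points of the Red (lefthand) section: 36*percent/100."""
    return f"{box_width_pt * percent / 100:.1f}"


def grey_width(percent, box_width_pt=BOX_WIDTH_PT):
    """Width in points of the Grey (righthand) section: 36*(1 - percent/100)."""
    return f"{box_width_pt * (1 - percent / 100):.1f}"


# Prostate, 5 year: 98.8 percent survival
print(bx_param(98.8))
print(grey_width(98.8))
```

For the Prostate 5-year figure of 98.8 this gives a "<number>" of 35.6, which agrees with the first box tag in the Prostate row of the table.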

The block of "code"

.ds bx \x'-0.2m'\x'-0.2m'\v'0.2m'\Z'\
\*[RED]\D'P \\$1p 0 0 -1m -\\$1p 0 0 1m'\
'\
\Z'\
\h'\\$1p'\
\*[GREY]\D'P 0.5i-\\$1p 0 0 -1m \\$1p-0.5i 0 0 1m'\
'\h'0.5i'\
\v'-0.2m'\*[BLACK]

defines the tag "\*[bx ...]", which is responsible for drawing the graphical item in the table wherever it is invoked. Initially it is padded above and below with a bit of extra space ("\x...") and moved down slightly ("\v'0.2m'"), then colour changes to Red and a filled Red polygon is drawn; then the drawing point is shifted and a filled Grey polygon is drawn. Finally the colour is changed back to Black for the text part of the Table. The value of "<number>" is substituted for "\\$1" wherever this occurs in the definition of "bx".
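
To see what the two "\D'P ...'" polygons cover, the relative moves in the definition can be traced into absolute corner points. This is only an illustrative sketch in Python, not part of the groff source: the helper names trace and bx_polygons are hypothetical, the 10-point em is an assumed font size, and groff's convention that positive vertical motion is downward is kept (so "-1m" means up).

```python
def trace(start, moves):
    """Turn a list of relative (dx, dy) moves into absolute corner points."""
    x, y = start
    points = [(round(x, 3), round(y, 3))]
    for dx, dy in moves:
        x, y = x + dx, y + dy
        points.append((round(x, 3), round(y, 3)))
    return points


def bx_polygons(w, em=10.0, box=36.0):
    """Corner points of the Red and Grey polygons for a bx tag.

    w   -- the bx parameter in points (36*percent/100)
    em  -- assumed font size in points (the unit 'm'); hypothetical value
    box -- total box width in points (0.5i)
    """
    # \D'P w 0  0 -1m  -w 0  0 1m': right by w, up 1m, left by w, back down
    red = trace((0.0, 0.0), [(w, 0), (0, -em), (-w, 0), (0, em)])
    # \h'w' first moves the drawing point to the right edge of the Red part
    grey = trace((w, 0.0), [(box - w, 0), (0, -em), (w - box, 0), (0, em)])
    return red, grey


red, grey = bx_polygons(35.6)
print(red)
print(grey)
```

The Grey polygon starts where the Red one ends and stops at the full 36 points. Each polygon sits inside "\Z'...'", which restores the drawing position afterwards, so the closing "\h'0.5i'" is what moves past the finished box.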

The line ".TS" leads in to a Table definition, which ends with ".TE". The next few lines specify table layout (types, spacings and widths of columns, cell separator "#", etc.); and then come the data for each line of the table, in which the box tag "\*[bx ...]" occurs where needed. As indicated above, the full table data could probably be easily computed in R and can certainly be easily done in 'awk' or 'perl'.
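
As a rough illustration of that conversion step, here is a minimal sketch that turns data-file lines of the form shown earlier into table data rows. It is written in Python rather than 'awk' or 'perl' purely for illustration; the function name groff_row is hypothetical, the one-decimal rounding is an assumption, and only data lines (a quoted name followed by percent/SE pairs) are handled, not the header line.

```python
import shlex


def groff_row(line, box_width_pt=36.0, sep="#"):
    """Convert one data-file line into a tbl data row with bx tags.

    Input fields: a quoted site name, then (percent, standard error) pairs.
    Output fields: name, then percent, box tag, standard error per pair.
    """
    name, *values = shlex.split(line)  # shlex strips the quotes from the name
    if len(values) % 2:
        raise ValueError("expected percent/SE pairs: " + line)
    fields = [name]
    for percent, se in zip(values[0::2], values[1::2]):
        width = box_width_pt * float(percent) / 100
        # Keep percent and SE exactly as written in the data file
        fields += [percent, f"\\*[bx {width:.1f}]", se]
    return sep.join(fields)


data = '''"Prostate" 98.8 0.4 95.2 0.9 87.1 1.7 81.3 3.0
"Thyroid" 96.0 0.8 95.8 1.2 94.0 1.6 95.4 2.1
"Testis" 94.7 1.1 94.0 1.3 91.1 1.8 88.2 2.3'''

rows = [groff_row(line) for line in data.splitlines()]
print("\n".join(rows))
```

Each row comes out on a single line; the rows in the groff file above carry the same fields, folded with a trailing backslash to keep the lines short.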

After all that, the result is quite pleasing -- and, when I compare it with the graph shown on Sam's URL, it seems to me to represent the numbers much more accurately, as well as being visually slightly more expressive.

It would also be quite feasible to "complicate" the graphics with indications of SE etc., by adding more to the definition of \*[bx ...].

I have looked at the "LaTeX file produced by latex.describe" for Frank Harrell's example. Granting that it has no doubt been automatically produced, it is enormous and, for practical purposes, uneditable if you want to tweak features of the display. It would be interesting to see what had to be done further back up the line to produce it; this might be, of course, much easier to tweak. On the other hand, my 'groff source' file above is compact and easily changed.

If anyone would like to look at the output I have produced by the above method (PDF file), and the full groff source file, drop me a line (I'll send them privately to Sam anyway).

Best wishes to all,
Ted.



E-Mail: (Ted Harding) <Ted.Harding@nessie.mcc.ac.uk> Fax-to-email: +44 (0)870 094 0861
Date: 01-Sep-06                                       Time: 15:56:46
------------------------------ XFMail ------------------------------

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Received on Sat Sep 02 07:48:59 2006
