Re: [R] Hmisc summary.formula formats for binary and continuous variables

From: Joshua Wiley <jwiley.psych_at_gmail.com>
Date: Sun, 27 Mar 2011 02:44:15 -0700

I played around with this for a while and did not get very far. I did not see any arguments in summary.formula or its print methods to reorder the count and percentage (happy to be corrected). Another approach I toyed with was to write a custom function to pass to summary.formula() that would itself create (something like) the desired output.

foo <- function(x) {
  n <- length(x)
  # Percent of the total: n / 5 == 100 * n / 500, so this is valid only for N = 500
  pct <- n / 5
  c(FOO = paste(n, "(", round(pct, digits = 0), "%)", sep = ""))
}
> summary(treatment ~ sex + age, fun = foo, method = "response")
treatment N=500

+-------+-----------+---+---------+
|       |           |N  |FOO      |
+-------+-----------+---+---------+
|sex    |f          |273|273(55%) |
|       |m          |227|227(45%) |
+-------+-----------+---+---------+
|age    |[36.8,46.7)|125|125(25%) |
|       |[46.7,50.0)|125|125(25%) |
|       |[50.0,53.3)|125|125(25%) |
|       |[53.3,67.5]|125|125(25%) |
+-------+-----------+---+---------+
|Overall|           |500|500(100%)|
+-------+-----------+---+---------+
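
A note on foo(): the n/5 is a percentage only because N = 500 in this example. Below is a minimal sketch of a variant that is told the total instead of hard-coding it. make_foo is a hypothetical name of my own, not anything in Hmisc, and I have only checked the helper on its own, not every way summary.formula might call it.

```r
# Hypothetical helper (not part of Hmisc): build a summary function that
# knows the overall sample size, so the percentage is not tied to N = 500.
make_foo <- function(total) {
  function(x) {
    n <- length(x)            # number of observations in this stratum
    pct <- 100 * n / total    # share of the overall sample, in percent
    c(FOO = paste0(n, " (", round(pct), "%)"))
  }
}

foo500 <- make_foo(500)
foo500(seq_len(273))  # 273 of 500 observations
```

Presumably this could then be passed as fun = make_foo(length(treatment)) in the summary() call above, with method = "response" as before.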

However, this custom-function approach does not work with method = "reverse". It would also seem to require either one very flexible function or a separate function for each situation you come across. Looking at print.summary.formula.reverse, the magic seems to happen on lines 47-50:

            cs <- formatCats(stats[[i]], nam, tr, type[i], if (length(x$group.freq))
                x$group.freq
            else x$n[i], npct, pctdig, exclude1, long, prtest,
                pdig = pdig, eps = eps)

which led me to explore formatCats(). A small tweak to the order of the paste() call on lines 25-33 (and creating a copy of the altered version, plus print.summary.formula.reverse, in the global environment) got me:

print.summary.formula.reverse(summary(treatment ~ sex + age, method="reverse"))

Descriptive Statistics by treatment

+-------+--------------+--------------+
|       |Drug          |Placebo       |
|       |(N=262)       |(N=238)       |
+-------+--------------+--------------+
|sex : m|  (118) 45%   |  (114) 48%   |
+-------+--------------+--------------+
|age    |46.5/50.0/53.8|46.6/49.5/52.6|
+-------+--------------+--------------+

which has the percentage info on the right side, though I did not take the time to get the parentheses moved over. Still, it seems like adding an argument that just flipped the order might not take that much work/code.
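
For comparison, here is a rough sketch of building the requested "35 (10%)" and "1.5 (1.2-1.8)" strings directly in base R, without going through summary.formula at all. The helper names n_pct and med_iqr are made up for illustration (they are not Hmisc functions), the seed is arbitrary, and the result is a plain character matrix rather than an Hmisc summary object, so the Hmisc latex method for summary.formula would not apply to it.

```r
set.seed(1)  # arbitrary seed, only so the sketch is reproducible
sex <- factor(sample(c("m", "f"), 500, replace = TRUE))
age <- rnorm(500, 50, 5)
treatment <- factor(sample(c("Drug", "Placebo"), 500, replace = TRUE))

# Hypothetical helper: count first, then percentage, e.g. "35 (10%)".
n_pct <- function(x, level) {
  n <- sum(x == level, na.rm = TRUE)   # observations at the given level
  denom <- sum(!is.na(x))              # non-missing observations
  sprintf("%d (%.0f%%)", n, 100 * n / denom)
}

# Hypothetical helper: median with quartiles, e.g. "1.5 (1.2-1.8)".
med_iqr <- function(x, digits = 1) {
  q <- quantile(x, c(0.25, 0.5, 0.75), na.rm = TRUE, names = FALSE)
  q <- formatC(q, format = "f", digits = digits)
  paste0(q[2], " (", q[1], "-", q[3], ")")
}

# One row per variable, one column per treatment group.
tab <- rbind(
  "sex : m" = sapply(split(sex, treatment), n_pct, level = "m"),
  "age"     = sapply(split(age, treatment), med_iqr)
)
print(tab, quote = FALSE)
```

This trades the Hmisc print and latex machinery for full control over the cell format, so it is probably only worth it if the table is small.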

Cheers,

Josh

(Though I cannot help but wonder if, in response to "I want to cross the street", I just said "we could start building a two-lane, underground tunnel with...." and someone is probably going to come along and point out the crosswalk 10 feet down the street)

On Sat, Mar 26, 2011 at 11:09 PM, Kwok, Heemun <hkwok_at_emedharbor.edu> wrote:
>
> Hello,
> I am using Hmisc summary.formula, latex and Sweave to produce tables for publication.  Is it possible to change the formats for binary and continuous variables?  I would prefer to show 35 (10%) and 1.5 (1.2-1.8) rather than 10% (35) and 1.2 / 1.5 / 1.8. Here is a simple example:
>
> sex <- factor(sample(c("m","f"), 500, rep=TRUE))
> age <- rnorm(500, 50, 5)
> treatment <- factor(sample(c("Drug","Placebo"), 500, rep=TRUE))
>
> s1 <- summary(~sex + age)
> s2 <- summary(treatment ~ sex + age, method="reverse")
> print(s1); print(s2)
>
> Descriptive Statistics  (N=500)
>
> +-------+-----------------+
> |       |                 |
> +-------+-----------------+
> |sex : m|    46% (232)    |
> +-------+-----------------+
> |age    |47.22/50.31/53.37|
> +-------+-----------------+
>
>
>
> Descriptive Statistics by treatment
>
> +-------+-----------------+-----------------+
> |       |Drug             |Placebo          |
> |       |(N=257)          |(N=243)          |
> +-------+-----------------+-----------------+
> |sex : m|    47% (122)    |    45% (110)    |
> +-------+-----------------+-----------------+
> |age    |47.35/50.00/52.68|46.78/50.92/53.97|
> +-------+-----------------+-----------------+
>
> Thanks,
> Heemun
>
>
> -------------------------------------------------
> Heemun Kwok, M.D.
> Research Fellow
> Harbor-UCLA Department of Emergency Medicine
> 1000 West Carson Street, Box 21
> Torrance, CA 90509-2910
> office 310-222-3501, fax 310-212-6101
>
> ______________________________________________
> R-help_at_r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
-- 
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/

Received on Sun 27 Mar 2011 - 09:49:09 GMT

