Re: [R] R for simple stats


From: Frank E Harrell Jr (fharrell@virginia.edu)
Date: Sat 29 Jun 2002 - 04:57:45 EST


Message-id: <20020628145745.076e9488.fharrell@virginia.edu>

You might also take a look at some functions in the Hmisc library, e.g.:

library(Hmisc)   # provides describe(), summary.formula(), summarize(), llist()
set.seed(1)
x <- runif(1000)
g <- factor(sample(letters[1:4], 1000, replace = TRUE))
describe(x)

x
      n missing  unique    Mean     .05     .10     .25     .50     .75     .90     .95
   1000       0    1000  0.5043 0.06128 0.11650 0.26521 0.50441 0.74055 0.90252 0.95984

lowest : 0.003536 0.004208 0.004228 0.006153 0.006443
highest: 0.998321 0.998607 0.998766 0.999014 0.999439

options(digits = 3)
# s returns a named vector; the names become the column headings below
s <- function(y) c(Mean = mean(y), Median = median(y), SD = sd(y))
summary(x ~ g, fun = s)

x N=1000

+-------+-+----+-----+------+-----+
|       | |N   |Mean |Median|SD   |
+-------+-+----+-----+------+-----+
|g      |a| 254|0.495|0.469 |0.283|
|       |b| 243|0.523|0.533 |0.294|
|       |c| 249|0.495|0.481 |0.278|
|       |d| 254|0.505|0.514 |0.289|
+-------+-+----+-----+------+-----+
|Overall| |1000|0.504|0.504 |0.286|
+-------+-+----+-----+------+-----+

summarize(x, g, s)   # to cross-classify, replace g with llist(g1, g2)

  g     x Median    SD    # the x column holds the Mean
1 a 0.495  0.469 0.283
2 b 0.523  0.533 0.294
3 c 0.495  0.481 0.278
4 d 0.505  0.514 0.289
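
If Hmisc is not at hand, roughly the same per-group table can be built in base R. This is only a minimal sketch, assuming the same x, g and s as above; by_group is just an illustrative name. With a current version of R the random draws, and hence the numbers, may not match the output shown in this message.

```r
set.seed(1)
x <- runif(1000)
g <- factor(sample(letters[1:4], 1000, replace = TRUE))

# same summary function as above: a named vector of statistics
s <- function(y) c(Mean = mean(y), Median = median(y), SD = sd(y))

# split() gives one vector of x per level of g; sapply() applies s to each,
# returning statistics in rows and groups in columns, so transpose with t()
by_group <- t(sapply(split(x, g), s))

round(by_group, 3)
```

The result is a plain numeric matrix with one row per level of g, so it can be passed to as.data.frame() or indexed like any other matrix.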

Frank Harrell

On Fri, 28 Jun 2002 11:21:32 -0700
Brett Magill <bmagill@earthlink.net> wrote:

> The attached code defines a function for descriptive statistics called
> dstats. Pass it the column you want to summarize and dstats will
> produce a nice summary. If you have a data frame of numeric variables and
> want to summarize by column, you can use something like:
>
> apply(data.frame.name,2,dstats)
>
> Wrap t( ) around the above to get the output in a format that I find more
> usable.
>
> Brett
>
>
>
> dstats <- function(x, na.rm = TRUE, digits = 3) {
>
>   # collect the statistics in one named vector
>   out <- c(mean     = mean(x, na.rm = na.rm),
>            sd       = sd(x, na.rm = na.rm),
>            variance = var(x, na.rm = na.rm),
>            min      = min(x, na.rm = na.rm),
>            max      = max(x, na.rm = na.rm),
>            unique   = length(unique(x)),   # counts NA as a value if present
>            n        = sum(!is.na(x)),      # non-missing observations
>            miss     = sum(is.na(x)))       # missing observations
>
>   round(out, digits = digits)
> }
> -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
> r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
> Send "info", "help", or "[un]subscribe"
> (in the "body", not the subject !) To: r-help-request@stat.math.ethz.ch
> _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._

-- 
Frank E Harrell Jr              Prof. of Biostatistics & Statistics
Div. of Biostatistics & Epidem. Dept. of Health Evaluation Sciences
U. Virginia School of Medicine  http://hesweb1.med.virginia.edu/biostat



This archive was generated by hypermail 2.1.3 : Wed 16 Oct 2002 - 11:57:34 EST