Re: [R] Contingency tables from data.frames

From: Gabor Grothendieck <ggrothendieck_at_gmail.com>
Date: Wed 25 May 2005 - 08:21:25 EST

On 5/24/05, Jose Claudio Faria <joseclaudio.faria@terra.com.br> wrote:
> Dear list,
>
> I'm trying to write a set of generic functions to make contingency tables from
> data.frames. It is running nicely (I'm learning R), but I think it can be
> better.
>
> I would like to filter the data.frame, i.e., eliminate all non-numeric variables,
> and I don't know how to do it. Please help me.
>
> Below is one of my functions ('er' refers to EasieR, because I'm trying
> to build a package for myself and my students):
>
> #2. Tables from data.frames
> #2.1---er.table.df.br (User-defined breaks and right)------------
> er.table.df.br <- function(df,
>                            breaks = c('Sturges', 'Scott', 'FD'),
>                            right = FALSE) {
>
>   if (!is.data.frame(df))
>     stop('need "data.frame" data')
>
>   dim_df <- dim(df)
>
>   tmpList <- list()
>
>   for (i in 1:dim_df[2]) {
>
>     x <- as.matrix(df[ ,i])
>     x <- na.omit(x)
>
>     # Number of classes according to the chosen rule
>     k <- switch(breaks[1],
>                 'Sturges' = nclass.Sturges(x),
>                 'Scott' = nclass.scott(x),
>                 'FD' = nclass.FD(x),
>                 stop("'breaks' must be 'Sturges', 'Scott' or 'FD'"))
>
>     # Class limits: pad the observed range, then split it into k classes
>     tmp <- range(x)
>     classIni <- tmp[1] - tmp[2]/100
>     classEnd <- tmp[2] + tmp[2]/100
>     R <- classEnd-classIni
>     h <- R/k
>
>     # Absolute frequency
>     f <- table(cut(x, breaks = seq(classIni, classEnd, h), right = right))
>
>     # Relative frequency
>     fr <- f/length(x)
>
>     # Relative frequency, %
>     frP <- 100*(f/length(x))
>
>     # Cumulative frequency
>     fac <- cumsum(f)
>
>     # Cumulative frequency, %
>     facP <- 100*(cumsum(f/length(x)))
>
>     fi <- round(f, 2)
>     fr <- round(as.numeric(fr), 2)
>     frP <- round(as.numeric(frP), 2)
>     fac <- round(as.numeric(fac), 2)
>     facP <- round(as.numeric(facP),2)
>
>     # Table ('fi' is a table, so data.frame() expands it into two columns:
>     # the class labels and the counts)
>     res <- data.frame(fi, fr, frP, fac, facP)
>     names(res) <- c('Class limits', 'fi', 'fr', 'fr(%)', 'fac', 'fac(%)')
>     tmpList <- c(tmpList, list(res))
>   }
>   names(tmpList) <- names(df)
>   return(tmpList)
> }
>
> To try the function:
>
> #a) running nicely
> y1=rnorm(100, 10, 1)
> y2=rnorm(100, 58, 4)
> y3=rnorm(100, 500, 10)
> mydf=data.frame(y1, y2, y3)
> #tbdf=er.table.df.br (mydf, breaks = 'Sturges', right=F)
> #tbdf=er.table.df.br (mydf, breaks = 'Scott', right=F)
> tbdf=er.table.df.br (mydf, breaks = 'FD', right=F)
> print(tbdf)
>
>
> #b) One of the problems
> y1=rnorm(100, 10, 1)
> y2=rnorm(100, 58, 4)
> y3=rnorm(100, 500, 10)
> y4=rep(letters[1:10], 10)
> mydf=data.frame(y1, y2, y3, y4)
> tbdf=er.table.df.br (mydf, breaks = 'Scott', right=F)
> print(tbdf)
>

Try this, which returns a logical vector flagging the numeric columns:

sapply(my.data.frame, is.numeric)
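
A minimal sketch of how that logical vector might be used to drop the non-numeric columns, assuming a data frame like the one in example b) of the question. The names `num_cols` and `mydf_num` are introduced here only for illustration:

```r
# Rebuild a data frame like example b): three numeric columns, one letter column
set.seed(1)  # only so the sketch is reproducible
mydf <- data.frame(y1 = rnorm(100, 10, 1),
                   y2 = rnorm(100, 58, 4),
                   y3 = rnorm(100, 500, 10),
                   y4 = rep(letters[1:10], 10))

# Named logical vector: TRUE for numeric columns, FALSE otherwise
num_cols <- sapply(mydf, is.numeric)

# Keep only the numeric columns; drop = FALSE keeps the result a data.frame
# even when a single column survives
mydf_num <- mydf[, num_cols, drop = FALSE]

names(mydf_num)  # "y1" "y2" "y3"
```

The filtered `mydf_num` could then be passed to er.table.df.br in place of `mydf`.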

Also you might want to look up:

?match.arg
?stopifnot
?ncol
?sapply
?lapply
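
As a rough sketch of how those help pages could fit together (one possible arrangement, not the original poster's final code), the function below checks its input with stopifnot and ncol, resolves `breaks` with match.arg, filters columns with sapply, and builds one table per column with lapply. The name `freq_tables`, its column names, and the use of `include.lowest = TRUE` and `abs()` in the padding are assumptions made for this illustration:

```r
freq_tables <- function(df,
                        breaks = c("Sturges", "Scott", "FD"),
                        right = FALSE) {
  # Fail early on bad input
  stopifnot(is.data.frame(df), ncol(df) > 0)
  # Pick the first choice by default; error on anything not in the list
  breaks <- match.arg(breaks)

  # Keep numeric columns only
  num <- df[, sapply(df, is.numeric), drop = FALSE]

  # One frequency table per numeric column, returned as a named list
  lapply(num, function(x) {
    x <- x[!is.na(x)]
    k <- switch(breaks,
                Sturges = nclass.Sturges(x),
                Scott   = nclass.scott(x),
                FD      = nclass.FD(x))
    # Pad the range by 1% of the maximum, similar to the question's code
    r <- range(x)
    pad <- abs(r[2]) / 100
    brks <- seq(r[1] - pad, r[2] + pad, length.out = k + 1)
    f <- as.vector(table(cut(x, breaks = brks, right = right,
                             include.lowest = TRUE)))
    data.frame(class = levels(cut(x, breaks = brks, right = right,
                                  include.lowest = TRUE)),
               fi    = f,
               fr    = round(f / length(x), 2),
               frP   = round(100 * f / length(x), 2),
               fac   = cumsum(f),
               facP  = round(100 * cumsum(f) / length(x), 2))
  })
}

# Usage: the letter column is skipped instead of causing an error
set.seed(1)
mydf <- data.frame(y1 = rnorm(100, 10, 1), y4 = rep(letters[1:10], 10))
tb <- freq_tables(mydf, breaks = "Scott")
names(tb)  # "y1"
```

Using lapply over the filtered data frame avoids the explicit loop and the manual names() assignment, since the list inherits the column names.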

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Received on Wed May 25 08:34:50 2005
