[R] Tabulating Sparse Contingency Table

From: <born.to.b.wyld_at_gmail.com>
Date: Fri, 28 Mar 2008 19:16:15 -0500


I have a sparse contingency table (most cells are 0):

> xtabs(~.,data[,idx:(idx+4)])
, , x3 = 1, x4 = 1, x5 = 1

   x2
x1    1   2   3
  1   0   0  31
  2   0   0 112
  3   0   0  94

, , x3 = 2, x4 = 1, x5 = 1

   x2
x1 1 2 3
  1 0 0 0
  2 0 0 0
  3 0 0 0

, , x3 = 3, x4 = 1, x5 = 1

   x2
x1 1 2 3
  1 0 0 0
  2 0 0 0
  3 0 0 0

, , x3 = 1, x4 = 2, x5 = 1

   x2
x1 1 2 3
  1 0 0 0
  2 0 0 0
  3 0 0 0

, , x3 = 2, x4 = 2, x5 = 1

   x2
x1    1   2   3
  1   0   0   0
  2   0  18   0
  3   0  27   0

, , x3 = 3, x4 = 2, x5 = 1

   x2
x1 1 2 3
  1 0 0 0
  2 0 0 0
  3 0 0 0

, , x3 = 1, x4 = 3, x5 = 1

   x2
x1 1 2 3
  1 0 0 0
  2 0 0 0
  3 0 0 0

, , x3 = 2, x4 = 3, x5 = 1

   x2
x1 1 2 3
  1 0 0 0
  2 0 0 0
  3 0 0 0

, , x3 = 3, x4 = 3, x5 = 1

   x2
x1 1 2 3
  1 0 0 0
  2 1 0 0
  3 2 0 0

, , x3 = 1, x4 = 1, x5 = 2

   x2
x1    1   2   3
  1   0   0 142
  2   0   0 340
  3   0   0   1

, , x3 = 2, x4 = 1, x5 = 2

   x2
x1 1 2 3
  1 0 0 0
  2 0 0 0
  3 0 0 0

, , x3 = 3, x4 = 1, x5 = 2

   x2
x1 1 2 3
  1 0 0 0
  2 0 0 0
  3 0 0 0

, , x3 = 1, x4 = 2, x5 = 2

   x2
x1 1 2 3
  1 0 0 0
  2 0 0 0
  3 0 0 0

, , x3 = 2, x4 = 2, x5 = 2

   x2
x1    1   2   3
  1   0   4   0
  2   0  41   0
  3   0   0   0

, , x3 = 3, x4 = 2, x5 = 2

   x2
x1 1 2 3
  1 0 0 0
  2 0 0 0
  3 0 0 0

, , x3 = 1, x4 = 3, x5 = 2

   x2
x1 1 2 3
  1 0 0 0
  2 0 0 0
  3 0 0 0

, , x3 = 2, x4 = 3, x5 = 2

   x2
x1 1 2 3
  1 0 0 0
  2 0 0 0
  3 0 0 0

, , x3 = 3, x4 = 3, x5 = 2

   x2
x1 1 2 3
  1 0 0 0
  2 0 0 0
  3 0 0 0

, , x3 = 1, x4 = 1, x5 = 3

   x2
x1    1   2   3
  1   0   0 173
  2   0   0   4
  3   0   0   0

, , x3 = 2, x4 = 1, x5 = 3

   x2
x1 1 2 3
  1 0 0 0
  2 0 0 0
  3 0 0 0

, , x3 = 3, x4 = 1, x5 = 3

   x2
x1 1 2 3
  1 0 0 0
  2 0 0 0
  3 0 0 0

, , x3 = 1, x4 = 2, x5 = 3

   x2
x1 1 2 3
  1 0 0 0
  2 0 0 0
  3 0 0 0

, , x3 = 2, x4 = 2, x5 = 3

   x2
x1 1 2 3
  1 0 0 0
  2 0 0 0
  3 0 0 0

, , x3 = 3, x4 = 2, x5 = 3

   x2
x1 1 2 3
  1 0 0 0
  2 0 0 0
  3 0 0 0

, , x3 = 1, x4 = 3, x5 = 3

   x2
x1 1 2 3
  1 0 0 0
  2 0 0 0
  3 0 0 0

, , x3 = 2, x4 = 3, x5 = 3

   x2
x1 1 2 3
  1 0 0 0
  2 0 0 0
  3 0 0 0

, , x3 = 3, x4 = 3, x5 = 3

   x2
x1 1 2 3
  1 0 0 0
  2 0 0 0
  3 0 0 0

Now, I can do the following to get the sparse representation 'y' for the table above:

> idx<-2
> y<-as.data.frame.table(xtabs(~.,data[,idx:(idx+4)]))
> y<-y[y$Freq>0,]
> y<-y[order(y$Freq,decreasing=TRUE),]
> y

    x1 x2 x3 x4 x5 Freq
89   2  3  1  1  2  340
169  1  3  1  1  3  173
88   1  3  1  1  2  142
8    2  3  1  1  1  112
9    3  3  1  1  1   94
122  2  2  2  2  2   41
7    1  3  1  1  1   31
42   3  2  2  2  1   27
41   2  2  2  2  1   18
121  1  2  2  2  2    4
170  2  3  1  1  3    4
75   3  1  3  3  1    2
74   2  1  3  3  1    1
90   3  3  1  1  2    1

I am wondering if there is an R function, or a simple R routine, that would let me build the data frame 'y' without going through 'xtabs'. I need to study contingency tables of 20 (or even more) dimensions, and R is unable to store a full 3^20 contingency table (about 3.5 billion cells). But since the tables of interest are highly sparse, I figure the problem would be much simpler if I had something that creates the sparse representation directly.
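To put a rough number on the size problem, the arithmetic can be checked in R itself. This is only a back-of-the-envelope sketch: it assumes 20 factors with 3 levels each and 4-byte integer counts, and `n_cells` is just an illustrative name. As far as I know, R vectors at the time of this post could hold at most 2^31 - 1 elements, which the full table exceeds.

```r
# Number of cells in a full table of 20 three-level factors
n_cells <- 3^20
n_cells                          # 3486784401

# More cells than the largest R integer (2^31 - 1)
n_cells > .Machine$integer.max   # TRUE

# Approximate storage in GiB, assuming 4 bytes per integer count
n_cells * 4 / 2^30
```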

Any help or suggestions would be greatly appreciated.

Thanks,
A
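One possible way to get a data frame like 'y' without ever forming the full table is to count identical rows of the raw data directly. The following is a minimal sketch, not a definitive answer: `sparse_counts` and `dat` are hypothetical names, and it assumes the observations sit in a data frame with one row per observation and one column per factor. It uses only base R (`paste`, `duplicated`, `match`, `tabulate`, `order`).

```r
# Count the distinct rows of a data frame without building the full table.
# 'sparse_counts' is a hypothetical helper name, not an existing R function.
sparse_counts <- function(df) {
  # One string key per row; identical rows produce identical keys
  key <- do.call(paste, c(df, sep = "\r"))
  # TRUE at the first occurrence of each distinct row
  first <- !duplicated(key)
  # Keep one representative row per observed cell
  out <- df[first, , drop = FALSE]
  # Count how many rows map onto each representative
  out$Freq <- tabulate(match(key, key[first]))
  # Sort by decreasing frequency, as in the post
  out <- out[order(out$Freq, decreasing = TRUE), , drop = FALSE]
  rownames(out) <- NULL
  out
}

# Small made-up data set: three distinct cells observed 3, 2 and 1 times
cells <- data.frame(x1 = c(2, 1, 3), x2 = c(3, 3, 1), x3 = c(1, 1, 3),
                    x4 = c(1, 1, 3), x5 = c(2, 3, 1))
dat <- cells[rep(1:3, times = c(3, 2, 1)), ]
dat <- dat[c(4, 1, 6, 2, 5, 3), ]   # mix up the row order

y <- sparse_counts(dat)
y
```

Only the cells that actually occur are stored, so memory use grows with the number of distinct rows rather than with 3^20. Unlike the 'xtabs' route, the result carries no row names that encode the cell's position in the full table.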




R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.

Received on Sat 29 Mar 2008 - 00:19:36 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Sat 29 Mar 2008 - 09:30:25 GMT.

