Hi Marc,

many thanks, that is exactly what I was looking for.

Best, Sven

----- Original Message -----
From: Marc Schwartz <marc_schwartz_at_comcast.net>
To: svga_at_arcor.de
Date: 29.07.2008 17:15
Subject: Re: [R] Most often pairs of chars across grouping variable

> on 07/29/2008 09:51 AM svga@arcor.de wrote:

> > is there a package or function to compute the frequencies of pairs of
> > chars in a variable across a grouping variable? Eg:
> >
> > d <- data.frame(ID=gl(2,3), F=c("A","B","C","A","C","D"))
> > > d
> >   ID F
> > 1  1 A
> > 2  1 B
> > 3  1 C
> > 4  2 A
> > 5  2 C
> > 6  2 D
> >
> > Now I want to summarize the frequencies of all pairs A-B, A-C, A-D,
> > B-C, B-D, C-D across ID:
> >
> >   A B C D
> > A - 1 2 1
> > B - - 1 0
> > C - - - 1
> >
> > here, the combination A-C is most frequent. The real problem behind
> > that is that 'F' codes diagnoses and I search for the most often
> > pairs of diagnoses.
> >
> > Thanks, Sven
>
> I suspect that there might be something over in Bioconductor, but here
> is one approach:
>
> > table(data.frame(t(do.call(cbind,
>                            tapply(d$F, d$ID,
>                                   function(x) combn(as.character(x), 2))))))
>    X2
> X1  B C D
>   A 1 2 1
>   B 0 1 0
>   C 0 0 1
>
> See ?combn to create the initial pairs from the data. This is done on a
> per ID basis using tapply. The result is transposed into a data frame
> and then table() is used to create the cross tabulation of the results.
>
> Marc Schwartz
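
For reference, the one-liner quoted above can be unrolled into separate steps. This is only a sketch of the same idea, not a different method. The intermediate names (pairs_by_id, pairs, tab, counts, top) are made up for illustration. The sort(unique(...)) call is an addition that is not in the original reply: it assumes that a pair should be counted once per ID and that A-C and C-A are the same pair. Note also that combn() stops with an error for an ID that has fewer than two distinct codes, so such IDs would need to be filtered out first.

```r
# Example data from the question: two IDs with three codes each
d <- data.frame(ID = gl(2, 3), F = c("A", "B", "C", "A", "C", "D"))

# Step 1: for each ID, build all unordered pairs of its codes.
# combn() returns a 2 x n matrix per ID, one pair per column.
# sort(unique(...)) is an added assumption (see the text above).
pairs_by_id <- tapply(d$F, d$ID,
                      function(x) combn(sort(unique(as.character(x))), 2))

# Step 2: bind the per-ID matrices side by side and transpose,
# so that each row holds one pair.
pairs <- t(do.call(cbind, pairs_by_id))

# Step 3: cross-tabulate the first against the second member of each pair.
tab <- table(first = pairs[, 1], second = pairs[, 2])
print(tab)

# Step 4: extract the most frequent pair(s) from the table.
counts <- as.data.frame(tab, stringsAsFactors = FALSE)
top <- counts[counts$Freq == max(counts$Freq), ]
print(top)
```

For the example data, step 4 picks out the A-C pair, matching the table in the quoted reply.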
