Re: [R] String frequencies in rows

From: Liaw, Andy <andy_liaw_at_merck.com>
Date: Thu 27 Jul 2006 - 01:18:23 EST


It's usually faster to operate on the columns of a data frame than on its rows (a data frame is stored as a list of columns), so transposing first, as in the following, might help:

R> x
  G1 G2 G3 G4
1 AA BB AB AB
2 BB AB AB AA
3 AC CC AC AA
4 BB BB BB BB
R> xt <- as.data.frame(t(x))
R> sapply(xt, table)
$`1`

AA AB BB
 1  2  1

$`2`

AA AB BB
 1  2  1

$`3`

AA AC CC
 1  2  1

$`4`

BB
 4
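
For anyone who wants to run this from scratch, here is a minimal, self-contained sketch of the same idea. It assumes the G1..G4 columns are plain character vectors, and it swaps sapply() for lapply(), since sapply() would simplify the result to a matrix whenever every row happens to have the same number of distinct strings. The names counts and b1, and the column names "string" and "freq", are only illustrative and are not from the original exchange.

```r
# Rebuild the example data from the post (character columns assumed).
x <- data.frame(
  G1 = c("AA", "BB", "AC", "BB"),
  G2 = c("BB", "AB", "CC", "BB"),
  G3 = c("AB", "AB", "AC", "BB"),
  G4 = c("AB", "AA", "AA", "BB"),
  stringsAsFactors = FALSE
)

# Transpose so that each original row becomes a column, then tabulate
# column by column. lapply() always returns a list, one table per row
# of x, named after the row names of x ("1", "2", ...).
xt <- as.data.frame(t(x), stringsAsFactors = FALSE)
counts <- lapply(xt, table)

# Turn the counts for one row into a data frame, similar to `b` in the
# original loop. The column names are hypothetical, chosen for clarity.
b1 <- as.data.frame(counts[["1"]], stringsAsFactors = FALSE)
names(b1) <- c("string", "freq")
```

Each element of counts can then be processed in turn (for example with another lapply() call) instead of subsetting the data frame one row at a time.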

Andy

From: Mario Falchi
>
> Hi All,
>
> I’m trying to evaluate the frequency of different strings
> in each row of a data.frame:
> INPUT:
> ID G1 G2 G3 G4 … GN
> 1  AA BB AB AB …
> 2  BB AB AB AA …
> 3  AC CC AC AA …
> 4  BB BB BB BB …
>
> The number of different strings can vary in each row.
>
> My solution has been:
> for (i in seq_len(nrow(INPUT))) {
>   # count the strings in row i (columns 2:5 hold G1..G4 here)
>   b <- as.data.frame(table(t(INPUT[i, 2:5])))
>   # <some operations using the string values and frequencies>
>   # e.g. b for i == 1 is:
>   #   AA 1
>   #   AB 2
>   #   BB 1
> }
>
> However, my data frame contains thousands of rows and this script
> takes a long time.
> Could someone suggest a faster way?
>
> Thank you very much,
> Mario Falchi



R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Received on Thu Jul 27 01:28:27 2006

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.1.8, at Thu 27 Jul 2006 - 02:16:49 EST.
