Re: [R] Most common level of a factor by

From: Douglas Bates <bates_at_stat.wisc.edu>
Date: Sat, 30 Aug 2008 11:24:40 -0500

On Fri, Aug 29, 2008 at 6:46 PM, David Huffer <David.Huffer_at_csosa.gov> wrote:
> I'm looking for something along the lines of
>
> which ( table ( x ) == max ( table ( x ) ) )
>
> to find the most common level of one factor
> by several other factors. For instance, I've got

> > X <- data.frame (
> + x = factor ( sample ( c ( "A" , "B" , "C" , "D" ) , 20 , r = T ) )
> + , z1 = factor ( sample ( c ( "Before" , "After" ) , 20 , r = T ) )
> + , z2 = factor ( sample ( c ( "Red" , "Green" , "Blue" ) , 20 , r = T ) )
> + , z3 = factor ( sample ( 0:6 , 20 , r = T ) )
> + )
> > X
> x z1 z2 z3
> 1 D After Blue 0
> 2 D Before Green 3
> 3 A Before Red 5
> 4 C After Green 6
> 5 C Before Green 6
> 6 C Before Green 0
> 7 C Before Red 1
> 8 C Before Red 5
> 9 A Before Blue 3
> 10 A After Green 4
> 11 D After Red 3
> 12 C After Green 5
> 13 A After Red 0
> 14 B After Red 6
> 15 B Before Red 3
> 16 A Before Blue 4
> 17 B Before Blue 5
> 18 A After Blue 1
> 19 B Before Green 1
> 20 C Before Red 2
> >
> and I would like to be able to say which category of x was the
> most common for each combination of z1, z2, and z3. So, here,
> which category of x was the most common for Before,Red,0;
> Before,Red,1; ... Before,Red,6; Before,Green,0; Before,Green,1;
> ... Before,Green,6;...

> This seems rather simple as I type it out, but I haven't been
> able to come up with the right approach so far. It's Friday night,
> so maybe I should just go home and wait until Monday...

A general approach is to create the interaction of the factors z1, z2, and z3, then cross-tabulate x by this combined factor, and then apply which.max to the columns (or rows, depending on how you do the cross-tabulation) of the table. Note that which.max returns the first maximum, so ties go to the earliest level of x. See the (I hope) enclosed.
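The steps described above can be sketched roughly as follows. This is only an illustration of the interaction / cross-tabulation / which.max idea, not the enclosure itself. It rebuilds X from the 20 rows printed in the question; the names grp, tab, and most_common are made up for the example, and drop = TRUE is an added assumption so that combinations of z1, z2, z3 that never occur do not show up as all-zero columns (for which which.max would misleadingly report the first level).

```r
# Rebuild the data frame from the 20 rows shown in the original post.
X <- data.frame(
  x  = factor(c("D","D","A","C","C","C","C","C","A","A",
                "D","C","A","B","B","A","B","A","B","C")),
  z1 = factor(c("After","Before","Before","After","Before",
                "Before","Before","Before","Before","After",
                "After","After","After","After","Before",
                "Before","Before","After","Before","Before")),
  z2 = factor(c("Blue","Green","Red","Green","Green",
                "Green","Red","Red","Blue","Green",
                "Red","Green","Red","Red","Red",
                "Blue","Blue","Blue","Green","Red")),
  z3 = factor(c(0,3,5,6,6,0,1,5,3,4,3,5,0,6,3,4,5,1,1,2))
)

# 1. Combine z1, z2, z3 into one factor. drop = TRUE (an assumption of
#    this sketch) keeps only the combinations that actually occur.
grp <- with(X, interaction(z1, z2, z3, drop = TRUE))

# 2. Cross-tabulate x (rows) by the combined factor (columns).
tab <- table(X$x, grp)

# 3. For each column, find the row with the largest count and map the
#    index back to a level of x. Ties go to the first level.
most_common <- rownames(tab)[apply(tab, 2, which.max)]
names(most_common) <- colnames(tab)
most_common
```

With only 20 rows almost every combination holds a single observation, so the result is mostly just that row's x. The one tie in this data is Before, Red, 5 (rows 3 and 8, levels A and C), where which.max reports A because it comes first.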



R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Received on Sat 30 Aug 2008 - 16:30:43 GMT
