[R] Testing Two Categorical Variable

From: <ramin.1981_at_gmail.com>
Date: Thu, 06 Dec 2007 14:05:38 -0800


The chi-square test of independence does not require your two categorical variables to have the same number of levels, and it places no limit on the number of levels either variable can have.

The chi-square statistic is computed as follows:
\chi^2 = \sum_{\text{all cells}} \frac{(\text{Observed} - \text{Expected})^2}{\text{Expected}}

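Expected count for cell (i, j): E_{ij} = n \cdot \frac{R_i}{n} \cdot \frac{C_j}{n} = \frac{R_i C_j}{n}, where R_i is the i-th row total, C_j is the j-th column total, and n is the grand total.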

Degrees of freedom: df = (r - 1)(c - 1), where r is the number of rows and c is the number of columns.

If the calculations are done correctly, this procedure should not give you any errors. I usually use SAS for calculations like this. Below is sample code I wrote to test whether US state and blood type are independent. You can adapt it to your data, and it should run without errors.

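data bloodtype;
  /* the trailing @@ lets SAS read several observations from one input line */
  input bloodtype $ state $ count @@;
  datalines;
A FL 122 B FL 117
AB FL 19 O FL 244
A IA 1781 B IA 351
AB IA 289 O IA 3301
A MO 353 B MO 269
AB MO 60 O MO 713
;
run;

proc freq data=bloodtype;
  /* each record holds a cell count, so weight the table by it */
  weight count;
  tables bloodtype*state
    / cellchi2 chisq expected norow nocol nopercent;
run;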
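
To check the formulas above by hand against the same counts, here is a minimal sketch in plain Python (not SAS, and not part of my original code). The function name chi_square_independence is just an illustrative choice. It computes only the statistic, the degrees of freedom and the expected counts; it does not compute a p-value.

```python
# Observed counts from the data step above.
# Rows are blood types; columns are states in the order FL, IA, MO.
observed = {
    "A":  [122, 1781, 353],
    "B":  [117, 351, 269],
    "AB": [19, 289, 60],
    "O":  [244, 3301, 713],
}


def chi_square_independence(table):
    """Return (chi2, df, expected) for a list of equal-length rows of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    # E_ij = (row total i) * (column total j) / n
    expected = [[r * c / n for c in col_totals] for r in row_totals]

    # Sum (Observed - Expected)^2 / Expected over all cells.
    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )

    # df = (rows - 1) * (columns - 1)
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df, expected


table = list(observed.values())
chi2, df, expected = chi_square_independence(table)
print(f"chi-square = {chi2:.2f}, df = {df}")
```

The two variables have different numbers of levels here (4 blood types, 3 states), which the calculation handles without any special treatment.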

Best
Ramin



R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Received on Thu 06 Dec 2007 - 22:28:05 GMT

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.2.0, at Thu 06 Dec 2007 - 22:30:17 GMT.