From: Petr PIKAL <petr.pikal_at_precheza.cz>

Date: Fri, 7 Dec 2007 08:46:20 +0100

R-help_at_r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Received on Fri 07 Dec 2007 - 07:50:38 GMT

Hi

Well, R does exactly what it says. From the help page:

"Otherwise, x and y must be vectors or factors of the same length"

I do not know SAS but I presume that

> tables bloodtype*state

gives you something like

tab <- table(bloodtype, state)

and

chisq.test(tab)

should give you the expected result. You can also call chisq.test(bloodtype, state) directly. What you cannot do is test vectors of unequal lengths, and that is what the original poster did. I believe you cannot do that in SAS either.
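To make the distinction between length and number of levels concrete, here is a minimal sketch. The data are simulated, and the seed, the sample size n, and the level labels are all assumptions chosen only for illustration; the 5 and 7 levels mirror the original question.

```r
set.seed(42)  # arbitrary seed, only so the sketch is repeatable
n <- 200      # hypothetical number of observations

# One (x, y) pair per observation: same length, different numbers of levels
x <- factor(sample(letters[1:5], n, replace = TRUE))  # 5 levels
y <- factor(sample(LETTERS[1:7], n, replace = TRUE))  # 7 levels

length(x) == length(y)  # must be TRUE for chisq.test(x, y)
nlevels(x)              # number of levels may differ ...
nlevels(y)              # ... between the two factors

# Cross-tabulate, then test; both calls should agree
tab <- table(x, y)
res_tab <- chisq.test(tab)
res_vec <- chisq.test(x, y)
all.equal(res_tab$statistic, res_vec$statistic)
```

Both calls may warn about the chi-squared approximation if some expected counts are small; that is separate from the length requirement.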

> x<-sample(letters[1:3], 10, replace=T)
> x
 [1] "c" "a" "c" "c" "a" "c" "a" "c" "a" "a"
> y<-sample(1:5, 20, replace=T)
> y
 [1] 2 5 1 1 2 5 2 3 1 5 5 5 1 5 5 3 2 2 5 1
> chisq.test(x,y)
Error in chisq.test(x, y) : 'x' and 'y' must have the same length
> x<-sample(letters[1:3], 20, replace=T)
> chisq.test(x,y)

        Pearson's Chi-squared test

data:  x and y
X-squared = 4.7937, df = 6, p-value = 0.5705

Warning message:
In chisq.test(x, y) : Chi-squared approximation may be incorrect
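The warning in the session above appears when some expected cell counts are small (below 5), which is nearly unavoidable with 20 observations spread over this many cells. One way to deal with it, sketched here on freshly simulated data (the seed and the number of replicates B are arbitrary assumptions), is to ask chisq.test for a Monte Carlo p-value:

```r
set.seed(1)  # arbitrary seed, only so the sketch is repeatable
x <- sample(letters[1:3], 20, replace = TRUE)
y <- sample(1:5, 20, replace = TRUE)

tab <- table(x, y)

# Expected counts under independence, computed from the margins
expected <- outer(rowSums(tab), colSums(tab)) / sum(tab)
any(expected < 5)  # small expected counts make the approximation doubtful

# Monte Carlo p-value instead of the chi-squared approximation
sim <- chisq.test(tab, simulate.p.value = TRUE, B = 10000)
sim$p.value
```

With the simulated p-value no reference to the chi-squared distribution is made, so the warning does not apply.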

Regards

Petr

r-help-bounces_at_r-project.org wrote on 06.12.2007 23:09:24:

> The chi-square test does not require your two categorical variables to
> have the same number of levels, nor is there any limit on the number of
> levels.
>
> The chi-square procedure is as follows:
>
>   \chi^2 = \sum_{\text{all cells}} \frac{(\text{Observed} - \text{Expected})^2}{\text{Expected}}
>
>   E_{ij} = n \left(\frac{\text{row total}_i}{n}\right) \left(\frac{\text{column total}_j}{n}\right)
>
>   df = (\text{rows} - 1)(\text{columns} - 1)
>
> Done this way, you should not get any errors as long as your calculations
> are correct. I usually use SAS for calculations like this. Below is
> sample code I wrote to test whether US state and blood type are
> independent. You can modify it for your data, and it should give you no
> error.

>
> data bloodtype;
>   input bloodtype$ state$ count@@;
>   datalines;
> A FL 122 B FL 117
> AB FL 19 O FL 244
> A IA 1781 B IA 351
> AB IA 289 O IA 3301
> A MO 353 B MO 269
> AB MO 60 O MO 713
> ;
> proc freq data=bloodtype;
>   tables bloodtype*state
>     / cellchi2 chisq expected norow nocol nopercent;
>   weight count;
> quit;
>
> Best
> Ramin
> Gainesville
>
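For comparison, here is a rough R counterpart of the quoted SAS step, offered as a sketch and not as a line-by-line translation. It assumes that xtabs() with a count on the left-hand side plays the role of the weight statement. It then checks the quoted formulas by hand, using the i-th row total and the j-th column total for the expected counts. The data frame name blood and the other variable names are my own.

```r
# Counts copied from the quoted SAS datalines, one row per cell
blood <- data.frame(
  bloodtype = rep(c("A", "B", "AB", "O"), times = 3),
  state     = rep(c("FL", "IA", "MO"), each = 4),
  count     = c(122, 117, 19, 244,
                1781, 351, 289, 3301,
                353, 269, 60, 713)
)

# Sum the counts into a bloodtype x state contingency table
tab <- xtabs(count ~ bloodtype + state, data = blood)
res <- chisq.test(tab)

# The same test by hand
n <- sum(tab)
expected <- n * outer(rowSums(tab) / n, colSums(tab) / n)  # E_ij
chi_sq <- sum((tab - expected)^2 / expected)
df <- (nrow(tab) - 1) * (ncol(tab) - 1)
p_value <- pchisq(chi_sq, df = df, lower.tail = FALSE)

# Hand calculation next to the built-in result
c(by_hand = chi_sq, chisq_test = unname(res$statistic), df = df)
p_value
```

The element res$expected holds the same expected counts that the expected option requests in PROC FREQ.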
> Shoaaib Mehmood wrote:
> >
> > hi,
> >
> > is there a way of calculating or measuring dependence between two
> > categorical variables? i tried using the chi-square test to test for
> > independence, but i got an error saying that the lengths of the two
> > vectors don't match. Suppose X and Y are two factors. X has 5 levels
> > and Y has 7 levels. This is what i tried doing:
> >
> > > temp<-chisq.test(x,y)
> >
> > but got the error "the lengths of the two vectors don't match". any
> > help will be appreciated
> > --
> > Regards,
> > Rana Shoaaib Mehmood
> >
