From: Gary Collins <collins.gs_at_gmail.com>

Date: Sun 25 Jun 2006 - 19:10:07 EST

R-help@stat.math.ethz.ch mailing list

https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html Received on Sun Jun 25 19:14:03 2006


Looking at the help page and code for tetrachoric in Stata, the documentation says it estimates the tetrachoric correlation via the approximation suggested by Edwards & Edwards (1984), "Approximating the tetrachoric correlation", Biometrics, 40(2): 563.

(alpha^(pi/4) - 1) / (alpha^(pi/4) + 1), where alpha = ad/bc is the odds ratio of the 2x2 table

i.e.

> alpha = (522 * 22)/(34 * 54)
> (alpha^(pi/4) - 1) / (alpha^(pi/4) + 1)

[1] 0.6168851
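
For anyone who wants to reuse this on other tables, here is a minimal sketch of the same approximation as a function. The name tetra_approx is made up for illustration (it is not part of polycor or any other package), and it assumes a 2x2 table of counts with no zero cells, with a and d on the main diagonal.

```r
# Edwards & Edwards (1984) approximation to the tetrachoric correlation.
# 'tetra_approx' is a hypothetical helper name, not from any package.
tetra_approx <- function(tab) {
  tab <- as.matrix(tab)
  # assumption: a 2x2 table of strictly positive counts
  stopifnot(identical(dim(tab), c(2L, 2L)), all(tab > 0))
  # odds ratio ad/bc, with a and d on the main diagonal
  alpha <- (tab[1, 1] * tab[2, 2]) / (tab[1, 2] * tab[2, 1])
  (alpha^(pi/4) - 1) / (alpha^(pi/4) + 1)
}

# Janet's table, entered row by row
janet <- matrix(c(522, 54,
                   34, 22), nrow = 2, byrow = TRUE)
tetra_approx(janet)  # reproduces the 0.6168851 computed above
```

This is only the closed-form approximation; it is not the ML or 2-step estimate that polychor() computes, which is why the two can differ.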

HTH
Gary

On 25/06/06, John Fox <jfox@mcmaster.ca> wrote:

> Dear Janet,
>
> A good thing to do when different software gives different answers is
> to check each against known results. I'm away from home, and don't have
> all of the examples that I used to check polychor(), but I dug up the
> following. The polychor() function produces output that agrees with
> both of these sources. How does Stata do?
>
> > # example from Drasgow (1988), pp. 69-74 in Kotz and Johnson,
> > # Encyclopedia of statistical sciences. Vol. 7.
> > tab
>      [,1] [,2] [,3]
> [1,]   58   52    1
> [2,]   26   58    3
> [3,]    8   12    9
>
> > polychor(tab, std.err=TRUE)
>
> Polychoric Correlation, 2-step est. = 0.42 (0.07474)
> Test of bivariate normality: Chisquare = 11.55, df = 3, p = 0.009078
>
> > polychor(tab, ML=TRUE, std.err=TRUE)
>
> Polychoric Correlation, ML est. = 0.4191 (0.07616)
> Test of bivariate normality: Chisquare = 11.54, df = 3, p = 0.009157
>
>   Row Thresholds
>   Threshold Std.Err.
> 1  -0.02988  0.08299
> 2   1.13300  0.10630
>
>
>   Column Thresholds
>   Threshold Std.Err.
> 1   -0.2422  0.08361
> 2    1.5940  0.13720
>
> > tab # example from Brown (1977) Applied Statistics, 26:343-351.
>      [,1] [,2]
> [1,] 1562   42
> [2,]  383   94
>
> > polychor(tab)
> [1] 0.595824
> >
>
> Regards,
> John
>
> On Fri, 23 Jun 2006 14:33:31 -0700
> Janet Rosenbaum <jrosenba@rand.org> wrote:
> > Peter --- Thanks for pointing out the omitted information. The hazards
> > of attempting to be brief.
> >
> > In R, I am using polychor(vec1, vec2, std.err=T) and have used both the
> > ML and 2 step estimates, which give virtually identical answers. I am
> > explicitly using only the 632 complete cases in R to make sure missing
> > data is handled the same way as in stata.
> >
> > Here's my data:
> >
> > 522 54
> >  34 22
> >
> > > polychor(v1, v2, std.err=T, ML=T)
> >
> > Polychoric Correlation, ML est. = 0.5172 (0.08048)
> > Test of bivariate normality: Chisquare = 8.063e-06, df = 0, p = NaN
> >
> >   Row Thresholds
> >   Threshold Std.Err.
> > 1     1.349  0.07042
> >
> >
> >   Column Thresholds
> >   Threshold Std.Err.
> > 1     1.174  0.06458
> > Warning message:
> > NaNs produced in: pchisq(q, df, lower.tail, log.p)
> >
> > In stata, I get:
> >
> > . tetrachoric t1_v19a ct1_ix17
> >
> > Tetrachoric correlations (N=632)
> >
> > ----------------------------------
> >     Variable |  t1_v19a ct1_ix17
> > -------------+--------------------
> >      t1_v19a |        1
> >     ct1_ix17 |    .6169        1
> > ----------------------------------
> >
> > Thanks for your help.
> >
> > Janet
> >
> > Peter Dalgaard wrote:
> > > Janet Rosenbaum <jrosenba@rand.org> writes:
> > >
> > >> I hope someone here knows the answer to this since it will save me
> > >> from delving deep into documentation.
> > >>
> > >> Based on 22 pairs of vectors, I have noticed that tetrachoric
> > >> correlation coefficients in stata are almost uniformly higher than
> > >> those in R, sometimes dramatically so (TCC=.61 in stata, .51 in R;
> > >> .51 in stata, .39 in R). Stata's estimate is higher than R's in 20
> > >> out of 22 computations, although the estimates always fall within
> > >> the 95% CI for the TCC calculated by R.
> > >>
> > >> Do stata and R calculate TCC in dramatically different ways? Is the
> > >> handling of missing data perhaps different? Any thoughts?
> > >>
> > >> Btw, I am sending this question only to the R-help list.
> > >
> > >
> > > A bit more information seems necessary:
> > >
> > > - tetrachoric correlations depend on 4 numbers, so you should be able
> > >   to give a direct example
> > >
> > > - you're not telling us how you calculate the TCC in R. This is not
> > >   obvious (package polycor?).
> > >


Archive maintained by Robert King, hosted by
the discipline of
statistics at the
University of Newcastle,
Australia.

Archive generated by hypermail 2.1.8, at Sun 25 Jun 2006 - 20:12:52 EST.
