Re: [R] Tetrachoric correlation in R vs. stata

From: Janet Rosenbaum
Date: Sat 24 Jun 2006 - 07:33:31 EST

Peter --- Thanks for pointing out the omitted information. The hazards of attempting to be brief.

In R, I am using polychor(vec1, vec2, std.err=T) and have tried both the ML and two-step estimates, which give virtually identical answers. I am explicitly using only the 632 complete cases in R to make sure missing data are handled the same way as in Stata.

Here's my data:

522	54
34	22

> polychor(v1, v2, std.err=T, ML=T)

Polychoric Correlation, ML est. = 0.5172 (0.08048)
Test of bivariate normality: Chisquare = 8.063e-06, df = 0, p = NaN

    Row Thresholds
    Threshold Std.Err.
  1 1.349 0.07042

    Column Thresholds
    Threshold Std.Err.
  1 1.174 0.06458
  Warning message:
  NaNs produced in: pchisq(q, df, lower.tail, log.p)

In Stata, I get:

. tetrachoric t1_v19a ct1_ix17

Tetrachoric correlations (N=632)

     Variable | t1_v19a ct1_ix17

      t1_v19a |        1
     ct1_ix17 |    .6169         1
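Since the tetrachoric correlation depends only on the four cell counts, the figures can be cross-checked without either package. The following is a minimal base-R sketch, not the method used by polychor or by Stata: it assumes a standard bivariate normal latent model, takes the thresholds from the marginal proportions, and solves for the correlation that reproduces the observed proportion in the bottom-right cell. The names tab, h, k, upper_prob and rho_hat are illustrative only.

```r
# Counts from the 2x2 table in this message
# (rows = first variable, columns = second variable).
tab <- matrix(c(522, 54,
                 34, 22), nrow = 2, byrow = TRUE)
n <- sum(tab)

# Thresholds implied by the marginal proportions of the first row / first column.
h <- qnorm(sum(tab[1, ]) / n)   # row threshold
k <- qnorm(sum(tab[, 1]) / n)   # column threshold

# P(X > h, Y > k) for a standard bivariate normal with correlation rho,
# computed by integrating P(Y > k | X = x) against the density of X.
upper_prob <- function(rho) {
  integrand <- function(x) dnorm(x) * pnorm((rho * x - k) / sqrt(1 - rho^2))
  integrate(integrand, lower = h, upper = Inf)$value
}

# Observed proportion in the cell where both variables exceed their thresholds.
p22 <- tab[2, 2] / n

# Find the rho at which the model probability matches the observed proportion.
rho_hat <- uniroot(function(r) upper_prob(r) - p22,
                   interval = c(-0.95, 0.95), tol = 1e-8)$root

print(c(row_threshold = h, col_threshold = k, rho = rho_hat))
```

Because a 2x2 table leaves zero degrees of freedom, this moment-matching solution should agree with a maximum-likelihood fit of the same model, so it gives an independent reference point for comparing the two programs' output.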

Thanks for your help.


Peter Dalgaard wrote:

> Janet Rosenbaum writes:

>> I hope someone here knows the answer to this since it will save me from
>> delving deep into documentation.
>> Based on 22 pairs of vectors, I have noticed that tetrachoric
>> correlation coefficients in stata are almost uniformly higher than those
>> in R, sometimes dramatically so (TCC=.61 in stata, .51 in R; .51 in
>> stata, .39 in R). Stata's estimate is higher than R's in 20 out of 22
>> computations, although the estimates always fall within the 95% CI for
>> the TCC calculated by R.
>> Do stata and R calculate TCC in dramatically different ways? Is the
>> handling of missing data perhaps different? Any thoughts?
>> Btw, I am sending this question only to the R-help list.
> A bit more information seems necessary:
> - tetrachoric correlations depend on 4 numbers, so you should be able
>   to give a direct example
> - you're not telling us how you calculate the TCC in R. This is not
>   obvious (package polycor?).


Received on Sat Jun 24 07:39:35 2006

Archive maintained by Robert King, hosted by the discipline of statistics at the University of Newcastle, Australia.
Archive generated by hypermail 2.1.8, at Sun 25 Jun 2006 - 18:12:37 EST.
