respondent to select one of two very broad categories (under $30K/over $30K, etc.). When analyzing
this kind of data, a common approach is to assign integer values to each category and proceed in the
analysis as if the data had been measured on an interval scale with desired distributional properties.
The most common choice for the distribution of the latent variables is the normal distribution,
because all covariances between the latent variables are fully captured by the covariance matrix
and each of its elements can be estimated separately using a bivariate normal distribution. The
correlation in the standard bivariate normal distribution underlying a 2×2 contingency table is
called the tetrachoric correlation and was suggested by Pearson [1900]. The tetrachoric correlation
was generalized to the case where the observed variables X and Y have r and s ordinal categories
(the polychoric correlation) by Ritchie-Scott [1918] and Pearson and Pearson [1922], but it took
over half a century before a computationally feasible maximum likelihood procedure was proposed
by Olsson [1979].
There have been two basic approaches to implementation. The first, the so-called two-step method,
estimates the unknown thresholds from the marginal frequencies of the table and then finds
the maximum likelihood estimate (MLE) of ρ conditional on the estimated thresholds. The second
approach finds the joint MLE of (ρ, a, b) from the likelihood function, where a and b denote the
thresholds of X and Y. Olsson [1979] gives the equation system to be solved and, in addition,
derives expressions for the information matrix, which can be used to obtain asymptotic standard
errors for the estimates.
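The two-step method can be sketched as follows. This is a minimal illustration, not Olsson's implementation: it assumes a contingency table with no empty margins, uses SciPy's bivariate normal CDF for the cell probabilities, replaces the infinite outer thresholds with ±10, and restricts ρ to (-0.99, 0.99). The function names are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import multivariate_normal, norm


def thresholds(margin_counts):
    """Step 1: thresholds from the cumulative marginal proportions."""
    cum = np.cumsum(margin_counts)[:-1] / np.sum(margin_counts)
    # Assumption: +/-10 stands in for the infinite outer thresholds.
    return np.concatenate(([-10.0], norm.ppf(cum), [10.0]))


def cell_probs(rho, a, b):
    """Cell probabilities of a standard bivariate normal cut at a and b."""
    cov = [[1.0, rho], [rho, 1.0]]
    grid = np.array([[multivariate_normal.cdf([ai, bj], mean=[0.0, 0.0], cov=cov)
                      for bj in b] for ai in a])
    # Inclusion-exclusion on the grid of CDF values gives rectangle masses.
    return grid[1:, 1:] - grid[:-1, 1:] - grid[1:, :-1] + grid[:-1, :-1]


def polychoric_two_step(table):
    """Step 2: maximize the likelihood over rho with thresholds held fixed."""
    table = np.asarray(table, dtype=float)
    a = thresholds(table.sum(axis=1))  # thresholds for the row variable X
    b = thresholds(table.sum(axis=0))  # thresholds for the column variable Y

    def neg_log_lik(rho):
        p = np.clip(cell_probs(rho, a, b), 1e-12, None)
        return -np.sum(table * np.log(p))

    result = minimize_scalar(neg_log_lik, bounds=(-0.99, 0.99), method="bounded")
    return result.x
```

Calling `polychoric_two_step` on an r×s table of counts returns the conditional estimate of ρ; the joint MLE would instead optimize over ρ and all thresholds at once.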
Let X be an observed ordinal variable which depends on an underlying latent continuous random
variable Z1 and Y represent another observed continuous variable. It is assumed that the joint
distribution of Z1 and Y is bivariate normal. The product moment correlation between X and
Y is called the point polyserial correlation, while the correlation between Z1 and Y is called the
polyserial correlation. The MLE of the polyserial correlation has been derived by Cox [1974]. Olsson
et al. [1982] derived the relationship between the polyserial and point polyserial correlation and
compared the MLE of polyserial correlation with a two-step estimator and with a computationally
convenient ad hoc estimator.
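To make the distinction concrete, the sketch below simulates bivariate normal data, discretizes the latent variable, and rescales the point polyserial correlation into a polyserial estimate. It relies on the identity Cov(1{Z1 > τ}, Y) = ρ σ_Y φ(τ), which under bivariate normality and consecutive integer scores gives ρ_point = ρ Σ_j φ(τ_j) / σ_X. Whether this matches the ad hoc estimator of Olsson et al. [1982] in every detail is an assumption; the function names and simulation settings are hypothetical.

```python
import numpy as np
from scipy.stats import norm


def adhoc_polyserial(x, y):
    """Return (polyserial estimate, point polyserial correlation)."""
    x = np.asarray(x)
    y = np.asarray(y, dtype=float)
    cats, counts = np.unique(x, return_counts=True)
    # Thresholds of the latent variable from cumulative category proportions.
    tau = norm.ppf(np.cumsum(counts)[:-1] / x.size)
    # Recode the categories as consecutive integer scores 1, ..., r.
    scores = np.searchsorted(cats, x) + 1
    r_point = np.corrcoef(scores, y)[0, 1]
    # Rescale: rho = r_point * sd(X) / sum of normal densities at thresholds.
    rho_hat = r_point * scores.std() / norm.pdf(tau).sum()
    return rho_hat, r_point


def simulate(rho, n, cuts, seed=0):
    """Draw (X, Y) where X discretizes a latent Z1 correlated rho with Y."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n)
    y = rho * z + np.sqrt(1.0 - rho**2) * rng.standard_normal(n)
    return np.digitize(z, cuts), y


x, y = simulate(rho=0.6, n=20000, cuts=[-0.5, 0.5])
rho_hat, r_point = adhoc_polyserial(x, y)
```

In this setup the point polyserial correlation is attenuated relative to the latent correlation, and the rescaled estimate should land close to the simulated value of ρ.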
Another method to estimate tetrachoric and polychoric correlation coefficients is a Bayesian
approach proposed by Albert [1992]. The author used a latent bivariate normal distribution to
estimate a polychoric correlation coefficient from the Bayesian point of view by using the Gibbs
sampler. One attractive feature of this method is that it can be generalized in a straightforward
manner to handle a number of nonnormal latent distributions. The author extended the method to
handle bivariate lognormal and bivariate t latent distributions in the simulations.
Chen and Choi [2009] and Choi et al. [2011] have shown that a different form of Bayesian
estimation outperforms traditional maximum likelihood (ML) in a variety of settings, but their
method is restricted to the case of the bivariate Gaussian distribution. They correctly pointed out
that, in practice, the sample sizes needed to obtain stable estimates of the polychoric correlation
coefficient may not be available to the researcher. They claimed that, owing to the numerical
procedure used for ML (i.e., an iterative hill-climbing method using gradients of the target
function), ML estimation of polychoric correlation coefficients has several disadvantages, such as
local maxima, non-converged solutions, and inaccurate estimation of the confidence interval, among
others. Two Bayesian estimates, maximum a posteriori (MAP) and expected a posteriori (EAP),
are introduced and compared to the ML method. In their simulation study, they found evidence that
the MAP would be the estimator of choice for polychoric correlation coefficients.
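The difference between MAP and EAP can be illustrated on a grid of ρ values for the 2×2 (tetrachoric) case. This is only a sketch under stated assumptions, not the procedure of Chen and Choi [2009]: the prior on ρ is taken to be uniform on (-0.99, 0.99), the thresholds are fixed at their marginal estimates, and the function name is hypothetical.

```python
import numpy as np
from scipy.stats import multivariate_normal, norm


def tetrachoric_map_eap(table, grid=None):
    """Grid-based MAP and EAP of rho for a 2x2 table (uniform prior assumed)."""
    table = np.asarray(table, dtype=float)
    n = table.sum()
    if grid is None:
        grid = np.linspace(-0.99, 0.99, 199)
    # Thresholds fixed at their marginal estimates (as in the two-step method).
    pa = table[0].sum() / n      # P(X in first category)
    pb = table[:, 0].sum() / n   # P(Y in first category)
    a, b = norm.ppf(pa), norm.ppf(pb)

    log_post = np.empty(grid.size)
    for k, rho in enumerate(grid):
        p11 = multivariate_normal.cdf([a, b], mean=[0.0, 0.0],
                                      cov=[[1.0, rho], [rho, 1.0]])
        # Remaining cells follow from the fixed margins.
        probs = np.array([[p11, pa - p11], [pb - p11, 1.0 - pa - pb + p11]])
        probs = np.clip(probs, 1e-12, None)
        # Uniform prior: the log posterior equals the log likelihood up to a constant.
        log_post[k] = np.sum(table * np.log(probs))

    weights = np.exp(log_post - log_post.max())
    rho_map = grid[np.argmax(log_post)]                # posterior mode
    rho_eap = np.sum(weights * grid) / weights.sum()   # posterior mean
    return rho_map, rho_eap
```

With a uniform prior the MAP coincides with the grid maximizer of the likelihood, so any difference from ML in practice comes from the choice of prior; an informative prior would enter as an extra term added to `log_post`.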
Pearson correlations can be considered a less suitable method for studying the degree of association
between categorical variables for several reasons. First, from a methodological point of view,
these variables would imply ordinal scales, whereas Pearson correlations assume interval measure-