Iteratively Reweighted Least Squares Method for Estimating Polyserial and Polychoric Correlation Coefficients

Peng Zhang*(1), Ben Liu(1), and Jingjing Pan(2)

(1) School of Mathematical Sciences, Zhejiang University, Hangzhou, China, 310027.

(2) Zhejiang Super Soul Artificial Intelligence Research Institute

Abstract

This paper proposes an iteratively reweighted least squares (IRLS) method for estimating polyserial and polychoric correlation coefficients. The method iteratively calculates the slopes in a series of weighted linear regression models fitted to conditional expected values. For the polyserial correlation coefficient, the conditional expectations of the latent predictor are derived from the observed ordinal categorical variable, and the regression coefficient is obtained by weighted least squares. For the polychoric correlation coefficient, the conditional expectations of the response variable and of the predictor are updated in turn. Standard errors of the estimators are obtained using the delta method based on data summaries instead of the whole data. The proposed algorithm exploits the conditional univariate normal distribution and numerically evaluates only a single integral, whereas traditional maximum likelihood (ML) approaches numerically compute a double integral based on the bivariate normal distribution. This makes the new algorithm very fast in estimating both polyserial and polychoric correlation coefficients. Thorough simulation studies compare the performance of the proposed method with that of the classical ML methods. Real data analyses illustrate the advantage of the new method in computation speed.

Keywords: Iteratively reweighted least squares, Polyserial correlation, Polychoric correlation, Tetrachoric correlation, Maximum likelihood, Linear regression.

1 Introduction

In behavioural, educational and psychological studies, observed variables are commonly measured on ordinal scales. For example, the Likert scale is widely used to measure responses in surveys, allowing respondents to express how much they agree or disagree with a particular statement on a five-point (or seven-point) scale. These categorical variables can be treated as discretized versions of an underlying continuous variable for the degree of agreement with the statement. There are also many examples of quantitative variables that are discretized explicitly in social science studies. For instance, when asking questions about sensitive or personal quantitative attributes (income, alcohol consumption), the non-response rate may often be reduced by simply asking the respondent to select one of two very broad categories (under $30K / over $30K, etc.). When analyzing this kind of data, a common approach is to assign integer values to each category and proceed in the analysis as if the data had been measured on an interval scale with desired distributional properties.

* Corresponding author. E-mail: pengz@zju.edu.cn.

arXiv:2210.11115v1 [stat.ME] 20 Oct 2022

The most common choice for the distribution of the latent variables is the normal distribution, because all covariances between the latent variables are fully captured by the covariance matrix and each of its elements can be estimated separately using a bivariate normal distribution. The correlation in the standard bivariate normal distribution underlying a 2×2 contingency table is called the tetrachoric correlation and was suggested by Pearson [1900]. The tetrachoric correlation was generalized to the case where the observed variables $X$ and $Y$ have $r$ and $s$ ordinal categories by Ritchie-Scott [1918] and Pearson and Pearson [1922] in the early 20th century, but it took over half a century before a computationally feasible maximum likelihood procedure was proposed by Olsson [1979]. There have been two basic approaches to implementation. The first, the so-called two-step method, estimates the unknown thresholds from the marginal frequencies of the table and then finds the maximum likelihood estimate (MLE) of $\rho$ conditional on the estimated thresholds. The second finds the joint MLE of $(\rho, a, b)$ from the likelihood function. Olsson [1979] gives the equation system to be solved and, in addition, derives expressions for the information matrix, which can be used to obtain asymptotic standard errors for the estimates.

Let $X$ be an observed ordinal variable that depends on an underlying latent continuous random variable $Z_1$, and let $Y$ be another observed continuous variable. It is assumed that the joint distribution of $Z_1$ and $Y$ is bivariate normal. The product-moment correlation between $X$ and $Y$ is called the point polyserial correlation, while the correlation between $Z_1$ and $Y$ is called the polyserial correlation. The MLE of the polyserial correlation was derived by Cox [1974]. Olsson et al. [1982] derived the relationship between the polyserial and point polyserial correlations and compared the MLE of the polyserial correlation with a two-step estimator and with a computationally convenient ad hoc estimator.

Another method for estimating tetrachoric and polychoric correlation coefficients is the Bayesian approach proposed by Albert [1992], who used a latent bivariate normal distribution and the Gibbs sampler to estimate a polychoric correlation coefficient from the Bayesian point of view. One attractive feature of this method is that it can be generalized in a straightforward manner to handle a number of nonnormal latent distributions. In the simulations, the method was generalized to handle bivariate lognormal and bivariate $t$ latent distributions.

Chen and Choi [2009] and Choi et al. [2011] showed that a different form of Bayesian estimation outperforms traditional maximum likelihood (ML) in a variety of settings, but their method is restricted to the case of the bivariate Gaussian distribution. They correctly pointed out that, in practice, the sample sizes needed to obtain stable estimates of the polychoric correlation coefficient may not be available to the researcher. They claimed that, owing to the properties of the numerical procedure of ML (i.e., an iterative hill-climbing method using gradients of the target function), the ML estimation method for polychoric correlation coefficients has several disadvantages, such as local maxima, non-converged solutions and inaccurate estimation of the confidence interval. They introduced two new Bayesian estimates, maximum a posteriori (MAP) and expected a posteriori (EAP), and compared them with the ML method. In their simulation study, they found evidence that the MAP would be the estimator of choice for polychoric correlation coefficients.

Pearson correlations can be considered less suitable for studying the degree of association between categorical variables for several reasons. First, from a methodological point of view, these variables imply ordinal scales, whereas Pearson correlations assume interval measurement scales. Furthermore, the only information provided by this kind of scale is the number of subjects in each of the categories (cells) of a contingency table. If Pearson correlations are used in this case, the relationship between measures is artificially restricted by the categorization (Gilley and Uhlig [1993]): all subjects situated in the interval that delimits a category are considered as belonging to the same category and are therefore assigned the same score, with a resulting reduction in data variability.

Holgado-Tello et al. [2010] illustrated the advantages of using polychoric rather than Pearson correlations in exploratory factor analysis (EFA) and confirmatory factor analysis (CFA), taking into account that Pearson correlations require quantitative variables measured on interval scales and that the relationship between these variables has to be monotonic. Their results showed that the solutions obtained using polychoric correlations provide a more accurate reproduction of the measurement model used to generate the data.

More recently, network research has gained substantial attention in the psychological sciences, where the resulting models are called psychological networks. Psychological networks have been used in various fields of psychology (Epskamp et al. [2018]). One such model is the Gaussian graphical model (GGM; Lauritzen [1996]), in which edges can be interpreted directly as partial correlation coefficients. The GGM requires an estimate of the covariance matrix as input, for which polyserial and polychoric correlations can be used when the data are ordinal. However, for large network problems, estimation with the ML method usually requires considerably longer computation time.

In this paper we propose a simple and fast method to estimate the polyserial correlation coefficient and the polychoric correlation coefficient. It is motivated by the fact that Pearson's correlation coefficient coincides with the slope of the regression line for paired standard normal data. When one of the paired continuous variables is discretized, an unbiased estimator of the slope is derived from the resulting categorical data. When both of the paired variables are discretized, the slope of the regression line, i.e. the correlation coefficient of the two normal random variables, is obtained iteratively from a series of similar estimation procedures. The details of the algorithm can be found in Section 2. In Sections 3 and 4, we conduct simulation studies and data analyses to compare the proposed method with the ML method. Finally, we conclude with a discussion and some directions for future work that could improve the proposed method.

2 Iteratively Reweighted Least Squares Algorithm

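Assume $(Z_1, Z_2)^T \sim N_2(\mathbf{0}, \mathbf{R})$, where $\mathbf{0} = (0, 0)^T$ and $\mathbf{R} = \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}$, $-1 \le \rho \le 1$. Conditioning on $Z_1$, $Z_2 \mid Z_1 \sim N(\rho Z_1, 1 - \rho^2)$. Hence

\[ E(Z_2 \mid Z_1) = \rho Z_1. \tag{2.1} \]

This represents a simple linear regression model fitting $Z_2$ on $Z_1$, and $\rho$ is the slope of the regression line. Therefore $\rho$, the Pearson correlation coefficient of $Z_1$ and $Z_2$, can be estimated from such a linear regression model.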
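The link between the slope in (2.1) and the correlation can be checked numerically. The following is a minimal sketch, not taken from the paper: the function names `simulate_pairs` and `slope_through_origin` are hypothetical, and the simulation simply draws $Z_2 = \rho Z_1 + \sqrt{1-\rho^2}\,\varepsilon$ with independent standard normal $Z_1$ and $\varepsilon$.

```python
import random


def simulate_pairs(rho, n, seed=0):
    """Draw n pairs (z1, z2) from a standard bivariate normal with correlation rho."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        # Z2 | Z1 ~ N(rho * Z1, 1 - rho^2), as stated before equation (2.1)
        z2 = rho * z1 + (1.0 - rho ** 2) ** 0.5 * rng.gauss(0.0, 1.0)
        pairs.append((z1, z2))
    return pairs


def slope_through_origin(pairs):
    """Least-squares slope of z2 on z1 for a regression line without intercept."""
    sxy = sum(z1 * z2 for z1, z2 in pairs)
    sxx = sum(z1 * z1 for z1, _ in pairs)
    return sxy / sxx


if __name__ == "__main__":
    data = simulate_pairs(rho=0.6, n=20000, seed=1)
    print(slope_through_origin(data))  # should be close to 0.6
```

With a large sample the printed slope should lie near the chosen value of $\rho$; the exact figure depends on the seed.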

2.1 Polyserial correlation coefficients

Consider the case where one of the paired random variables, namely $Z_1$, is discretized into an ordinal polychotomous variable $X$, and the other is observed as a continuous variable $Y$. Let $X$ be an observed ordinal variable with $s$ categories, generated from the latent variable $Z_1$ with $X = i$ if $a_{i-1} < Z_1 \le a_i$, $i = 1, \dots, s$, where the $a_i$ are thresholds with $a_0 = -\infty$ and $a_s = \infty$.

If $Z_1$ were observable, the regression line would give $E(Y \mid Z_1) = \rho Z_1$. Taking expectation with respect to $Z_1$,

\[ E\{E(Y \mid Z_1)\} = \rho E(Z_1). \]

This holds for every $Z_1$ such that $a_{i-1} < Z_1 \le a_i$, or correspondingly $X = i$, for $i = 1, 2, \dots, s$. That is,

\[ E\{E(Y \mid Z_1, a_{i-1} < Z_1 \le a_i)\} = \rho E(Z_1 \mid a_{i-1} < Z_1 \le a_i), \]

or,

\[ E\{E(Y \mid X = i)\} = \rho E(Z_1 \mid a_{i-1} < Z_1 \le a_i), \tag{2.2} \]

for $i = 1, \dots, s$.

Denote $E(Y \mid X = i)$ by $E_{Y_i}$ and $E(Z_1 \mid a_{i-1} < Z_1 \le a_i)$ by $e_{x_i}$. Equation (2.2) is then a regression model without an intercept, in which $E_{Y_i}$ is the response variable and $e_{x_i}$ is the explanatory variable, with $\rho$ being the regression coefficient. Because the $E_{Y_i}$ have unequal variances, ordinary least squares is not appropriate for estimating $\rho$. However, since the $E_{Y_i}$ are independent of each other, $\rho$ can be estimated by weighted least squares with a diagonal weight matrix.

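It is easy to show that the conditional density function of $Y$ given $X = i$ is

\[ f(y) = \frac{1}{P_i}\left\{ \Phi\!\left(\frac{a_i - \rho y}{\sqrt{1 - \rho^2}}\right) - \Phi\!\left(\frac{a_{i-1} - \rho y}{\sqrt{1 - \rho^2}}\right) \right\} \phi(y), \]

where $P_i = \Pr(X = i)$. The corresponding mean and variance, $\mu_i$ and $\sigma_i^2$, are given by

\[
\begin{aligned}
\mu_i &= \rho\, \frac{\phi(a_{i-1}) - \phi(a_i)}{P_i}, \\
\sigma_i^2 &= 1 + \rho^2\, \frac{a_{i-1}\phi(a_{i-1}) - a_i\phi(a_i)}{P_i} - \rho^2\, \frac{\{\phi(a_{i-1}) - \phi(a_i)\}^2}{P_i^2}.
\end{aligned}
\tag{2.3}
\]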
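The moments in (2.3) are straightforward to evaluate for given thresholds. Below is a minimal sketch, assuming the standard truncated-normal form of (2.3) with $P_i = \Phi(a_i) - \Phi(a_{i-1})$ and the convention $a\,\phi(a) = 0$ at $a = \pm\infty$; the function name `conditional_moments` and its argument layout are hypothetical and not part of the paper.

```python
import math
from statistics import NormalDist

_STD = NormalDist()  # standard normal distribution


def _pdf(a):
    """phi(a), with phi(+/-inf) = 0."""
    return 0.0 if math.isinf(a) else _STD.pdf(a)


def _cdf(a):
    """Phi(a), with Phi(-inf) = 0 and Phi(+inf) = 1."""
    if math.isinf(a):
        return 0.0 if a < 0 else 1.0
    return _STD.cdf(a)


def _a_pdf(a):
    """a * phi(a), taken to be 0 at +/-inf."""
    return 0.0 if math.isinf(a) else a * _STD.pdf(a)


def conditional_moments(thresholds, rho):
    """Return [(P_i, mu_i, sigma2_i)] for each category, following (2.3).

    thresholds: the finite interior cut points a_1 < ... < a_{s-1}
    (a_0 = -inf and a_s = +inf are added here).
    """
    cuts = [-math.inf] + list(thresholds) + [math.inf]
    out = []
    for lo, hi in zip(cuts[:-1], cuts[1:]):
        p = _cdf(hi) - _cdf(lo)                # P_i
        d = _pdf(lo) - _pdf(hi)                # phi(a_{i-1}) - phi(a_i)
        mu = rho * d / p                       # mu_i
        var = (1.0
               + rho ** 2 * (_a_pdf(lo) - _a_pdf(hi)) / p
               - rho ** 2 * d ** 2 / p ** 2)   # sigma_i^2
        out.append((p, mu, var))
    return out
```

A quick sanity check on any output: the probability-weighted means sum to $E(Y) = 0$, and the probability-weighted values of $\sigma_i^2 + \mu_i^2$ sum to $E(Y^2) = 1$.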

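Let $y_{i1}, y_{i2}, \dots, y_{in_i}$ be the observed responses associated with $X = i$, that is, with $a_{i-1} < Z_{1j} \le a_i$ for $j = 1, \dots, n_i$, where $n_i$ is the number of observations with $X = i$. Then $E_{Y_i}$ is estimated by

\[ \hat{E}_{y_i} = \bar{y}_{X=i} = \frac{1}{n_i} \sum_{j=1}^{n_i} y_{ij}. \tag{2.4} \]

Since $Z_1$, given $X = i$, has a truncated normal distribution with lower and upper limits $a_{i-1}$ and $a_i$ respectively, $e_{x_i}$ is the expected value of the truncated normal distribution, given by

\[ e_{x_i} = \frac{\phi(a_{i-1}) - \phi(a_i)}{\Phi(a_i) - \Phi(a_{i-1})}, \tag{2.5} \]

where $\phi(\cdot)$ and $\Phi(\cdot)$ are the density function and the distribution function of the standard normal distribution. Let $CP_i = \Pr(X \le i) = \Phi(a_i)$, $i = 1, \dots, s$. Then

\[ CP_i = \sum_{j=1}^{i} P_j = \sum_{j=1}^{i} \{\Phi(a_j) - \Phi(a_{j-1})\}, \]

so that $\hat{a}_i = \Phi^{-1}(\widehat{CP}_i)$, where $\widehat{CP}_i$ is an estimate of $CP_i$ (for instance, the observed cumulative proportion), and $e_{x_i}$ in (2.5) is estimated by

\[ \hat{e}_{x_i} = \frac{\phi(\hat{a}_{i-1}) - \phi(\hat{a}_i)}{\widehat{CP}_i - \widehat{CP}_{i-1}} = \frac{\phi\{\Phi^{-1}(\widehat{CP}_{i-1})\} - \phi\{\Phi^{-1}(\widehat{CP}_i)\}}{\widehat{CP}_i - \widehat{CP}_{i-1}}. \tag{2.6} \]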
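The threshold and score estimates in (2.6) need only the category counts. The following is a minimal sketch under two assumptions that are not stated explicitly in the text: the cumulative probabilities are estimated by observed cumulative proportions, and every category has a positive count. The function name `thresholds_and_scores` is hypothetical.

```python
from statistics import NormalDist

_STD = NormalDist()  # standard normal distribution


def thresholds_and_scores(counts):
    """Estimate interior thresholds a_1..a_{s-1} and scores e_{x_1}..e_{x_s}.

    counts: list of category frequencies n_1, ..., n_s (all assumed > 0).
    Returns (thresholds, scores).
    """
    n = sum(counts)
    cum = 0
    phi_prev = 0.0  # phi(a_0) = phi(-inf) = 0
    thresholds, scores = [], []
    for k, n_i in enumerate(counts):
        cum += n_i
        if k < len(counts) - 1:
            a_hat = _STD.inv_cdf(cum / n)   # a_i = Phi^{-1}(CP_i)
            thresholds.append(a_hat)
            phi_cur = _STD.pdf(a_hat)
        else:
            phi_cur = 0.0                   # phi(a_s) = phi(+inf) = 0
        # truncated-normal mean, with P_i estimated by n_i / n
        scores.append((phi_prev - phi_cur) / (n_i / n))
        phi_prev = phi_cur
    return thresholds, scores


if __name__ == "__main__":
    cuts, e_x = thresholds_and_scores([30, 40, 30])
    print(cuts)  # two thresholds, symmetric about zero for symmetric counts
    print(e_x)   # three scores; the middle one is approximately zero
```

For symmetric counts such as 30, 40, 30 the estimated thresholds are symmetric about zero and the middle score vanishes, which is a convenient check on an implementation.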

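Let $\hat{E}_x = (\hat{e}_{x_1}, \hat{e}_{x_2}, \dots, \hat{e}_{x_s})^T$, $\hat{E}_y = (\hat{E}_{y_1}, \hat{E}_{y_2}, \dots, \hat{E}_{y_s})^T$, and

\[
\hat{\Sigma} =
\begin{pmatrix}
\hat{\sigma}_1^2 / n_1 & 0 & \cdots & 0 \\
0 & \hat{\sigma}_2^2 / n_2 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & \hat{\sigma}_s^2 / n_s
\end{pmatrix};
\]

the regression coefficient is given by the weighted least squares method,

\[ \hat{\rho} = (\hat{E}_x^T \hat{\Sigma}^{-1} \hat{E}_x)^{-1} \hat{E}_x^T \hat{\Sigma}^{-1} \hat{E}_y, \tag{2.7} \]

which reduces to

\[ \hat{\rho} = \frac{\sum_{i=1}^{s} n_i \hat{\sigma}_i^{-2} \hat{e}_{x_i} \hat{E}_{y_i}}{\sum_{i=1}^{s} n_i \hat{\sigma}_i^{-2} \hat{e}_{x_i}^2}. \tag{2.8} \]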

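Because $\sigma_i^2$ in (2.3) depends on $\rho$, $\hat{\rho}$ is obtained iteratively from (2.8): the weights are recomputed from the current value of $\rho$ at each step, with the Pearson correlation coefficient as the initial value. The variance of $\hat{\rho}$ is given by

\[ \mathrm{Var}(\hat{\rho}) = (\hat{E}_x^T \hat{\Sigma}^{-1} \hat{E}_x)^{-1} = \Big( \sum_{i=1}^{s} n_i \hat{\sigma}_i^{-2} \hat{e}_{x_i}^2 \Big)^{-1}, \tag{2.9} \]

and the standard error of $\hat{\rho}$ is $\sqrt{\mathrm{Var}(\hat{\rho})}$.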
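Equations (2.3) to (2.9) can be combined into a short iteration. The following is a minimal sketch assembled from those equations and should not be read as the authors' own listing. Several choices are assumptions made here: $Y$ is standardized empirically, thresholds come from observed cumulative proportions, every category is non-empty, the update is clamped to $[-1, 1]$, and iteration stops when successive values differ by less than `tol`. The names `irls_polyserial` and `simulate_polyserial` are hypothetical.

```python
import math
import random
from statistics import NormalDist, fmean, pstdev

_STD = NormalDist()  # standard normal distribution


def irls_polyserial(x, y, tol=1e-10, max_iter=200):
    """Estimate the polyserial correlation; returns (rho_hat, standard_error).

    x: ordinal codes (e.g. 1..s), y: continuous observations of equal length.
    """
    n = len(x)
    my, sy = fmean(y), pstdev(y)
    ys = [(v - my) / sy for v in y]  # assumption: standardize Y empirically
    mx, sx = fmean(x), pstdev(x)
    rho = sum((u - mx) * v for u, v in zip(x, ys)) / (n * sx)  # Pearson start

    # Per-category summaries: n_i, mean of y, e_{x_i}, and the term
    # {a_{i-1} phi(a_{i-1}) - a_i phi(a_i)} / P_i used in (2.3).
    stats = []
    cum, lo_pdf, lo_apdf = 0, 0.0, 0.0
    levels = sorted(set(x))
    for k, level in enumerate(levels):
        yi = [v for u, v in zip(x, ys) if u == level]
        n_i = len(yi)
        cum += n_i
        if k < len(levels) - 1:
            a = _STD.inv_cdf(cum / n)
            hi_pdf, hi_apdf = _STD.pdf(a), a * _STD.pdf(a)
        else:
            hi_pdf, hi_apdf = 0.0, 0.0  # upper limit is +inf
        p = n_i / n
        stats.append((n_i, fmean(yi), (lo_pdf - hi_pdf) / p,
                      (lo_apdf - hi_apdf) / p))
        lo_pdf, lo_apdf = hi_pdf, hi_apdf

    for _ in range(max_iter):
        num = den = 0.0
        for n_i, ybar, e, t in stats:
            var_i = 1.0 + rho ** 2 * t - rho ** 2 * e ** 2  # (2.3)
            w = n_i / var_i                                 # diagonal weight
            num += w * e * ybar
            den += w * e * e
        new_rho = max(-1.0, min(1.0, num / den))            # (2.8), clamped
        converged = abs(new_rho - rho) < tol
        rho = new_rho
        if converged:
            break
    return rho, math.sqrt(1.0 / den)                        # (2.9)


def simulate_polyserial(rho, cuts, n, seed=0):
    """Simulate (x, y): x discretizes a latent Z1 at cuts, corr(Z1, y) = rho."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        ys.append(rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0))
        xs.append(1 + sum(z1 > c for c in cuts))
    return xs, ys
```

A possible usage pattern is `x, y = simulate_polyserial(0.5, [-0.8, 0.0, 0.9], 5000, seed=42)` followed by `irls_polyserial(x, y)`, which should return an estimate near 0.5 together with its standard error. Only per-category summaries enter the loop, so each iteration costs a number of operations proportional to the number of categories rather than to the sample size.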

The details of the IRLS algorithm for estimating the polyserial correlation coefficient are given in Algorithm 1.
