STRUCTURAL EQUATION MODELING WITH LATENT VARIABLES FOR DIFFUSION PROCESSES BASED ON HIGH-FREQUENCY DATA
SHOGO KUSANO$^{1}$ AND MASAYUKI UCHIDA$^{1,2}$

Abstract. We consider structural equation modeling (SEM) with latent variables for diffusion processes
based on high-frequency data. We derive the quasi-likelihood estimators for parameters in the SEM. The
goodness-of-fit test based on the quasi-likelihood ratio is proposed. Furthermore, the asymptotic properties
of our proposed estimators are examined.
1. Introduction
We consider structural equation modeling (SEM) with latent variables for diffusion processes. The
stochastic process $X_{1,t}$ is defined as the following factor model:
$$X_{1,t} = \Lambda_{x_1,m}\,\xi_{m,t} + \delta_{m,t}, \tag{1.1}$$
where $m \in \mathbb{N}$ is a model number, $\{X_{1,t}\}_{t\ge 0}$ is a $p_1$-dimensional observable vector process,
$\{\xi_{m,t}\}_{t\ge 0}$ is a $k_{1,m}$-dimensional latent common factor vector process, $\{\delta_{m,t}\}_{t\ge 0}$ is a
$p_1$-dimensional latent unique factor vector process, $\Lambda_{x_1,m} \in \mathbb{R}^{p_1\times k_{1,m}}$ is a constant loading
matrix, $p_1$ is not zero, $p_1$ and $k_{1,m}$ are fixed, and $k_{1,m} \le p_1$. The stochastic process $X_{2,t}$ is
defined by the factor model as follows:
$$X_{2,t} = \Lambda_{x_2,m}\,\eta_{m,t} + \varepsilon_{m,t}, \tag{1.2}$$
where $\{X_{2,t}\}_{t\ge 0}$ is a $p_2$-dimensional observable vector process, $\{\eta_{m,t}\}_{t\ge 0}$ is a $k_{2,m}$-dimensional
latent common factor vector process, $\{\varepsilon_{m,t}\}_{t\ge 0}$ is a $p_2$-dimensional latent unique factor vector
process, $\Lambda_{x_2,m} \in \mathbb{R}^{p_2\times k_{2,m}}$ is a constant loading matrix, $p_2$ is not zero, $p_2$ and $k_{2,m}$ are fixed,
and $k_{2,m} \le p_2$. Furthermore, the relationship between $\eta_{m,t}$ and $\xi_{m,t}$ is expressed as follows:
$$\eta_{m,t} = B_{0,m}\,\eta_{m,t} + \Gamma_m\,\xi_{m,t} + \zeta_{m,t}, \tag{1.3}$$
where $\{\zeta_{m,t}\}_{t\ge 0}$ is a $k_{2,m}$-dimensional latent unique factor vector process, $B_{0,m} \in \mathbb{R}^{k_{2,m}\times k_{2,m}}$ is a
constant loading matrix whose diagonal elements are zero, and $\Gamma_m \in \mathbb{R}^{k_{2,m}\times k_{1,m}}$ is a constant loading
matrix. Assume that $\{\xi_{m,t}\}_{t\ge 0}$ satisfies the following stochastic differential equation:
$$d\xi_{m,t} = B_{1,m}(\xi_{m,t})\,dt + S_{1,m}\,dW_{1,t} \quad (t \in [0,T]), \qquad \xi_{m,0} = c_{1,m},$$
$^{1}$Graduate School of Engineering Science, Osaka University
$^{2}$Center for Mathematical Modeling and Data Science (MMDS), Osaka University and JST CREST
Key words and phrases. Structural equation modeling; Asymptotic theory; High-frequency data; Stochastic differential
equation; Quasi-maximum likelihood estimation.
arXiv:2210.11677v1 [math.ST] 21 Oct 2022
2 S. KUSANO AND M. UCHIDA
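where $B_{1,m} : \mathbb{R}^{k_{1,m}} \to \mathbb{R}^{k_{1,m}}$, $S_{1,m} \in \mathbb{R}^{k_{1,m}\times r_1}$, $c_{1,m} \in \mathbb{R}^{k_{1,m}}$ and $W_{1,t}$ is an $r_1$-dimensional
standard Wiener process, $\{\delta_{m,t}\}_{t\ge 0}$ is defined by the following stochastic differential equation:
$$d\delta_{m,t} = B_{2,m}(\delta_{m,t})\,dt + S_{2,m}\,dW_{2,t} \quad (t \in [0,T]), \qquad \delta_{m,0} = c_{2,m},$$
where $B_{2,m} : \mathbb{R}^{p_1} \to \mathbb{R}^{p_1}$, $S_{2,m} \in \mathbb{R}^{p_1\times r_2}$, $c_{2,m} \in \mathbb{R}^{p_1}$ and $W_{2,t}$ is an $r_2$-dimensional standard
Wiener process, $\{\varepsilon_{m,t}\}_{t\ge 0}$ satisfies the following stochastic differential equation:
$$d\varepsilon_{m,t} = B_{3,m}(\varepsilon_{m,t})\,dt + S_{3,m}\,dW_{3,t} \quad (t \in [0,T]), \qquad \varepsilon_{m,0} = c_{3,m},$$
where $B_{3,m} : \mathbb{R}^{p_2} \to \mathbb{R}^{p_2}$, $S_{3,m} \in \mathbb{R}^{p_2\times r_3}$, $c_{3,m} \in \mathbb{R}^{p_2}$ and $W_{3,t}$ is an $r_3$-dimensional standard
Wiener process, and $\{\zeta_{m,t}\}_{t\ge 0}$ is defined by the stochastic differential equation as follows:
$$d\zeta_{m,t} = B_{4,m}(\zeta_{m,t})\,dt + S_{4,m}\,dW_{4,t} \quad (t \in [0,T]), \qquad \zeta_{m,0} = c_{4,m},$$
where $B_{4,m} : \mathbb{R}^{k_{2,m}} \to \mathbb{R}^{k_{2,m}}$, $S_{4,m} \in \mathbb{R}^{k_{2,m}\times r_4}$, $c_{4,m} \in \mathbb{R}^{k_{2,m}}$ and $W_{4,t}$ is an $r_4$-dimensional
standard Wiener process. We assume that $W_{1,t}$, $W_{2,t}$, $W_{3,t}$ and $W_{4,t}$ are independent. Set
$X_t = (X_{1,t}^\top, X_{2,t}^\top)^\top$. The discrete observations are $\{X_{t_i^n}\}_{i=1}^{n}$, where $t_i^n = ih_n$ and $T = nh_n$, and
$p_1$, $p_2$, $k_{1,m}$ and $k_{2,m}$ are independent of $n$.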
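As an illustration of how data from the model (1.1)-(1.3) could be generated, the following is a minimal simulation sketch based on the Euler-Maruyama scheme. It is not taken from the paper: the function name, the linear drifts $B_{j,m}(x) = -x$, the zero initial values and all numerical values are assumptions chosen only for illustration.

```python
import numpy as np

def simulate_sem_diffusion(n, h, Lx1, Lx2, B0, Gam, S1, S2, S3, S4, seed=0):
    """Euler-Maruyama sketch of (1.1)-(1.3); drifts B_j(x) = -x are an assumption."""
    rng = np.random.default_rng(seed)
    k2 = B0.shape[0]
    Psi_inv = np.linalg.inv(np.eye(k2) - B0)  # Psi = I - B0 must be non-singular

    def latent_path(S):
        # dZ_t = -Z_t dt + S dW_t with Z_0 = 0 (assumed initial value)
        dim, r = S.shape
        Z = np.zeros((n + 1, dim))
        dW = rng.normal(scale=np.sqrt(h), size=(n, r))
        for i in range(n):
            Z[i + 1] = Z[i] - Z[i] * h + S @ dW[i]
        return Z

    # Separate draws from one generator give independent Wiener increments.
    xi, delta, eps, zeta = (latent_path(S) for S in (S1, S2, S3, S4))
    eta = (xi @ Gam.T + zeta) @ Psi_inv.T  # eta_t = Psi^{-1}(Gam xi_t + zeta_t), from (1.3)
    X1 = xi @ Lx1.T + delta                # (1.1)
    X2 = eta @ Lx2.T + eps                 # (1.2)
    return np.hstack([X1, X2])             # rows are X_{t_0}, ..., X_{t_n}

# Hypothetical parameter values: p1 = 4, k1 = 2, p2 = 2, k2 = 2.
Lx1 = np.array([[1.0, 0.0], [0.5, 0.0], [0.0, 1.0], [0.0, 0.8]])
Lx2 = np.eye(2)
B0 = np.array([[0.0, 0.3], [0.0, 0.0]])    # zero diagonal, as required for B_0
Gam = np.array([[0.6, 0.0], [0.0, 0.4]])
X = simulate_sem_diffusion(1000, 0.01, Lx1, Lx2, B0, Gam,
                           S1=np.eye(2), S2=0.5 * np.eye(4),
                           S3=0.5 * np.eye(2), S4=0.3 * np.eye(2))
```

The returned array has one row per time point $t_i^n = ih_n$, including $t_0^n = 0$, so it can be passed directly to an increment-based statistic.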
SEM is a method that describes the relationships between latent variables that cannot be observed. SEM
has been used in various fields, e.g., behavioral science, economics, engineering, and medical science. For
example, in psychology, SEM is used to investigate the relationship between intelligence and motivation,
both of which are latent variables. Jöreskog [16] proposed this method by combining
path analysis and confirmatory factor analysis. For path analysis and confirmatory factor analysis, see, e.g.,
Mueller [23]. Several models have been proposed to formulate SEM. In this paper, we consider the model
defined by (1.1), (1.2) and (1.3), which is called the LInear Structural RELations (LISREL) model (Jöreskog
[17]). The LISREL model is one of the best-known models in SEM and can express complex
relationships between latent variables. For more information on the LISREL model, see, e.g., Everitt [10].
Note that SEM is a confirmatory analysis method rather than an exploratory analysis method: the model
is specified from the theoretical viewpoint of each research field before the analysis is conducted. This is
the difference between confirmatory analysis methods and exploratory analysis methods such as exploratory
factor analysis. In behavioral science, factor analysis for time series data has been actively studied; see,
e.g., Molenaar [22] and Peña and Box [25]. Moreover, Cziráky [8] proposed SEM for time series data, called
the dynamic structural equation model with latent variables (DSEM). Asparouhov et al. [3] studied a more
general DSEM.
Recently, high-frequency data such as stock price data and life-log data (blood pressure, EEG, etc.)
have become easy to obtain thanks to the development of measuring devices, and statistical inference for
stochastic differential equations based on high-frequency data has been developed. For parametric estimation
of diffusion processes based on high-frequency data, see, for example, Yoshida [28], Genon-Catalot and Jacod
[11], Kessler [18], Uchida and Yoshida [27] and references therein. In financial econometrics, the factor
model for high-frequency data has been extensively researched. In this field, parameters and the number of
SEM FOR DIFFUSION PROCESSES BASED ON HIGH-FREQUENCY DATA 3
factors are estimated by using principal component analysis for high-frequency data (Aït-Sahalia and Xiu
[2]) when the factor is latent; see, e.g., Aït-Sahalia and Xiu [1]. However, these studies assume a
high-dimensional setting. For a low-dimensional model, the estimator is not consistent; see Bai [4]. On the
other hand, Kusano and Uchida [20] proposed classical factor analysis for diffusion processes. Their method
works well for a low-dimensional model. However, to the best of our knowledge, there have been few studies
of SEM for high-frequency data. Oud and Jansen [24] and Driver et al. [9] considered SEM for stochastic
differential equations. Note that their model differs from the model in this paper. In the field of causal
inference, Hansen and Sokol [13] studied SEM for stochastic differential equations. However, their model is
the path analysis model, so their method cannot describe the relationships between latent variables.
Note that these studies do not assume that the data are sampled at high frequency. In contrast, we
propose SEM for diffusion processes based on high-frequency data.
In this paper, we assume that the volatilities of the diffusion processes and the loading matrices are
constant in time to simplify the discussion. We leave for future work the discussion of the model where
the volatilities of the diffusion processes and the loading matrices are time-varying. Furthermore, we do not
discuss the high-dimensional case. Bai [5] studied the asymptotic properties of factor analysis based on
maximum likelihood estimation for a high-dimensional model. We expect that our quasi-likelihood method
will also work well for a high-dimensional model. This investigation is left for future work.
The paper is organized as follows. In Section 2, notation and assumptions are introduced. In Section 3,
we study SEM for diffusion processes in the ergodic and non-ergodic cases. First, the asymptotic properties
of the realized covariance are examined. Next, we derive the quasi-likelihood estimators for parameters in
the SEM. It is shown that the estimators have good asymptotic properties. Furthermore, we propose the
goodness-of-fit test based on the quasi-likelihood ratio and investigate the asymptotic properties. In Section
4, we give examples and simulation studies to investigate the asymptotic performance of the results described
in Section 3. Section 5 is devoted to the proofs of theorems given in Section 3.
2. Notation and assumptions
For any vector $v$, $|v| = \sqrt{\operatorname{tr} vv^\top}$ and $v^{(i)}$ is the $i$-th element of $v$, where $\top$ denotes the transpose. For
any matrix $A$, $\|A\| = \sqrt{\operatorname{tr} AA^\top}$, and $A_{ij}$ is the $(i,j)$-th element of $A$. $I_p$ denotes the identity matrix of
size $p$. Define $O_{p\times q}$ as the $p\times q$ zero matrix. For any symmetric matrix $A \in \mathbb{R}^{p\times p}$, $\operatorname{vec} A$, $\operatorname{vech} A$ and $D_p$
denote the vectorization of $A$, the half-vectorization of $A$ and the $p^2\times\bar{p}$ duplication matrix, respectively.
Here, $\operatorname{vec} A = D_p \operatorname{vech} A$ and $\bar{p} = p(p+1)/2$; see, e.g., Harville [14]. For any matrix $A$, the Moore-Penrose
inverse of $A$ is denoted by $A^{+}$. If $A$ is a positive definite matrix, we write $A > 0$. For any positive
sequence $u_n$, $R : [0,\infty)\times\mathbb{R}^d \to \mathbb{R}$ denotes a function satisfying $|R(u_n, x)| \le u_n C(1+|x|)^C$ for some $C > 0$.
Let $\mathcal{F}_i^n = \sigma(W_{1,s}, W_{2,s}, W_{3,s}, W_{4,s},\ s \le t_i^n)$ for $i = 1,\dots,n$. Let $C^k_{\uparrow}(\mathbb{R}^d)$ be the space of all
functions $f$ satisfying the following conditions:
(i) $f$ is continuously differentiable with respect to $x \in \mathbb{R}^d$ up to order $k$.
(ii) $f$ and all its derivatives are of polynomial growth in $x \in \mathbb{R}^d$, i.e., $g$ is of polynomial growth in $x \in \mathbb{R}^d$
if $g(x) = R(1, x)$.
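The relation $\operatorname{vec} A = D_p \operatorname{vech} A$ can be checked numerically. The sketch below is only an illustration under stated conventions: the helper names are ours, $\operatorname{vec}$ is taken column-major, and $\operatorname{vech}$ stacks the lower triangle column by column, which is the usual convention but is an assumption here since the text does not spell it out.

```python
import numpy as np

def vech(A):
    """Stack the lower-triangular part of A column by column."""
    p = A.shape[0]
    return np.concatenate([A[j:, j] for j in range(p)])

def duplication_matrix(p):
    """Return the p^2 x p(p+1)/2 matrix D_p with vec A = D_p vech A for symmetric A."""
    D = np.zeros((p * p, p * (p + 1) // 2))
    col = 0
    for j in range(p):
        for i in range(j, p):
            D[j * p + i, col] = 1.0  # position of A[i, j] in the column-major vec
            D[i * p + j, col] = 1.0  # position of A[j, i]
            col += 1
    return D

A = np.array([[2.0, 1.0], [1.0, 3.0]])
D2 = duplication_matrix(2)
vec_A = A.flatten(order="F")             # column-major vectorization
same = np.allclose(D2 @ vech(A), vec_A)  # vec A = D_p vech A
back = np.linalg.pinv(D2) @ vec_A        # D_p^+ vec A recovers vech A
```

Because $D_p$ has full column rank, its Moore-Penrose inverse satisfies $D_p^{+} D_p = I_{\bar{p}}$, which is why `back` reproduces `vech(A)`.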
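$N_p(\mu, \Sigma)$ represents the $p$-dimensional normal random variable with mean $\mu \in \mathbb{R}^p$ and covariance matrix
$\Sigma \in \mathbb{R}^{p\times p}$. Let $\chi^2_r$ be the random variable which has the chi-squared distribution with $r$ degrees of freedom.
$\chi^2_r(\alpha)$ denotes an upper $\alpha$ point of the chi-squared distribution with $r$ degrees of freedom, where $0 \le \alpha \le 1$.
The symbols $\xrightarrow{P}$ and $\xrightarrow{d}$ express convergence in probability and convergence in distribution, respectively.
Let $\Sigma_{\xi\xi,m} = S_{1,m}S_{1,m}^\top$, $\Sigma_{\delta\delta,m} = S_{2,m}S_{2,m}^\top$, $\Sigma_{\varepsilon\varepsilon,m} = S_{3,m}S_{3,m}^\top$, $\Sigma_{\zeta\zeta,m} = S_{4,m}S_{4,m}^\top$ and
$\Psi_m = I_{k_{2,m}} - B_{0,m}$. Furthermore, we make the following assumptions.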
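[A1] (a) There exists a constant $C > 0$ such that for any $x, y \in \mathbb{R}^{k_{1,m}}$,
$$|B_{1,m}(x) - B_{1,m}(y)| \le C|x - y|.$$
(b) For all $\ell \ge 0$, $\sup_t E\bigl[|\xi_{m,t}|^{\ell}\bigr] < \infty$.
(c) $B_{1,m} \in C^4_{\uparrow}(\mathbb{R}^{k_{1,m}})$.
[A2] The diffusion process $\xi_{m,t}$ is ergodic with its invariant measure $\pi_{\xi_m}$: for any $\pi_{\xi_m}$-integrable function
$g$, it holds that
$$\frac{1}{T}\int_0^T g(\xi_{m,t})\,dt \xrightarrow{P} \int g(x)\,\pi_{\xi_m}(dx)$$
as $T \to \infty$.
[B1] (a) There exists a constant $C > 0$ such that for any $x, y \in \mathbb{R}^{p_1}$,
$$|B_{2,m}(x) - B_{2,m}(y)| \le C|x - y|.$$
(b) For all $\ell \ge 0$, $\sup_t E\bigl[|\delta_{m,t}|^{\ell}\bigr] < \infty$.
(c) $B_{2,m} \in C^4_{\uparrow}(\mathbb{R}^{p_1})$.
[B2] $\Sigma_{\delta\delta,m} > 0$.
[B3] The diffusion process $\delta_{m,t}$ is ergodic with its invariant measure $\pi_{\delta_m}$: for any $\pi_{\delta_m}$-integrable function
$g$, it holds that
$$\frac{1}{T}\int_0^T g(\delta_{m,t})\,dt \xrightarrow{P} \int g(x)\,\pi_{\delta_m}(dx)$$
as $T \to \infty$.
[C1] (a) There exists a constant $C > 0$ such that for any $x, y \in \mathbb{R}^{p_2}$,
$$|B_{3,m}(x) - B_{3,m}(y)| \le C|x - y|.$$
(b) For all $\ell \ge 0$, $\sup_t E\bigl[|\varepsilon_{m,t}|^{\ell}\bigr] < \infty$.
(c) $B_{3,m} \in C^4_{\uparrow}(\mathbb{R}^{p_2})$.
[C2] $\Sigma_{\varepsilon\varepsilon,m} > 0$.
[C3] The diffusion process $\varepsilon_{m,t}$ is ergodic with its invariant measure $\pi_{\varepsilon_m}$: for any $\pi_{\varepsilon_m}$-integrable function
$g$, it holds that
$$\frac{1}{T}\int_0^T g(\varepsilon_{m,t})\,dt \xrightarrow{P} \int g(x)\,\pi_{\varepsilon_m}(dx)$$
as $T \to \infty$.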
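[D1] (a) There exists a constant $C > 0$ such that for any $x, y \in \mathbb{R}^{k_{2,m}}$,
$$|B_{4,m}(x) - B_{4,m}(y)| \le C|x - y|.$$
(b) For all $\ell \ge 0$, $\sup_t E\bigl[|\zeta_{m,t}|^{\ell}\bigr] < \infty$.
(c) $B_{4,m} \in C^4_{\uparrow}(\mathbb{R}^{k_{2,m}})$.
[D2] The diffusion process $\zeta_{m,t}$ is ergodic with its invariant measure $\pi_{\zeta_m}$: for any $\pi_{\zeta_m}$-integrable function
$g$, it holds that
$$\frac{1}{T}\int_0^T g(\zeta_{m,t})\,dt \xrightarrow{P} \int g(x)\,\pi_{\zeta_m}(dx)$$
as $T \to \infty$.
[E] $\Psi_m$ is non-singular.
[F] $\operatorname{rank} \Lambda_{x_1,m} = k_{1,m}$.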
Remark 1. [A1], [B1], [C1] and [D1] are the standard assumptions for ergodic diffusion processes; see, for
example, Kessler [18]. [B2], [C2], [E] and [F] imply that $\Sigma_m(\theta_m)$ is non-singular. For details, see Lemma 3.
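As a brief worked step (our own rearrangement of (1.3), not a statement quoted from the paper), assumption [E] is what allows the structural equation to be solved for $\eta_{m,t}$. This gives a reduced form in which $X_{2,t}$ depends only on the latent diffusions $\xi_{m,t}$, $\zeta_{m,t}$ and $\varepsilon_{m,t}$:

```latex
\begin{align*}
\eta_{m,t} = B_{0,m}\eta_{m,t} + \Gamma_m\xi_{m,t} + \zeta_{m,t}
  &\;\Longleftrightarrow\;
  \Psi_m\eta_{m,t} = \Gamma_m\xi_{m,t} + \zeta_{m,t},
  \qquad \Psi_m = I_{k_{2,m}} - B_{0,m}, \\
\eta_{m,t} &= \Psi_m^{-1}\bigl(\Gamma_m\xi_{m,t} + \zeta_{m,t}\bigr), \\
X_{2,t} &= \Lambda_{x_2,m}\Psi_m^{-1}\bigl(\Gamma_m\xi_{m,t} + \zeta_{m,t}\bigr) + \varepsilon_{m,t}.
\end{align*}
```

Combined with (1.1), this expresses $X_t$ as a linear transformation of four independent diffusions with constant diffusion coefficients.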
3. Main theorems
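3.1. Ergodic case. In the LISREL model, we will estimate $\Lambda_{x_1,m}$, $\Lambda_{x_2,m}$, $\Gamma_m$, $\Psi_m$, $\Sigma_{\xi\xi,m}$, $\Sigma_{\delta\delta,m}$, $\Sigma_{\varepsilon\varepsilon,m}$
and $\Sigma_{\zeta\zeta,m}$. Note that some of these elements are assumed to be known in order to satisfy an identifiability
condition for parameter estimation. See Remark 4 for constraints on the parameter and the identifiability
condition. Set the parameter as $\theta_m \in \Theta_m$, where $\Theta_m \subset \mathbb{R}^{q_m}$ is a convex compact space. $\theta_m$ includes only
unknown and non-duplicated elements of $\Lambda_{x_1,m}$, $\Lambda_{x_2,m}$, $\Gamma_m$, $\Psi_m$, $\Sigma_{\xi\xi,m}$, $\Sigma_{\delta\delta,m}$, $\Sigma_{\varepsilon\varepsilon,m}$ and $\Sigma_{\zeta\zeta,m}$. Define the
covariance structure as
$$\Sigma_m(\theta_m) = \begin{pmatrix} \Sigma_{X_1X_1,m}(\theta_m) & \Sigma_{X_1X_2,m}(\theta_m) \\ \Sigma_{X_1X_2,m}(\theta_m)^\top & \Sigma_{X_2X_2,m}(\theta_m) \end{pmatrix}, \tag{3.1}$$
where
$$\begin{aligned}
\Sigma_{X_1X_1,m}(\theta_m) &= \Lambda_{x_1,m}\Sigma_{\xi\xi,m}\Lambda_{x_1,m}^\top + \Sigma_{\delta\delta,m}, \\
\Sigma_{X_1X_2,m}(\theta_m) &= \Lambda_{x_1,m}\Sigma_{\xi\xi,m}\Gamma_m^\top\Psi_m^{-1\top}\Lambda_{x_2,m}^\top, \\
\Sigma_{X_2X_2,m}(\theta_m) &= \Lambda_{x_2,m}\Psi_m^{-1}\bigl(\Gamma_m\Sigma_{\xi\xi,m}\Gamma_m^\top + \Sigma_{\zeta\zeta,m}\bigr)\Psi_m^{-1\top}\Lambda_{x_2,m}^\top + \Sigma_{\varepsilon\varepsilon,m}.
\end{aligned}$$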
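To make the covariance structure (3.1) concrete, the following minimal sketch assembles $\Sigma_m(\theta_m)$ from its component matrices. The function name and the numerical values are hypothetical and serve only as an illustration; the three blocks follow the formulas for $\Sigma_{X_1X_1,m}$, $\Sigma_{X_1X_2,m}$ and $\Sigma_{X_2X_2,m}$ given above.

```python
import numpy as np

def implied_covariance(Lx1, Lx2, Gam, Psi, S_xi, S_delta, S_eps, S_zeta):
    """Assemble the (p1 + p2) x (p1 + p2) covariance structure (3.1)."""
    Psi_inv = np.linalg.inv(Psi)  # requires Psi to be non-singular
    S11 = Lx1 @ S_xi @ Lx1.T + S_delta
    S12 = Lx1 @ S_xi @ Gam.T @ Psi_inv.T @ Lx2.T
    S22 = Lx2 @ Psi_inv @ (Gam @ S_xi @ Gam.T + S_zeta) @ Psi_inv.T @ Lx2.T + S_eps
    return np.block([[S11, S12], [S12.T, S22]])

# Hypothetical values with p1 = 4, k1 = 2, p2 = 2, k2 = 2.
Lx1 = np.array([[1.0, 0.0], [0.5, 0.0], [0.0, 1.0], [0.0, 0.8]])
Lx2 = np.eye(2)
Gam = np.array([[0.6, 0.0], [0.0, 0.4]])
Psi = np.eye(2) - np.array([[0.0, 0.3], [0.0, 0.0]])  # Psi = I - B0
Sigma = implied_covariance(Lx1, Lx2, Gam, Psi,
                           S_xi=np.eye(2), S_delta=0.25 * np.eye(4),
                           S_eps=0.25 * np.eye(2), S_zeta=0.09 * np.eye(2))
```

With positive definite $\Sigma_{\delta\delta,m}$ and $\Sigma_{\varepsilon\varepsilon,m}$, as in this example, the assembled matrix is symmetric and positive definite, which can be confirmed with `np.linalg.eigvalsh(Sigma)`.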
To estimate (3.1), we use the realized covariance as follows:
$$Q_{XX} = \frac{1}{T}\sum_{i=1}^{n}\bigl(X_{t_i^n} - X_{t_{i-1}^n}\bigr)\bigl(X_{t_i^n} - X_{t_{i-1}^n}\bigr)^\top.$$
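The realized covariance is a sum of outer products of increments divided by $T$, and can be sketched in a few lines. The function name is ours, and the input layout (one row per observation time, with the first row holding $X_{t_0^n}$ so that $n$ increments are available) is an assumption made for this illustration.

```python
import numpy as np

def realized_covariance(X, T):
    """Q_XX = (1/T) * sum of outer products of increments.

    X has shape (n + 1, p); row i holds the observation at time t_i (assumed layout).
    """
    dX = np.diff(X, axis=0)  # increments X_{t_i} - X_{t_{i-1}}, shape (n, p)
    return dX.T @ dX / T

# Tiny deterministic example: n = 2 increments, T = 2.
X = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 2.0]])
Q = realized_covariance(X, T=2.0)
# Q equals [[0.5, 0.0], [0.0, 2.0]]
```

Writing the sum as `dX.T @ dX` avoids an explicit loop over the $n$ increments.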
Let
$$W_m(\theta_m) = 2D_p^{+}\bigl(\Sigma_m(\theta_m)\otimes\Sigma_m(\theta_m)\bigr)D_p^{+\top}.$$
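As a numerical illustration of $W_m(\theta_m)$, the sketch below computes $2D_p^{+}(\Sigma\otimes\Sigma)D_p^{+\top}$ for a given symmetric matrix $\Sigma$. It reads $D_p^{+}$ as the Moore-Penrose inverse of the $p^2\times\bar{p}$ duplication matrix, so the result is $\bar{p}\times\bar{p}$; this ordering of the factors is the one that is dimensionally consistent with the notation of Section 2. The function names are ours, and the column-major, lower-triangle conventions for $\operatorname{vec}$ and $\operatorname{vech}$ are assumptions.

```python
import numpy as np

def duplication_matrix(p):
    """p^2 x p(p+1)/2 matrix D_p with vec A = D_p vech A (column-major vec)."""
    D = np.zeros((p * p, p * (p + 1) // 2))
    col = 0
    for j in range(p):
        for i in range(j, p):
            D[j * p + i, col] = 1.0
            D[i * p + j, col] = 1.0
            col += 1
    return D

def weight_matrix(Sigma):
    """W = 2 D_p^+ (Sigma kron Sigma) D_p^{+T}, a pbar x pbar matrix."""
    p = Sigma.shape[0]
    D_plus = np.linalg.pinv(duplication_matrix(p))  # Moore-Penrose inverse
    return 2.0 * D_plus @ np.kron(Sigma, Sigma) @ D_plus.T

W = weight_matrix(np.eye(2))
# For Sigma = I_2 this gives diag(2, 1, 2).
```

In the identity case the diagonal entries 2, 1, 2 correspond to the elements of $\operatorname{vech}$ in the order $(1,1)$, $(2,1)$, $(2,2)$.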
For the realized covariance, the following theorem holds.