CLT for random quadratic forms based on sample means and sample covariance matrices

arXiv:2210.11215v1 [math.ST] 20 Oct 2022
Wenzhi Yang$^1$, Yiming Liu$^2$, Guangming Pan$^3$ and Wang Zhou$^4$
$^1$ School of Big Data and Statistics, Anhui University, e-mail: wzyang@ahu.edu.cn
$^2$ School of Economics, Jinan University, e-mail: liuy0135@e.ntu.edu.sg
$^3$ School of Physical and Mathematical Sciences, Nanyang Technological University, e-mail: GMPAN@ntu.edu.sg
$^4$ Department of Statistics and Data Science, National University of Singapore, e-mail: wangzhou@nus.edu.sg
Abstract: In this paper, we use a dimension reduction technique to study the central limit theory (CLT) of random quadratic forms based on sample means and sample covariance matrices. Specifically, we use a matrix denoted by $\mathbf{U}_{p\times q}$ to map $q$-dimensional sample vectors to a $p$-dimensional subspace, where $q \ge p$ or $q \gg p$. Under the condition of $p/n \to 0$ as $(p, n) \to \infty$, we obtain the CLT of random quadratic forms for the sample means and sample covariance matrices.
MSC2020 subject classifications: Primary 62E20.
Keywords and phrases: Random quadratic forms, Sample means, Sample covariance matrices, Central limit theory.
1. Introduction
Consider the multivariate model
$$\mathbf{y}_j = \boldsymbol{\mu} + \boldsymbol{\Gamma}\mathbf{x}_j, \quad 1 \le j \le n, \tag{1.1}$$
where $\boldsymbol{\mu}$ is a mean vector in $\mathbb{R}^q$, $\boldsymbol{\Gamma}$ is a $q \times m$ matrix with $q \le m$, $\boldsymbol{\Sigma}_q = \boldsymbol{\Gamma}\boldsymbol{\Gamma}^\top$ is a positive definite covariance matrix (denoted by $\boldsymbol{\Sigma}_q > 0$), and $\mathbf{x}_1, \ldots, \mathbf{x}_n$ are independent and identically distributed (i.i.d.) $m$-dimensional real random vectors with mean vector $E\mathbf{x}_1 = \mathbf{0}_m$ and $\mathrm{Cov}(\mathbf{x}_1) = \mathbf{I}_m$. So $\mathbf{y}_1, \ldots, \mathbf{y}_n$ are $q$-dimensional real random vectors, and the sample mean $\bar{\mathbf{y}} = \frac{1}{n}\sum_{i=1}^{n}\mathbf{y}_i$ and the sample covariance matrix $\mathbf{S}_n = \frac{1}{n}\sum_{i=1}^{n}(\mathbf{y}_i - \bar{\mathbf{y}})(\mathbf{y}_i - \bar{\mathbf{y}})^\top$ play a central role in tests on the mean vector and the covariance matrix (see Anderson [1]). Here, $\top$ denotes the transpose. However, as the dimension increases, these tests run into problems. For example, when $q > n - 1$, the inverse of $\mathbf{S}_n$ does not exist, so Hotelling's $T^2$ test obtained by Hotelling [13],
$$T^2 = n(\bar{\mathbf{y}} - \boldsymbol{\mu})^\top \mathbf{S}_n^{-1}(\bar{\mathbf{y}} - \boldsymbol{\mu}), \tag{1.2}$$
fails to test the high dimensional mean. Many papers study high dimensional means and covariance matrices, for example Bai et al. [2], Bai and Saranadasa [3], Bai and Silverstein [5,6], Chen et al. [11], Chen and Qin [12], Pan and Zhou [14], Srivastava [15], Srivastava and Du [16], Srivastava and Li [17], etc.
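Inspired by dimension reduction techniques such as principal component analysis, we aim to project the observations into a low dimensional subspace through a reduction matrix $\mathbf{U}_{p\times q}$ ($p \le q$). The projected observations can be written in vector form as
$$\mathbf{z}_j = \mathbf{U}\mathbf{y}_j = \mathbf{U}\boldsymbol{\mu} + \mathbf{U}\boldsymbol{\Gamma}\mathbf{x}_j, \quad 1 \le j \le n, \tag{1.3}$$
where $\mathbf{U} = \mathbf{U}_{p\times q}$ and $\boldsymbol{\Gamma} = \boldsymbol{\Gamma}_{q\times m}$ are nonrandom matrices and $\mathbf{x}_1, \ldots, \mathbf{x}_n$ are i.i.d. $m$-dimensional real random vectors with mean vector $\mathbf{0}_m$ and covariance matrix $\mathbf{I}_m$. In view of (1.3), the centered sample covariance matrix is defined by
$$\mathbf{S}_n = \frac{1}{n}\sum_{j=1}^{n}(\mathbf{z}_j - \bar{\mathbf{z}})(\mathbf{z}_j - \bar{\mathbf{z}})^\top = \frac{1}{n}\sum_{j=1}^{n}\mathbf{U}\boldsymbol{\Gamma}(\mathbf{x}_j - \bar{\mathbf{x}})(\mathbf{x}_j - \bar{\mathbf{x}})^\top\boldsymbol{\Gamma}^\top\mathbf{U}^\top, \tag{1.4}$$
where $\bar{\mathbf{z}} = n^{-1}\sum_{j=1}^{n}\mathbf{z}_j = \mathbf{U}\boldsymbol{\mu} + \mathbf{U}\boldsymbol{\Gamma}\bar{\mathbf{x}}$ and $\bar{\mathbf{x}} = n^{-1}\sum_{j=1}^{n}\mathbf{x}_j$.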
This is the first paper in a series of two papers. In this paper, we study the central limit theory (CLT) of random quadratic forms involving sample means and sample covariance matrices when $p/n \to 0$ as $(p, n) \to \infty$, while $q$ can be arbitrarily large. For the details, please see Theorem 1. In our second paper, we use Theorem 1 to derive the CLT for Hotelling's $T^2$ test when $p/n \to 0$. To investigate this limit theory, we recall some basic definitions from random matrix theory.
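Let $\mathbf{A} = \mathbf{A}_{p\times p}$ be any $p \times p$ square matrix with real eigenvalues denoted by $\lambda_1 \ge \cdots \ge \lambda_p$. The empirical spectral distribution (ESD) of $\mathbf{A}$ is defined by
$$F^{\mathbf{A}}(x) = \frac{1}{p}\sum_{j=1}^{p} I(\lambda_j \le x), \quad x \in \mathbb{R},$$
where $I(\cdot)$ is the indicator function. The Stieltjes transform of $F^{\mathbf{A}}$ is given by
$$m_{F^{\mathbf{A}}}(z) = \int \frac{1}{x - z}\, dF^{\mathbf{A}}(x),$$
where $z = u + iv \in \mathbb{C}^+ \equiv \{z \in \mathbb{C}, \Im(z) > 0\}$. Let $\mathbf{X}_n = (\mathbf{x}_1, \ldots, \mathbf{x}_n)$ and $\boldsymbol{\Sigma}_m = \boldsymbol{\Gamma}^\top\boldsymbol{\Gamma}$. The famous Marčenko-Pastur (M-P) law states that the ESD of $\mathbf{S}_n = \frac{1}{n}\boldsymbol{\Gamma}\mathbf{X}_n\mathbf{X}_n^\top\boldsymbol{\Gamma}^\top$, i.e., $F^{\mathbf{S}_n}(x)$, converges weakly to a nonrandom probability distribution function $F^{c,H}(x)$ whose Stieltjes transform is determined by the equation
$$m(z) = \int \frac{1}{\lambda(1 - c - czm(z)) - z}\, dH(\lambda)$$
for each $z \in \mathbb{C}^+$ as $m/n \to c \in (0, \infty)$ and $n \to \infty$, where $H$ is the limiting spectral distribution of $\boldsymbol{\Gamma}^\top\boldsymbol{\Gamma}$. One can refer to Bai and Silverstein [6] for more details.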
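As a small numerical illustration of the M-P equation, the sketch below assumes the special case $H = \delta_1$ (all population eigenvalues equal to 1), chosen here only for concreteness. In that case the equation reduces to the quadratic $czm^2 + (z + c - 1)m + 1 = 0$, and the code keeps the root with positive imaginary part. The function name `mp_stieltjes` and the values of `z` and `c` are illustrative and not taken from the paper.

```python
import cmath


def mp_stieltjes(z: complex, c: float) -> complex:
    """Solve m = 1 / ((1 - c - c*z*m) - z), the M-P equation with H = delta_1.

    The equation is equivalent to c*z*m**2 + (z + c - 1)*m + 1 = 0.
    Returns the root with positive imaginary part (z must have Im z > 0).
    """
    a, b = c * z, z + c - 1.0
    disc = cmath.sqrt(b * b - 4.0 * a)
    roots = [(-b + disc) / (2.0 * a), (-b - disc) / (2.0 * a)]
    upper = [m for m in roots if m.imag > 0]
    if len(upper) != 1:
        raise ValueError("expected exactly one root in the upper half-plane")
    return upper[0]


z = 1.0 + 1.0j

# Check that the selected root really satisfies the fixed-point equation.
m_half = mp_stieltjes(z, c=0.5)
residual = abs(m_half - 1.0 / ((1.0 - 0.5 - 0.5 * z * m_half) - z))

# As c -> 0 the solution approaches 1/(1 - z), the Stieltjes transform
# of a point mass at 1.
m_small = mp_stieltjes(z, c=1e-4)
gap = abs(m_small - 1.0 / (1.0 - z))
print(residual, gap)
```

The second check connects to the regime studied here: when the dimension-to-sample-size ratio tends to 0, the limiting spectral distribution degenerates to the point mass $H(x) = I(1 \le x)$ used in Section 3.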
The rest of this paper is organized as follows. Section 2 presents the CLT for random quadratic forms with dimensionality reduction, and its proof is presented in Section 3. Some auxiliary proofs are presented in Appendix A.1-A.2.
2. CLT for random quadratic forms
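Assumption:
(A.1) Let $\mathbf{X}_n = (\mathbf{x}_1, \ldots, \mathbf{x}_n) = (X_{ij})$ be an $m \times n$ matrix whose entries are i.i.d. real random variables with $EX_{11} = 0$, $\mathrm{Var}(X_{11}) = 1$ and $EX_{11}^4 < \infty$.
(A.2) For $p \le q \le m$, let $\mathbf{U} = \mathbf{U}_{p\times q}$, $\boldsymbol{\Gamma} = \boldsymbol{\Gamma}_{q\times m}$ and $\boldsymbol{\Sigma}_p$ be nonrandom matrices satisfying
$$\mathbf{U}\boldsymbol{\Gamma}\boldsymbol{\Gamma}^\top\mathbf{U}^\top = \boldsymbol{\Sigma}_p > 0.$$
(A.3) Let $c_n = p/n$ and $c_n = O(n^{-\eta})$ for some $\eta \in (0, 1)$.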
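In the following, we will study the limiting distributions of random quadratic forms involving sample means and sample covariance matrices. Since $\boldsymbol{\Sigma}_p > 0$, $\boldsymbol{\Sigma}_p^{-1/2}$ exists. So we take the transform
$$\tilde{\boldsymbol{\mu}} = \boldsymbol{\Sigma}_p^{-1/2}\mathbf{U}\boldsymbol{\mu}, \quad \mathbf{B} = \boldsymbol{\Sigma}_p^{-1/2}\mathbf{U}\boldsymbol{\Gamma}, \tag{2.1}$$
$$\tilde{\mathbf{z}}_j = \boldsymbol{\Sigma}_p^{-1/2}\mathbf{z}_j = \boldsymbol{\Sigma}_p^{-1/2}\mathbf{U}\mathbf{y}_j = \tilde{\boldsymbol{\mu}} + \mathbf{B}\mathbf{x}_j, \quad 1 \le j \le n, \tag{2.2}$$
$$\tilde{\mathbf{S}}_n = \frac{1}{n}\sum_{j=1}^{n}(\tilde{\mathbf{z}}_j - \bar{\tilde{\mathbf{z}}})(\tilde{\mathbf{z}}_j - \bar{\tilde{\mathbf{z}}})^\top = \frac{1}{n}\sum_{j=1}^{n}\mathbf{B}(\mathbf{x}_j - \bar{\mathbf{x}})(\mathbf{x}_j - \bar{\mathbf{x}})^\top\mathbf{B}^\top, \tag{2.3}$$
where $\bar{\mathbf{x}} = \frac{1}{n}\sum_{j=1}^{n}\mathbf{x}_j$ and
$$\bar{\tilde{\mathbf{z}}} = \frac{1}{n}\sum_{j=1}^{n}\tilde{\mathbf{z}}_j = \boldsymbol{\Sigma}_p^{-1/2}\mathbf{U}\boldsymbol{\mu} + \frac{1}{n}\sum_{j=1}^{n}\boldsymbol{\Sigma}_p^{-1/2}\mathbf{U}\boldsymbol{\Gamma}\mathbf{x}_j = \tilde{\boldsymbol{\mu}} + \mathbf{B}\bar{\mathbf{x}}. \tag{2.4}$$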
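The point of the transform (2.1)-(2.4) is that the transformed observations have identity covariance. The short derivation below is spelled out as a reading aid and is not taken from the source. It uses only (A.2), $\mathrm{Cov}(\mathbf{x}_1) = \mathbf{I}_m$, the independence of the $\mathbf{x}_j$, the symmetry of $\boldsymbol{\Sigma}_p^{-1/2}$, and $\mathbf{B} = \boldsymbol{\Sigma}_p^{-1/2}\mathbf{U}\boldsymbol{\Gamma}$ as in (2.1).

```latex
\begin{align*}
\mathbf{B}\mathbf{B}^\top
  &= \boldsymbol{\Sigma}_p^{-1/2}\,\mathbf{U}\boldsymbol{\Gamma}\boldsymbol{\Gamma}^\top\mathbf{U}^\top\,\boldsymbol{\Sigma}_p^{-1/2}
   = \boldsymbol{\Sigma}_p^{-1/2}\,\boldsymbol{\Sigma}_p\,\boldsymbol{\Sigma}_p^{-1/2}
   = \mathbf{I}_p, \\
\mathrm{Cov}(\tilde{\mathbf{z}}_j)
  &= \mathbf{B}\,\mathrm{Cov}(\mathbf{x}_j)\,\mathbf{B}^\top
   = \mathbf{B}\mathbf{B}^\top = \mathbf{I}_p, \\
\mathrm{Cov}(\bar{\tilde{\mathbf{z}}})
  &= \tfrac{1}{n}\,\mathbf{I}_p,
  \qquad
  E\,\|\bar{\tilde{\mathbf{z}}} - \tilde{\boldsymbol{\mu}}\|^2
   = \tfrac{p}{n} = c_n .
\end{align*}
```

The last line explains why $c_n$ is the natural centering for the squared norm $\|\bar{\tilde{\mathbf{z}}} - \tilde{\boldsymbol{\mu}}\|^2$.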
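Let $\lambda_1, \lambda_2, \ldots, \lambda_p$ denote the eigenvalues of $\tilde{\mathbf{S}}_n$ defined by (2.3). For any analytic function $f(\cdot)$, define
$$f(\tilde{\mathbf{S}}_n) = \mathbf{V}^\top \mathrm{diag}(f(\lambda_1), \ldots, f(\lambda_p))\mathbf{V},$$
where $\mathbf{V}^\top \mathrm{diag}(\lambda_1, \ldots, \lambda_p)\mathbf{V}$ denotes the spectral decomposition of $\tilde{\mathbf{S}}_n$.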
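Theorem 1. Let assumptions (A.1)-(A.3) be satisfied. Assume that $g(x)$ is a function with a continuous first derivative in a neighborhood of 0 such that $g'(0) \ne 0$, and that $f(x)$ is analytic on an open region containing the interval
$$[1 - \delta, 1 + \delta], \quad \delta \in (0, 1), \tag{2.5}$$
and satisfies $f(1) \ne 0$. Denote $\tilde{\boldsymbol{\mu}} = \boldsymbol{\Sigma}_p^{-1/2}\mathbf{U}\boldsymbol{\mu}$ and
$$X_n = \frac{n}{\sqrt{p}}\,c_n\left[\frac{(\bar{\tilde{\mathbf{z}}} - \tilde{\boldsymbol{\mu}})^\top f(\tilde{\mathbf{S}}_n)(\bar{\tilde{\mathbf{z}}} - \tilde{\boldsymbol{\mu}})}{\|\bar{\tilde{\mathbf{z}}} - \tilde{\boldsymbol{\mu}}\|^2} - f(1)\right], \quad Y_n = \frac{n}{\sqrt{p}}\left[g\big((\bar{\tilde{\mathbf{z}}} - \tilde{\boldsymbol{\mu}})^\top(\bar{\tilde{\mathbf{z}}} - \tilde{\boldsymbol{\mu}})\big) - g(c_n)\right],$$
where $\tilde{\mathbf{S}}_n$ and $\bar{\tilde{\mathbf{z}}}$ are defined by (2.3) and (2.4), respectively. As $\min(p, n) \to \infty$,
$$(X_n, Y_n) \xrightarrow{d} (X, Y), \tag{2.6}$$
where $(X, Y) \sim N(\mathbf{0}, \boldsymbol{\Gamma}_1)$ with
$$\boldsymbol{\Gamma}_1 = \begin{pmatrix} 2f^2(1) & 2g'(0)f(1) \\ 2g'(0)f(1) & 2(g'(0))^2 \end{pmatrix}.$$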
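The $Y_n$ component of Theorem 1, $Y_n = \frac{n}{\sqrt{p}}\big[g(\|\mathbf{B}\bar{\mathbf{x}}\|^2) - g(c_n)\big]$, can be illustrated by simulation in a deliberately simplified setting. The sketch below assumes Gaussian entries, so that $\mathbf{B}\bar{\mathbf{x}}$ is exactly $N(\mathbf{0}, \mathbf{I}_p/n)$ because $\mathbf{B}\mathbf{B}^\top = \mathbf{I}_p$, and takes $g(x) = x$, for which $g'(0) = 1$ and the limiting variance $2(g'(0))^2$ equals 2. The sample sizes, the seed and the function name are arbitrary illustrative choices, and the sketch does not check the joint statement for $(X_n, Y_n)$.

```python
import math
import random


def simulate_yn(p: int, n: int, rng: random.Random) -> float:
    """One draw of Y_n = (n / sqrt(p)) * (||B x_bar||^2 - p/n) with g(x) = x.

    Assumes Gaussian data, so B x_bar ~ N(0, I_p / n) and its coordinates
    can be sampled directly without forming B or the individual x_j.
    """
    sd = 1.0 / math.sqrt(n)
    sq_norm = sum(rng.gauss(0.0, sd) ** 2 for _ in range(p))
    return n / math.sqrt(p) * (sq_norm - p / n)


rng = random.Random(0)
p, n, reps = 50, 5000, 2000          # p/n = 0.01, a small ratio
draws = [simulate_yn(p, n, rng) for _ in range(reps)]

mean = sum(draws) / reps
var = sum((y - mean) ** 2 for y in draws) / (reps - 1)
print(f"sample mean {mean:.3f}, sample variance {var:.3f} (theory: 0 and 2)")
```

Sampling $\mathbf{B}\bar{\mathbf{x}}$ directly keeps the example short; a fuller experiment would generate non-Gaussian $\mathbf{x}_j$ satisfying (A.1) and form $\mathbf{B}$ explicitly.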
Remark 1. The conditions in (A.1) are commonly used in random matrix theory (see [5,6]). Condition (A.2) requires the reduced dimensional covariance matrix to be positive definite. When $q/n \to c \in (0, 1)$, Pan and Zhou [14] obtained the CLT for random quadratic forms based on sample means and sample covariance matrices and gave its application to Hotelling's $T^2$ test. In this paper, we use the matrix $\mathbf{U}_{p\times q}$ to map $q$-dimensional sample vectors to a $p$-dimensional subspace, where $q \ge p$ or $q \gg p$. Under the condition of $p/n \to 0$ as $(p, n) \to \infty$, we obtain the CLT of random quadratic forms for the sample means and sample covariance matrices. In our second paper, we will use Theorem 1 to derive the CLT for Hotelling's $T^2$ test when $p/n \to 0$.
3. Outline of the proofs
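The proof of Theorem 1 relies on Lemma 1 below, which deals with the asymptotic joint distribution of
$$X_n(z) = \frac{n}{\sqrt{p}}\,c_n\left[\frac{\bar{\mathbf{x}}^\top\mathbf{B}^\top(\tilde{\mathbf{S}}_n - z\mathbf{I}_p)^{-1}\mathbf{B}\bar{\mathbf{x}}}{\|\mathbf{B}\bar{\mathbf{x}}\|^2} - m(z)\right],$$
$$Y_n = \frac{n}{\sqrt{p}}\left[g(\bar{\mathbf{x}}^\top\mathbf{B}^\top\mathbf{B}\bar{\mathbf{x}}) - g(c_n)\right],$$
where $\mathbf{B} = \boldsymbol{\Sigma}_p^{-1/2}\mathbf{U}\boldsymbol{\Gamma}$, $c_n = p/n$, $m(z) = \int \frac{1}{x - z}\, dH(x)$ and $H(x) = I(1 \le x)$ for $x \in \mathbb{R}$. The stochastic process $X_n(z)$ is defined on a contour $\mathcal{C}$ as follows. Let $v_0 > 0$ be arbitrary and set $\mathcal{C}_u = \{u + iv_0 : u \in [u_l, u_r]\}$, where $u_l = 1 - \delta$ and $u_r = 1 + \delta$ for some $\delta \in (0, 1)$. Then, we define
$$\mathcal{C}^+ = \{u_l + iv : v \in [0, v_0]\} \cup \mathcal{C}_u \cup \{u_r + iv : v \in [0, v_0]\}$$
and denote by $\mathcal{C}^-$ the reflection of $\mathcal{C}^+$ about the real axis. Then, let $\mathcal{C} = \mathcal{C}^+ \cup \mathcal{C}^-$. Let
$$\tilde{\mathcal{S}}_n = \frac{1}{n}\sum_{j=1}^{n}\mathbf{B}\mathbf{x}_j\mathbf{x}_j^\top\mathbf{B}^\top. \tag{3.1}$$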
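The uncentered matrix in (3.1) and the centered matrix in (2.3) differ only by the rank-one term $\mathbf{B}\bar{\mathbf{x}}\bar{\mathbf{x}}^\top\mathbf{B}^\top$, which follows from expanding the centered sum. The check below verifies this on made-up data with $p = 2$ and $\mathbf{B} = \mathbf{I}_2$; both choices, the data and the helper names are assumptions made only for illustration.

```python
def outer(a, b):
    """Outer product of two vectors given as lists."""
    return [[ai * bj for bj in b] for ai in a]


def mat_mean(mats):
    """Entrywise average of a list of equally sized matrices."""
    k = len(mats)
    rows, cols = len(mats[0]), len(mats[0][0])
    return [[sum(m[i][j] for m in mats) / k for j in range(cols)]
            for i in range(rows)]


# Hypothetical sample x_1, ..., x_4 in R^2 (B is taken to be the identity).
xs = [[1.0, 2.0], [0.0, -1.0], [3.0, 1.0], [-2.0, 0.5]]
n = len(xs)
x_bar = [sum(x[i] for x in xs) / n for i in range(2)]
devs = [[x[0] - x_bar[0], x[1] - x_bar[1]] for x in xs]

uncentered = mat_mean([outer(x, x) for x in xs])   # form of (3.1)
centered = mat_mean([outer(d, d) for d in devs])   # form of (2.3)
rank_one = outer(x_bar, x_bar)

# Largest entrywise deviation from: uncentered = centered + x_bar x_bar^T.
max_gap = max(abs(uncentered[i][j] - centered[i][j] - rank_one[i][j])
              for i in range(2) for j in range(2))
print(max_gap)
```

Because the two matrices differ by a rank-one perturbation, resolvents of one can be expressed through resolvents of the other, which is what makes the replacement convenient.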
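By random matrix theory, $\tilde{\mathbf{S}}_n$ defined by (2.3) can be replaced by $\tilde{\mathcal{S}}_n$ defined by (3.1). Since it is difficult to control the spectral norm of $(\tilde{\mathbf{S}}_n - z\mathbf{I}_p)^{-1}$ or $(\tilde{\mathcal{S}}_n - z\mathbf{I}_p)^{-1}$ on the whole contour $\mathcal{C}$ (for example, at $v = 0$), we define a truncated version $\hat{X}_n(z)$ of $X_n(z)$ (see Bai and Silverstein [5]). For some $\vartheta \in (0, 1)$, we choose a sequence of positive numbers $\{\rho_n\}$ satisfying
$$\rho_n \to 0 \quad \text{and} \quad \rho_n \ge n^{-\vartheta}. \tag{3.2}$$
Let $\mathcal{C}_l = \{u_l + iv : v \in [n^{-1}\rho_n, v_0]\}$ and $\mathcal{C}_r = \{u_r + iv : v \in [n^{-1}\rho_n, v_0]\}$. Write $\mathcal{C}_n^+ = \mathcal{C}_l \cup \mathcal{C}_u \cup \mathcal{C}_r$. Consequently, for $z = u + iv \in \mathcal{C}$, a truncated process $\hat{X}_n(z)$ is defined as follows:
$$\hat{X}_n(z) = \begin{cases} X_n(z), & \text{if } z \in \mathcal{C}_n^+ \cup \mathcal{C}_n^-; \\[4pt] \dfrac{nv + \rho_n}{2\rho_n}X_n(z_{r1}) + \dfrac{\rho_n - nv}{2\rho_n}X_n(z_{r2}), & \text{if } u = u_r,\ v \in \left[-\dfrac{\rho_n}{n}, \dfrac{\rho_n}{n}\right]; \\[8pt] \dfrac{nv + \rho_n}{2\rho_n}X_n(z_{l1}) + \dfrac{\rho_n - nv}{2\rho_n}X_n(z_{l2}), & \text{if } u = u_l > 0,\ v \in \left[-\dfrac{\rho_n}{n}, \dfrac{\rho_n}{n}\right]. \end{cases} \tag{3.3}$$
Here, $z_{r1} = u_r + i\frac{\rho_n}{n}$, $z_{r2} = u_r - i\frac{\rho_n}{n}$, $z_{l1} = u_l + i\frac{\rho_n}{n}$, $z_{l2} = u_l - i\frac{\rho_n}{n}$, and $\mathcal{C}_n^-$ is the reflection of $\mathcal{C}_n^+$ about the real axis.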
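The definition (3.3) replaces $X_n(z)$ on the two short vertical segments near the real axis by a linear interpolation between its values at heights $\pm\rho_n/n$. A minimal sketch of that rule follows; the callable `x_n`, the function name `truncated_process`, the stand-in process and the numerical values are placeholders, not anything from the paper.

```python
import math
from typing import Callable


def truncated_process(x_n: Callable[[complex], complex], z: complex,
                      n: int, rho_n: float, u_l: float, u_r: float) -> complex:
    """Evaluate the truncated process of (3.3) at a point z on the contour.

    Where |Im z| >= rho_n / n the process is x_n itself. On the vertical
    sides with |Im z| < rho_n / n it is the linear interpolation between
    the values at u + i*rho_n/n and u - i*rho_n/n.
    """
    u, v = z.real, z.imag
    h = rho_n / n
    if abs(v) >= h:
        return x_n(z)
    if not (math.isclose(u, u_l) or math.isclose(u, u_r)):
        raise ValueError("z is not on a vertical side of the contour")
    w_up = (n * v + rho_n) / (2.0 * rho_n)     # weight on the upper endpoint
    w_down = (rho_n - n * v) / (2.0 * rho_n)   # weight on the lower endpoint
    return w_up * x_n(complex(u, h)) + w_down * x_n(complex(u, -h))


def stand_in(z: complex) -> complex:
    """Placeholder for X_n: the deterministic function 1 / (1 - z)."""
    return 1.0 / (1.0 - z)


n, rho_n, u_l, u_r = 100, 0.1, 0.5, 1.5
mid = truncated_process(stand_in, complex(u_r, 0.0), n, rho_n, u_l, u_r)
top = truncated_process(stand_in, complex(u_r, rho_n / n), n, rho_n, u_l, u_r)
print(mid, top)
```

The two weights sum to 1 and reduce to $(1, 0)$ and $(0, 1)$ at $v = \pm\rho_n/n$, so the truncated process is continuous along the contour and never evaluates the resolvent closer than $\rho_n/n$ to the real axis.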
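We now give the asymptotic joint distribution of $(\hat{X}_n(z), Y_n)$ in Lemma 1.
Lemma 1. Under the conditions of Theorem 1, for $z \in \mathcal{C}$, we have
$$(\hat{X}_n(z), Y_n) \xrightarrow{d} (X(z), Y), \tag{3.4}$$
where $(X(z), Y) \sim N(\mathbf{0}, \boldsymbol{\Gamma}_2)$,
$$\boldsymbol{\Gamma}_2 = \begin{pmatrix} \dfrac{2}{(1 - z)^2} & \dfrac{2g'(0)}{1 - z} \\[8pt] \dfrac{2g'(0)}{1 - z} & 2(g'(0))^2 \end{pmatrix}.$$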
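To transfer Lemma 1 to Theorem 1, we introduce a new ESD function
$$F_2^{\tilde{\mathbf{S}}_n}(x) = \sum_{j=1}^{p} t_j^2 I(\lambda_j \le x), \quad x \in \mathbb{R},$$
where $\mathbf{t} = (t_1, \ldots, t_p)^\top = \mathbf{V}\mathbf{B}\bar{\mathbf{x}}/\|\mathbf{B}\bar{\mathbf{x}}\|$, $\mathbf{B} = \boldsymbol{\Sigma}_p^{-1/2}\mathbf{U}\boldsymbol{\Gamma}$ and $\mathbf{V}$ is the eigenvector matrix of $\tilde{\mathbf{S}}_n$ defined by (2.3) (see Bai et al. [2]). Following Theorem 1 in Bai et al. [2] and Remark 3 in Pan and Zhou [14], one can similarly obtain that, as $p/n \to 0$,
$$F_2^{\tilde{\mathbf{S}}_n}(x) \to H(x), \quad \text{a.s.},$$
where $H(x) = I(1 \le x)$ for $x \in \mathbb{R}$. Then, by the analyticity of $f(x)$, the ratio $\frac{\bar{\mathbf{x}}^\top\mathbf{B}^\top f(\tilde{\mathbf{S}}_n)\mathbf{B}\bar{\mathbf{x}}}{\|\mathbf{B}\bar{\mathbf{x}}\|^2}$ in Theorem 1 is transferred to $\frac{\bar{\mathbf{x}}^\top\mathbf{B}^\top(\tilde{\mathbf{S}}_n - z\mathbf{I}_p)^{-1}\mathbf{B}\bar{\mathbf{x}}}{\|\mathbf{B}\bar{\mathbf{x}}\|^2}$ and the Stieltjes transform of $F_2^{\tilde{\mathbf{S}}_n}(x)$.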
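Let $\mathbf{A}_n^{-1}(z) = (\tilde{\mathcal{S}}_n - z\mathbf{I}_p)^{-1}$, where $\tilde{\mathcal{S}}_n$ is the uncentered matrix defined by (3.1). Note that
$$\frac{\bar{\mathbf{x}}^\top\mathbf{B}^\top\mathbf{A}_n^{-1}(z)\mathbf{B}\bar{\mathbf{x}}}{1 - \bar{\mathbf{x}}^\top\mathbf{B}^\top\mathbf{A}_n^{-1}(z)\mathbf{B}\bar{\mathbf{x}}} = \bar{\mathbf{x}}^\top\mathbf{B}^\top(\tilde{\mathbf{S}}_n - z\mathbf{I}_p)^{-1}\mathbf{B}\bar{\mathbf{x}}, \tag{3.5}$$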