ON THE TESTING OF MULTIPLE HYPOTHESIS IN
SLICED INVERSE REGRESSION
By Zhigen Zhao and Xin Xing
We consider the multiple testing problem in the general regression framework, aiming at studying the relationship between a univariate response and a p-dimensional predictor. To test the hypothesis on the effect of each predictor, we construct an Angular Balanced Statistic (ABS) based on the estimator of the sliced inverse regression, without assuming a model for the conditional distribution of the response. Based on the limiting distribution results developed in this paper, we show that ABS is asymptotically symmetric with respect to zero under the null hypothesis. We then propose a Model-free multiple Testing procedure using Angular balanced statistics (MTA) and show theoretically that the false discovery rate of this method is asymptotically less than or equal to a designated level. Numerical evidence shows that the MTA method is much more powerful than its alternatives, subject to the control of the false discovery rate.
Keywords: model-free, FDR, sufficient dimension reduction
1. Introduction. In the general framework of regression analysis, the goal is to infer the relation between a univariate response variable $y$ and a $p \times 1$ vector $x$. One would like to know $y \mid x$, namely, how the distribution of $y$ depends on the value of $x$. In the literature on sufficient dimension reduction [25, 8, 9, 28, 29], the fundamental idea is to replace the predictor by its projection onto a subspace without loss of information. In other words, we seek a subspace $\mathcal{S}_{y|x}$ of the predictor space such that

(1)  $y \perp\!\!\!\perp x \mid P_{\mathcal{S}_{y|x}} x$.

Here $\perp\!\!\!\perp$ indicates independence, and $P_{(\cdot)}$ stands for a projection operator. The subspace $\mathcal{S}_{y|x}$ is called the central subspace. Let $d$ be the dimension of this central subspace. Let $B$, a $p \times d$ matrix, be a basis of the central subspace $\mathcal{S}_{y|x}$. Then equation (1) is equivalent to

(2)  $y \perp\!\!\!\perp x \mid B^{\tau} x$.
Zhigen Zhao is Associate Professor in the Department of Statistics, Operations, and Data Science, Temple University.
Xin Xing is Assistant Professor in the Department of Statistics, Virginia Tech University.
To further reduce the dimensionality, especially when the number of predictors $p$ diverges with respect to $n$, it is commonly assumed that $y$ depends on $x$ through a subset of $x$, known as the Markov blanket and denoted as $\mathcal{MB}(y, x)$ [33, 36, 5], such that

$y \perp\!\!\!\perp x \mid \mathcal{MB}(y, x)$.

For each predictor, one would like to know whether $x_j \in \mathcal{MB}(y, x)$, which can be formulated as a multiple testing problem. The null hypothesis stating that $x_j \notin \mathcal{MB}(y, x)$ is equivalent to

(3)  $H_j \colon P_{\mathrm{span}(x_j)}(\mathcal{S}_{y|x}) = O_p$,

where $O_p$ is the origin of the $p$-dimensional space [10]. In other words, it is equivalent to saying that the $j$-th row of the matrix $B$ consists of all zeros.
There have been many attempts at estimating the central subspace in the existing literature on sufficient dimension reduction. The most widely used method is the sliced inverse regression (SIR), first introduced in [25]. Later, many extensions of SIR were proposed, including, but not limited to, sliced average variance estimation [14, 8], directional regression [24], and constructive estimation [41]. Nevertheless, most of the existing methods and theories in sufficient dimension reduction focus on the estimation of the central subspace $\mathcal{S}_{y|x}$. Results on statistical inference are very limited when $p$ diverges, not to mention procedures for controlling the false discovery rate (FDR) when testing these hypotheses simultaneously.
The challenge arises from two perspectives. First, in the literature on sufficient dimension reduction, results on the limiting distribution of the estimator of the central subspace are very limited when $p$ diverges. When $p$ is fixed, [20, 46, 30, 23] have derived the asymptotic distribution of the sliced inverse regression. To the best of the authors' knowledge, there are no results on the limiting distribution when $p$ diverges unless one assumes that the signal is strong and the total number of false hypotheses is fixed [40].

Second, after the test statistic is determined for each hypothesis, it is challenging to combine these test statistics to derive a method that controls the false discovery rate. Many existing procedures, such as [2, 3, 42, 35], work on (generalized) linear regression models. In [5], the authors considered an arbitrary joint distribution of $y$ and $x$ and proposed the model-X knockoff to control the FDR. However, this method requires that the distribution of the design matrix be known, which is not feasible in many practical settings.
The study conducted by [21] explored variable selection for the linear regression model under the condition of weak and rare signals. It is noted in that paper that selection consistency is not possible and that allowing for false positives is necessary. While several existing penalization-based methods in sufficient dimension reduction require imposing a uniform signal strength condition to achieve consistent results ([27, 29, 38, 44]), this work tackles a more challenging scenario. Specifically, we develop the central limit theorem of SIR, utilizing recent theories on Gaussian approximation [6, 7], without relying on uniform signal strength conditions. This theoretical result is the first of its kind in the literature on sufficient dimension reduction and is a necessity for simultaneous inference when the effects of some relevant predictors are either moderate or weak.
We proceed by constructing a statistic for each hypothesis based on the sliced inverse regression. Applying Gaussian approximation theory, we demonstrate that this statistic is asymptotically symmetric about zero when the null hypothesis holds. We refer to it as an Angular Balanced Statistic (ABS). We then develop a single-step procedure that rejects a hypothesis when its ABS exceeds a certain threshold. Additionally, we provide an estimator of the false discovery proportion. For a designated FDR level $q$, we adaptively select a threshold such that the estimated false discovery proportion is no greater than $q$. This method is referred to as the Model-free multiple Testing procedure using Angular balanced statistics (MTA). Theoretical analysis confirms that MTA asymptotically controls the FDR at level $q$ under regularity conditions. Simulation results and data analysis demonstrate that MTA significantly outperforms its competitors in terms of power while controlling the FDR.
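To fix ideas before the formal development, the following is a minimal sketch of the adaptive thresholding step, assuming the standard false discovery proportion estimate for sign-symmetric statistics, $\#\{j: W_j \le -t\}/\max(\#\{j: W_j \ge t\}, 1)$. The function name `mta_threshold` and this particular estimator are illustrative assumptions; the exact form used by MTA is developed in Section 3.

```python
import numpy as np

def mta_threshold(W, q=0.1):
    """Adaptive threshold for sign-symmetric statistics (illustrative).

    W : array of per-hypothesis statistics that are (asymptotically)
        symmetric about zero under the null, with large positive
        values indicating evidence against the null.
    q : designated FDR level.

    Returns the selected threshold and the indices of the rejected
    hypotheses.
    """
    candidates = np.sort(np.abs(W[W != 0]))
    for t in candidates:  # smallest t whose estimated FDP is <= q
        fdp_hat = np.sum(W <= -t) / max(np.sum(W >= t), 1)
        if fdp_hat <= q:
            return t, np.where(W >= t)[0]
    return np.inf, np.array([], dtype=int)
```

Because the statistics are asymptotically symmetric about zero under the null, the count of statistics below $-t$ serves as a conservative proxy for the number of null statistics above $t$.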
The paper is organized as follows. In Section 2, we derive the central limit theorem of SIR when both the dimension $p$ and the number of important predictors $s$ diverge with $n$, using the recently developed Gaussian approximation theory. In Section 3, we construct ABS based on the estimator of SIR and propose the MTA method. It is shown that the FDR of the MTA method is asymptotically less than or equal to a designated level. In Sections 4 and 5, we provide numerical evidence, including extensive simulations and a real data analysis, to demonstrate the advantages of MTA. We conclude the paper in Section 6 and include all technical details in the appendix.
Notation. We adopt the following notation throughout this paper. For a matrix $A$, we call the space generated by its column vectors the column space and denote it by $\mathrm{col}(A)$. The element at the $i$-th row and $j$-th column of a matrix $A$ is denoted as $A_{ij}$ or $a_{ij}$. The $i$-th row and $j$-th column of the matrix are denoted by $A_{i\cdot}$ and $A_{\cdot j}$, respectively. The minimum and maximum eigenvalues of $A$ are denoted as $\lambda_{\min}(A)$ and $\lambda_{\max}(A)$, respectively. For two positive numbers $a, b$, we use $a \vee b$ and $a \wedge b$ to denote $\max\{a, b\}$ and $\min\{a, b\}$, respectively. We use $c$, $C$, $C'$, $C_1$, and $C_2$ to denote generic absolute constants, whose actual values may vary from case to case.
2. Gaussian Approximation of SIR. Recall that $y$ is the response and $x$ is a $p$-dimensional vector. In the literature on sufficient dimension reduction, one aims to find the central subspace $\mathcal{S}_{y|x}$ defined in (1). The sliced inverse regression (SIR), introduced in [25], is the first and the most popular method among many existing ones.

Assume that the covariance matrix of $x$ is $\Sigma$. Let $\Omega = \Sigma^{-1}$ be the precision matrix of $x$. Let $s_j = \#\{k : \Omega_{jk} \neq 0\}$ and $s = \max_j s_j$. When the distribution of $x$ is elliptically symmetric, it is shown in [25] that

(4)  $\Sigma\,\mathcal{S}_{y|x} = \mathrm{col}(\Lambda)$,

where $\Lambda = \mathrm{var}(E[x \mid y])$ and $\mathrm{col}(\Lambda)$ is the column space spanned by $\Lambda$.
Given $n$ i.i.d. samples $(y_i, x_i)$, $i = 1, \cdots, n$, we estimate $\Lambda$ by dividing the data into $H$ slices according to the order statistics $y_{(i)}$, $i = 1, \ldots, n$. Let $x_{(i)}$ be the concomitant associated with $y_{(i)}$. Note that slicing the data naturally forms a partition of the support of the response variable, denoted as $\mathcal{H}$. Let $\mathcal{P}_h$ be the $h$-th slice in the partition $\mathcal{H}$. Here we let $\mathcal{P}_1 = (-\infty, y_{(\lceil n/H \rceil)}]$ and $\mathcal{P}_H = (y_{(\lceil n/H \rceil (H-1)+1)}, +\infty)$. Let $\bar{x}$ be the mean of all the $x$'s and $\bar{x}_{h,\cdot}$ be the sample mean of the vectors $x_{(j)}$ whose concomitants $y_{(j)} \in \mathcal{P}_h$, and estimate $\Lambda = \mathrm{var}(E[x \mid y])$ by

(5)  $\hat{\Lambda}_H = \frac{1}{H} \sum_{h=1}^{H} (\bar{x}_{h,\cdot} - \bar{x})(\bar{x}_{h,\cdot} - \bar{x})^{\tau}$.

The estimator $\hat{\Lambda}_H$ was shown to be consistent for $\Lambda$ under some technical conditions [18, 20, 46, 25, 28].
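For concreteness, the following is a minimal sketch of the slice-mean estimator (5); the function name `sir_lambda_hat` and the equal-size slicing of the sorted responses are our own illustrative choices.

```python
import numpy as np

def sir_lambda_hat(x, y, H=10):
    """Slice-mean SIR estimator of Lambda = var(E[x|y]), as in (5).

    x : (n, p) array of predictors; y : (n,) array of responses.
    The data are sorted by y and cut into H slices of (roughly)
    equal size; Lambda_hat is the average outer product of the
    centered slice means.
    """
    n, p = x.shape
    order = np.argsort(y)          # order statistics y_(1) <= ... <= y_(n)
    x_sorted = x[order]            # concomitants x_(i)
    x_bar = x.mean(axis=0)         # overall mean of the x's
    lambda_hat = np.zeros((p, p))
    for block in np.array_split(x_sorted, H):   # the H slices P_1, ..., P_H
        diff = block.mean(axis=0) - x_bar       # centered slice mean
        lambda_hat += np.outer(diff, diff)
    return lambda_hat / H
```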
Alternatively, we could view SIR through a sequence of ordinary least squares regressions. Let $f_h(y)$, $h = 1, 2, \cdots, H$, be a sequence of transformations of $y$. Following the proof of [43, 40], one knows that under the linearity condition [25],

$E[f_h(y)\,\phi(y)] \in \mathcal{S}_{y|x}$,

where $\phi(y) = \Sigma^{-1} E(x \mid y)$. Let $\beta_h \in \mathbb{R}^{p \times 1}$ be defined as

$\beta_h = \operatorname{argmin}_{\beta_h} E\big(f_h(y) - x^{\tau}\beta_h\big)^2.$

Assuming the coverage condition [32, 13, 40], then

$\mathrm{Span}(B) = \mathcal{S}_{y|x}$,

where $B = (\beta_1, \cdots, \beta_H) \in \mathbb{R}^{p \times H}$.

Note that different choices of $f_h(y)$ lead to different methods [40, 17]. To name a few, [43] suggested $f_h(y) = y^h$ with $h \leq H$. After slicing the data into $H$ slices according to the value of the response variable $y$, [13] suggested $f_h(y) = y$ if $y$ is in the $h$-th slice and $0$ otherwise. If we choose $f_h(y) = \mathbf{1}(y \in \mathcal{P}_h)$, this leads to SIR, which is the main focus of this paper [43, 40].
After obtaining data $(x_i, y_i)$ based on a sample of $n$ subjects, let

$\mathbf{f}_h(\mathbf{y}) = \mathbf{1}(\mathbf{y} \in \mathcal{P}_h) = \big(\mathbf{1}(y_1 \in \mathcal{P}_h), \mathbf{1}(y_2 \in \mathcal{P}_h), \cdots, \mathbf{1}(y_n \in \mathcal{P}_h)\big)^{T}.$

Let $\hat{\beta}_h$ be defined as

$\hat{\beta}_h = \operatorname{argmin}_{\beta_h} \|\mathbf{f}_h(\mathbf{y}) - \mathbf{x}^{T}\beta_h\|^2 = (\mathbf{x}\mathbf{x}^{T})^{-1}\mathbf{x}\,\mathbf{f}_h(\mathbf{y}),$

or, in a general form,

(6)  $\hat{\beta}_h = \operatorname{argmin}_{\beta_h} \|\mathbf{f}_h(\mathbf{y}) - \mathbf{x}^{T}\beta_h\|^2 = \frac{1}{n}\,\hat{\Omega}\,\mathbf{x}\,\mathbf{f}_h(\mathbf{y}),$

where $\hat{\Omega}$ is a suitable approximation of the inverse of the Gram matrix $\hat{\Sigma} = \mathbf{x}\mathbf{x}^{T}/n$. Let

(7)  $\hat{B} = (\hat{\beta}_1, \cdots, \hat{\beta}_H).$
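A minimal sketch of the construction in (6) and (7) follows, assuming for illustration that $\hat{\Omega}$ is a ridge-regularized inverse of the Gram matrix; in high dimensions a sparse precision estimator, such as the node-wise Lasso discussed next, would be used instead. The function name `sir_beta_hat` and the ridge parameter are our own choices.

```python
import numpy as np

def sir_beta_hat(x, y, H=10, ridge=1e-6):
    """SIR coefficients via slicewise least squares, as in (6)-(7).

    x : (p, n) centered design matrix; y : (n,) responses.
    Returns B_hat of shape (p, H) whose h-th column is
    (1/n) * Omega_hat @ x @ f_h(y), where f_h(y) is the indicator
    vector of the h-th slice.  Omega_hat is a ridge-regularized
    inverse Gram matrix purely for illustration.
    """
    p, n = x.shape
    # equal-size slices P_1, ..., P_H from the sorted responses
    slices = np.array_split(np.argsort(y), H)
    F = np.zeros((n, H))
    for h, idx in enumerate(slices):
        F[idx, h] = 1.0                    # f_h(y_i) = 1(y_i in P_h)
    gram = x @ x.T / n                     # Sigma_hat
    omega_hat = np.linalg.inv(gram + ridge * np.eye(p))
    return omega_hat @ x @ F / n           # B_hat = (beta_1, ..., beta_H)
```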
There are many methods to estimate the precision matrix. As an example, we could consider the one given by the Lasso for the node-wise regression on the design matrix $\mathbf{x}$ [31]. Next, we derive the central limit theorem of SIR when $p$ diverges to infinity.
Our derivation is built upon the Gaussian approximation (GAR) theory recently developed in [6, 7]. Let $\{\mathcal{P}_h\}_{h=1}^{H}$ be a partition of the sample space of $y$ and let $p_h = P(y \in \mathcal{P}_h)$. Define

(8)  $\tilde{\beta}_h = \frac{1}{n}\,\Omega\,\mathbf{x}\,\mathbf{f}_h(\mathbf{y}).$

Let $\tilde{B} = (\tilde{\beta}_1, \cdots, \tilde{\beta}_H)$. For $i = 1, 2, \cdots, n$, $j = 1, 2, \cdots, p$, and $h = 1, 2, \cdots, H$, let the $z_{ijh}$'s be normal random variables such that

$E z_{ijh} = p_h\, \Omega_{\cdot j}^{T} E(x \mid y \in \mathcal{P}_h);$

$V(z_{ijh}) = p_h\, \Omega_{\cdot j}^{T} E(x x^{T} \mid y \in \mathcal{P}_h)\,\Omega_{\cdot j} - p_h^2 \big(\Omega_{\cdot j}^{T} E(x \mid y \in \mathcal{P}_h)\big)^2;$

$\mathrm{Cov}(z_{ijh}, z_{ikh}) = p_h\, \Omega_{\cdot j}^{T} E(x x^{T} \mid y \in \mathcal{P}_h)\,\Omega_{\cdot k} - p_h^2\, \Omega_{\cdot j}^{T} E(x \mid y \in \mathcal{P}_h)\; \Omega_{\cdot k}^{T} E(x \mid y \in \mathcal{P}_h);$

$\mathrm{Cov}(z_{ijh_1}, z_{ikh_2}) = -p_{h_1} p_{h_2}\, \Omega_{\cdot j}^{T} E(x \mid y \in \mathcal{P}_{h_1})\; \Omega_{\cdot k}^{T} E(x \mid y \in \mathcal{P}_{h_2}), \quad h_1 \neq h_2.$
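As an illustration of these moment formulas, the sketch below computes their plug-in analogs from data, treating $\Omega$ as known and estimating $p_h$, $E(x \mid y \in \mathcal{P}_h)$, and $E(xx^{T} \mid y \in \mathcal{P}_h)$ by slice frequencies and slice moments. The function `z_moments` and the empirical plug-in are our own illustrative choices, not part of the paper's procedure.

```python
import numpy as np

def z_moments(x, y, omega, H=10):
    """Plug-in (empirical) version of the z_{ijh} moments above.

    x : (n, p) predictors; y : (n,) responses; omega : (p, p)
    precision matrix, treated as known purely for illustration.
    Returns mean_z[h, j] and cov_z[h1, j, h2, k]; feasible only for
    small p and H since the covariance is a (pH) x (pH) object.
    """
    n, p = x.shape
    slices = np.array_split(np.argsort(y), H)        # equal-size slices
    ph = np.array([len(idx) / n for idx in slices])  # p_h = P(y in P_h)
    # a[h, j] = Omega_{.j}^T E(x | y in P_h); Q[h] = Omega^T E(xx^T | P_h) Omega
    a = np.stack([omega.T @ x[idx].mean(axis=0) for idx in slices])
    Q = np.stack([omega.T @ (x[idx].T @ x[idx] / len(idx)) @ omega
                  for idx in slices])
    mean_z = ph[:, None] * a
    # Cov = delta_{h1 h2} p_h Q_h[j, k] - p_{h1} p_{h2} a[h1, j] a[h2, k]
    cov_z = -np.einsum('g,h,gj,hk->gjhk', ph, ph, a, a)
    for h in range(H):
        cov_z[h, :, h, :] += ph[h] * Q[h]
    return mean_z, cov_z
```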