PARTIAL IDENTIFICATION WITH PROXY OF LATENT CONFOUNDINGS VIA SUM-OF-RATIOS FRACTIONAL PROGRAMMING

Zhiheng Zhang
Institute for Interdisciplinary Information Sciences, Tsinghua University
zhiheng-20@mails.tsinghua.edu.cn
ABSTRACT
Because confoundings are often unobservable, how to quantify causal effects has been a widespread concern. To address this challenge, proxy-based negative control approaches have been commonly adopted, where an auxiliary outcome variable $W$ is introduced as a proxy of the confoundings $U$. However, these approaches rely on strong assumptions such as reversibility, completeness, or bridge functions. These assumptions lack an intuitive empirical interpretation and a solid verification technique, so their real-world applicability is limited. For instance, these approaches are inapplicable when the transition matrix $P(\mathbf{W}|\mathbf{U})$ is irreversible. In this paper, we focus on a weaker assumption called the partial observability of $P(\mathbf{W}|\mathbf{U})$. We develop a more general single-proxy negative control method called Partial Identification via Sum-of-ratios Fractional Programming (PI-SFP). It is a global optimization algorithm based on the branch-and-bound strategy that aims to provide a valid bound on the causal effect. In simulations, PI-SFP provides promising numerical results and fills in blank spots that the previous literature cannot handle, such as the case where only partial information about $P(\mathbf{W}|\mathbf{U})$ is available.
Keywords Causality; Partial identification; Fractional programming; Branch-and-bound; Average causal effect
arXiv:2210.09885v2 [math.OC] 8 Nov 2022
Contents
1 Introduction
2 Preliminaries
3 A fractional programming framework for partial identification
3.1 Definitions and assumptions
3.2 Objective function
3.3 Valid bound of ACE
4 Algorithm
4.1 Framework of PI-SFP
4.2 Internal functions of PI-SFP
5 Theoretical analysis
6 Simulations
7 Further discussions and extensions
7.1 Discussions on assumptions
7.2 Discussion on algorithms
7.3 Discussion on graphical structure
7.4 Discussions on extensions to the continuous confoundings
8 Conclusions
9 Acknowledgement
A.1 The proof of Proposition 1
A.2 The proof of Proposition 2
A.3 Further discussion on Ass. 2
A.4 The proof of Theorem 1
A.5 Extension to the ACE case
A.6 The proof of further discussions and extensions
1 Introduction
Identifying causal effects from observational data is a fundamental question in economics, social science, and epidemiology [1]. Causal inference is usually challenging in the presence of latent confoundings, which prevent us from extracting useful causal information by directly applying statistical association studies [1]. To adjust for the bias incurred by latent confoundings, one usually needs to rely on auxiliary variables for confounding adjustment. These auxiliary variables mainly include instrumental variables (IV) [2], proximal variables [3, 4], or both [5, 6, 7, 8]. In this paper, we are primarily interested in causal identification with proxies of latent confounders. Fig. 1(a) and Fig. 1(b) are examples of such methods: to identify the causal effect of $X$ on $Y$, we assume there are additional random variables such as $W$ or $Z$ that are associated with the latent confounders, and we use these variables as “proxies” of the latent confounders for confounding adjustment.
Empirical studies on using proxies for confounding adjustment have a long history. The earliest can be traced back to [9], in which the authors analyzed the potential benefit of using proxies as an alternative to latent confoundings in least squares estimation. This was further applied in observational studies such as [10, 11]. Other empirical studies include [12, 13]. On the theoretical side, existing research can be split into two main categories. The first is the “single-proxy scenario”, where we assume there is only a single proxy variable (Figure 1(a)); the second is the “double-proxy scenario”, where we have access to data from at least two proxy variables (Figures 1(b), 1(c)).
Figure 1(a) illustrates the causal diagram considered in the single-proxy scenario. When both $W$ and $U$ are discrete random variables with a finite number of values, the state-of-the-art research includes [14]. More specifically, they prove that the true causal mechanism $p(y|do(x))$ is identifiable when the probability transition matrix $P(\mathbf{W}|\mathbf{U})$ is fully observable and invertible.
When $P(\mathbf{W}|\mathbf{U})$ is not observable, Pearl further considered the double-proxy cases extended from [15], where the exposure proxy control $Z$ and the outcome proxy control $W$ both exist. With the aid of $Z$, the reversibility of $P(\mathbf{W}|\mathbf{U})$ is strengthened to that of $P(\mathbf{Z},\mathbf{W}|x)$ and $P(y,\mathbf{Z},\mathbf{W}|x)$. This was further extended to a new topic, 'negative control' [16, 17, 4, 18, 6, 7, 19, 8]. In all, these works are cursed by certain reversibility or completeness assumptions or their weaker forms.
The double-proxy setting requires observing a new auxiliary variable, which may not be practical in real applications, so in this paper we revisit the single-proxy case. In summary, excessively strong conditions on $P(\mathbf{W}|\mathbf{U})$ have so far been imposed in order to obtain the exact value of the average causal effect (ACE). In this paper, we propose new algorithms to identify a bound on the causal effect when the probability transition matrix $P(\mathbf{W}|\mathbf{U})$ is only partially observable. Moreover, our method does not require $P(\mathbf{W}|\mathbf{U})$ to be invertible for the desired identification guarantee. Our algorithm is a fractional-programming-based approach that seeks to learn a bound on the causal effect by solving a constrained fractional program.
[Figure 1: three causal diagrams. Panel (a) has nodes X, U, Y, W; panels (b) and (c) have nodes X, Z, U, Y, W.]
FIGURE 1: Estimating ACE with confoundings via (a) single control or (b)(c) double control. The nodes denote: Z, W proxies; X treatment; Y outcome; and U unobserved confoundings.
Motivated by this, we focus on the single-proxy case and attempt to weaken the condition of 'total precise observability' of $P(\mathbf{W}|\mathbf{U})$ to 'partial observability'. That is, for each $\dim(W)$-dimensional vector $P(\mathbf{W}|U=u)$, we only assume that it lies in a particular subregion of the $\dim(W)$-dimensional space instead of at a fixed point. More importantly, the completeness/reversibility conditions of previous proxy control methods are not required. On this basis, we formulate the task as a constrained fractional programming problem and develop new optimization approaches to solve it. This differs from traditional fractional programming methods [20], since the strong concavity assumptions on which they rely do not hold here. To summarize, compared with the previous literature, we pursue the partial identification of the ACE, rather than its unique closed form, under a weaker assumption.
The paper is organized as follows. In Section 1, we introduce the basic knowledge of partial identification. In Section 2, we review the construction of the ACE and the evolution of the relevant hypotheses in the previous literature. To address their shortcomings, we propose a new hypothesis that is more intuitive, applicable, and verifiable. In Section 3, we formulate the estimation of the ACE as a sum-of-ratios fractional programming problem; in this formulation, we explicitly construct the objective function and the identification region of the solutions. In Section 4, we solve the problem in practice with a branch-and-bound strategy. In Section 5, we focus on the theoretical global convergence property of our algorithm. In Section 6, we run simulations to show the effectiveness of our algorithm. Finally, in Section 7, we provide several further topics and discussions to illustrate the generalizability and scalability of our approach.
2 Preliminaries
The causal effect is strongly related to the 'do' operator [21, 22], which can be seen as an external intervention. Specifically, the causal effect of treatment $X$ on outcome $Y$ in Fig. 1 is denoted as $f(y|do(x))$, where the symbol $do(x)$ represents that the treatment $X$ is forced to a fixed value $x$, and $f(\cdot)$ denotes the probability mass/density function for discrete/continuous variables. According to the back-door criterion [21], $f(y|do(x))$ is identified as follows:

$$f(y|do(x)) = \sum_{i=1}^{\dim(U)} f(y|u_i, x)\, f(u_i) = f(y, x) + \sum_{i=1}^{\dim(U)} \frac{f(y, u_i, X=x)\, f(u_i, X \neq x)}{f(u_i, X=x)}, \qquad (1)$$
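where $\dim(\cdot)$ denotes the dimension of the variables. The second equality follows from $f(y|u_i, x) = f(y, u_i, X=x)/f(u_i, X=x)$ together with $f(u_i) = f(u_i, X=x) + f(u_i, X \neq x)$. In [3], the transition matrix $P(\mathbf{W}|\mathbf{U})$ is assumed to be totally observable and reversible. Under this assumption, $f(y|do(x))$ is identifiable; namely, the value of each term in Eqn. (1) can be explicitly extracted as follows (see footnote 1):

$$\begin{bmatrix} f(y, \mathbf{U}, X=x) & f(\mathbf{U}, X=x) & f(\mathbf{U}, X \neq x) \end{bmatrix} = P(\mathbf{W}|\mathbf{U})^{-1} \begin{bmatrix} f(y, \mathbf{W}, X=x) & f(\mathbf{W}, X=x) & f(\mathbf{W}, X \neq x) \end{bmatrix}. \qquad (2)$$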
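As a numerical sanity check of Eqns. (1) and (2), the following minimal sketch builds a hypothetical discrete model and confirms that the two forms of Eqn. (1) agree and that Eqn. (2) recovers the same value when $P(\mathbf{W}|\mathbf{U})$ is known and invertible. The joint distribution, the dimensions, and all variable names are illustrative assumptions and are not taken from the paper; the sketch also assumes that $W$ is independent of $(X, Y)$ given $U$, as in Fig. 1(a).

```python
import numpy as np

rng = np.random.default_rng(0)
d_u, d_w = 3, 3      # dim(U), dim(W); square so that P(W|U) can be inverted
x, y = 1, 1          # treatment and outcome values of interest (binary X, Y)

# Hypothetical ground truth: joint f(u, x, y) and proxy mechanism f(w | u).
f_uxy = rng.random((d_u, 2, 2))
f_uxy /= f_uxy.sum()
P_wu = rng.random((d_w, d_u)) + 2.0 * np.eye(d_w)   # column i holds f(W | u_i)
P_wu /= P_wu.sum(axis=0, keepdims=True)             # diagonally dominant, hence invertible

f_u = f_uxy.sum(axis=(1, 2))          # f(u_i)
f_ux = f_uxy[:, x, :].sum(axis=1)     # f(u_i, X = x)
f_yux = f_uxy[:, x, y]                # f(y, u_i, X = x)
f_unx = f_u - f_ux                    # f(u_i, X != x)

# First form of Eqn. (1): back-door adjustment.
backdoor = np.sum(f_yux / f_ux * f_u)
# Second form of Eqn. (1): f(y, x) plus the sum of ratios.
decomposed = f_yux.sum() + np.sum(f_yux * f_unx / f_ux)

# Observable W-quantities, then Eqn. (2): recover the U-quantities by inversion.
observed = P_wu @ np.column_stack([f_yux, f_ux, f_unx])
rec = np.linalg.solve(P_wu, observed)
recovered = rec[:, 0].sum() + np.sum(rec[:, 0] * rec[:, 2] / rec[:, 1])

print(backdoor, decomposed, recovered)
```

The three printed values should coincide up to floating-point error. The recovery step relies on solving a linear system with $P(\mathbf{W}|\mathbf{U})$, which is exactly what becomes unavailable once the matrix is only partially known or not invertible.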
Footnote 1: For convenience, we use bold letters to denote the column vector formed by all the corresponding possible values. For instance, $f(y, \mathbf{U}, X=x) = [f(y, u_1, X=x), f(y, u_2, X=x), \ldots, f(y, u_{\dim(U)}, X=x)]^T$. Moreover, if there are two bold letters in a symbol such as $P(\mathbf{W}|\mathbf{U})$, it denotes the matrix $[f(\mathbf{W}|u_1), f(\mathbf{W}|u_2), \ldots, f(\mathbf{W}|u_{\dim(U)})]$, where $f(\mathbf{W}|u_i) = [f(w_1|u_i), f(w_2|u_i), \ldots, f(w_{\dim(W)}|u_i)]^T$, $i = 1, 2, \ldots, \dim(U)$.

Our paper generalizes this result. We consider the partial identification of $f(y|do(x))$ instead of the computation of its unique value, because we weaken the assumption on $P(\mathbf{W}|\mathbf{U})$. Compared with [3], we relax its total observability to partial observability and drop the guarantee of its reversibility (thus $P(\mathbf{W}|\mathbf{U})^{-1}$ in Eqn. (2) may not exist). Specifically, we extend the identification region of $P(\mathbf{W}|\mathbf{U})$ from a fixed distribution to the family $\mathcal{P}$, such that:

$$\mathcal{P} = \{ P(\mathbf{W}|\mathbf{U}) : \langle P(\mathbf{W}|\mathbf{U}) - \underline{P}(\mathbf{W}|\mathbf{U}),\ \overline{P}(\mathbf{W}|\mathbf{U}) - P(\mathbf{W}|\mathbf{U}) \rangle \text{ is non-negative} \}, \qquad (3)$$
where $\underline{P}(\mathbf{W}|\mathbf{U})$ and $\overline{P}(\mathbf{W}|\mathbf{U})$ are two matrices, known a priori, that bound $P(\mathbf{W}|\mathbf{U})$. This is a common scenario in the real world. Although [3, 23] have already generally corroborated that this partial observability is verifiable, it has not been fully discussed in the recent literature. In the rest of the paper, we refer to the condition $P(\mathbf{W}|\mathbf{U}) \in \mathcal{P}$ as the 'partial observability assumption'. Under this assumption, it is natural to set up our original goal, seeking the lower bound of $f(y|do(x))$ (the upper bound is symmetric), by solving the following partial identification problem:

$$f(y, X=x) + \min_{f(y,\mathbf{W},\mathbf{U},\mathbf{X}) \in \mathcal{F}} \Big\backslash \max_{f(y,\mathbf{W},\mathbf{U},\mathbf{X}) \in \mathcal{F}} \ \sum_{i=1}^{\dim(U)} \frac{f(y, u_i, X=x)\, f(u_i, X \neq x)}{f(u_i, X=x)}. \qquad (4)$$
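Here $f(y, \mathbf{W}, \mathbf{U}, \mathbf{X})$ is a third-order ($\dim(W) \times \dim(U) \times \dim(X)$) tensor giving the joint probability of each $w \in W$, $u \in U$, $x \in X$ together with $Y = y$. The set $\mathcal{F}$ is defined as $\mathcal{F} = \{ f(y, \mathbf{W}, \mathbf{U}, \mathbf{X}) : f(y, \mathbf{W}, \mathbf{U}, \mathbf{X}) \text{ is compatible with } P(\mathbf{W}|\mathbf{U}) \in \mathcal{P} \}$.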
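To make the partial identification problem concrete, here is a small brute-force sketch for binary $U$ and $W$. It scans a grid of candidate matrices inside assumed elementwise bounds, keeps the candidates that yield a valid probability table, and records the range of implied causal effects. This is only a grid search over invertible candidates, so it gives an inner approximation of the interval; it is not the PI-SFP algorithm and carries no validity guarantee. All numbers, the bounds, and the helper `effect_given` are hypothetical.

```python
import numpy as np

x, y = 1, 1  # hypothetical treatment / outcome values of interest

# Assumed ground truth, used only to simulate what an analyst would observe.
f_uxy = np.array([[[0.20, 0.10], [0.05, 0.15]],
                  [[0.05, 0.10], [0.10, 0.25]]])   # f(u, x, y), sums to 1
P_true = np.array([[0.8, 0.3],
                   [0.2, 0.7]])                    # column i is f(W | u_i)

f_yux = f_uxy[:, x, y]                   # f(y, u_i, X = x)
f_ux = f_uxy[:, x, :].sum(axis=1)        # f(u_i, X = x)
f_unx = f_uxy[:, 1 - x, :].sum(axis=1)   # f(u_i, X != x)
truth = f_yux.sum() + np.sum(f_yux * f_unx / f_ux)

# Observable quantities f(y, W, X=x), f(W, X=x), f(W, X!=x).
obs = P_true @ np.column_stack([f_yux, f_ux, f_unx])

def effect_given(P, obs, eps=1e-12):
    """Return f(y|do(x)) implied by a candidate P(W|U), or None if incompatible."""
    rec = np.linalg.solve(P, obs)                  # inversion as in Eqn. (2)
    a, c, b = rec[:, 0], rec[:, 1], rec[:, 2]
    if rec.min() < -eps or np.any(c <= eps) or np.any(a > c + eps):
        return None                                # not a valid probability table
    return float(a.sum() + np.sum(a * b / c))

# Partial observability: only bounds on f(w_1|u_1) and f(w_1|u_2) are assumed known.
values = []
for p1 in np.linspace(0.7, 0.9, 21):
    for p2 in np.linspace(0.2, 0.4, 21):
        P = np.array([[p1, p2], [1.0 - p1, 1.0 - p2]])
        v = effect_given(P, obs)
        if v is not None:
            values.append(v)

lo, hi = min(values), max(values)
print(lo, truth, hi)
```

Because the true matrix lies on the grid, the true effect falls inside the reported range. A finer grid can only widen the range, which is one reason a global optimization method is needed to certify a bound.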
Achieving this goal faces two challenges. First, a tight bound is hard to achieve, because the feasible region $\mathcal{F}$ is difficult to represent in closed form. Its boundary constraints contain the partially observable $P(\mathbf{W}|\mathbf{U})$, which leads to a well-known inverse problem, the first-kind Fredholm integral equation (see footnote 2) [24, 25], in the discrete case. The problem is ill-posed when $P(\mathbf{W}|\mathbf{U})$ is irreversible, and the closed-form expression of $\mathcal{F}$ can only be approximated iteratively by complex numerical methods [26]. For this reason, we relax the feasible region from $\mathcal{F}$ to $\tilde{\mathcal{F}}$ ($\mathcal{F} \subseteq \tilde{\mathcal{F}}$), which admits a closed-form expression. Specifically, the relaxed condition $f(y, \mathbf{W}, \mathbf{U}, \mathbf{X}) \in \tilde{\mathcal{F}}$ keeps the feasible regions of $f(y, \mathbf{U}, X=x)$, $f(\mathbf{U}, X=x)$, and $f(\mathbf{U}, X \neq x)$ in a computable closed form; these are denoted as $IR_{F(y,\mathbf{U},X=x)}$, $IR_{F(\mathbf{U},X=x)}$, and $IR_{F(\mathbf{U},X \neq x)}$ respectively in our final objective function.
Second, even if we retreat to seeking a valid bound as above, the problem is still non-trivial, because a corresponding optimization method is hard to find. As the causal effect is expressed as a sum of fractions, we naturally resort to techniques from sum-of-ratios fractional programming (SFP). The general form of SFP summarized in [27] is:

$$\min \Big\{ \sum_{i=1}^{M} \frac{g_{1i}(\varphi)}{g_{2i}(\varphi)} \Big\}, \quad \varphi \in S, \quad g_{1i}(\varphi) \text{ is convex}, \quad g_{2i}(\varphi) \text{ is concave}, \quad g_{1i}(\varphi) \geq 0, \quad g_{2i}(\varphi) > 0, \qquad (5)$$
where $S$ is a convex set and $M \geq 2$ is an integer. To ensure that optimal solutions are global, $g_{1i}(\varphi)$ and $g_{2i}(\varphi)$ are assumed to be convex and concave, respectively. To match Formulation (4), we should choose

$$M = \dim(U), \quad \varphi = \big( (f(y, u_i, X=x), \ldots)^T, (f(u_i, X=x), \ldots)^T, (f(u_i, X \neq x), \ldots)^T \big),$$
$$g_{1i}(\varphi) = f(y, u_i, X=x)\, f(u_i, X \neq x), \quad g_{2i}(\varphi) = f(u_i, X=x), \quad i = 1, 2, \ldots, \dim(U). \qquad (6)$$

However, our construction violates the traditional convex-concave assumption: $g_{1i}(\varphi)$ is a product of two decision variables and is therefore not convex. Thus the previous SFP algorithms [27] do not apply.
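The non-convexity of the numerator in Eqn. (6) can be checked directly. The sketch below, with illustrative variable names, treats $g_{1i}$ as the bilinear map $(a, b) \mapsto ab$, inspects the eigenvalues of its constant Hessian, and exhibits one pair of points that violates the midpoint inequality required of a convex function.

```python
import numpy as np

def g1(a, b):
    """Numerator of Eqn. (6): a = f(y, u_i, X=x), b = f(u_i, X!=x)."""
    return a * b

# The Hessian of (a, b) -> a * b is constant.
hessian = np.array([[0.0, 1.0],
                    [1.0, 0.0]])
eigenvalues = np.linalg.eigvalsh(hessian)   # returned in ascending order

# Midpoint test: convexity would require g1(mid) <= average of the endpoint values.
p = np.array([0.1, 0.4])
q = np.array([0.4, 0.1])
mid = 0.5 * (p + q)
value_at_mid = g1(*mid)
average_of_ends = 0.5 * (g1(*p) + g1(*q))

print(eigenvalues, value_at_mid, average_of_ends)
```

One eigenvalue is negative, so the Hessian is indefinite, and the value at the midpoint exceeds the average of the endpoint values. Either observation is enough to rule out convexity of $g_{1i}$.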
To handle this case, we design an algorithm called Partial Identification via Sum-of-ratios Fractional Programming (PI-SFP). This algorithm is motivated by the branch-and-bound strategy [28, 29] and DC programming [30, 31, 32]; namely, we iteratively search for the optimal bound by partitioning the feasible region. We also provide a complete convergence analysis. To our knowledge, our paper is a new attempt at estimating causal effects via this optimization technique. Moreover, this algorithm also contributes to the existing literature on the convergence analysis of the branch-and-bound strategy [33, 28, 29, 32].
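The idea of searching for a global optimum by partitioning the feasible region can be illustrated on a toy problem. The sketch below is a generic one-dimensional branch-and-bound for a sum of linear ratios, using a simple interval lower bound; it is not the PI-SFP algorithm, and the objective, the coefficients, and the function names are assumptions chosen for illustration. It requires every numerator to be non-negative and every denominator to be positive on the search interval.

```python
import heapq
import numpy as np

def sum_of_ratios(t, num, den):
    """h(t) = sum_i (num[i,0] + num[i,1]*t) / (den[i,0] + den[i,1]*t)."""
    return float(np.sum((num[:, 0] + num[:, 1] * t) / (den[:, 0] + den[:, 1] * t)))

def lower_bound(lo, hi, num, den):
    """Term-wise bound on [lo, hi]: smallest numerator over largest denominator."""
    n_lo = np.minimum(num[:, 0] + num[:, 1] * lo, num[:, 0] + num[:, 1] * hi)
    d_hi = np.maximum(den[:, 0] + den[:, 1] * lo, den[:, 0] + den[:, 1] * hi)
    return float(np.sum(n_lo / d_hi))

def branch_and_bound(lo, hi, num, den, tol=1e-5, max_iter=100_000):
    mid = 0.5 * (lo + hi)
    best_t, best_val = mid, sum_of_ratios(mid, num, den)   # incumbent
    heap = [(lower_bound(lo, hi, num, den), lo, hi)]
    for _ in range(max_iter):
        if not heap:
            break
        lb, a, b = heapq.heappop(heap)       # interval with the smallest lower bound
        if lb >= best_val - tol:
            break                            # no remaining interval can improve by tol
        m = 0.5 * (a + b)
        for c, e in ((a, m), (m, b)):        # branch: bisect the interval
            t = 0.5 * (c + e)
            val = sum_of_ratios(t, num, den)
            if val < best_val:
                best_t, best_val = t, val    # update the incumbent
            child_lb = lower_bound(c, e, num, den)
            if child_lb < best_val - tol:    # bound: keep only promising intervals
                heapq.heappush(heap, (child_lb, c, e))
    return best_t, best_val

# Hypothetical instance: (1 + 10 t) / (3 + t) + (4 - t) / (0.5 + t) on [0, 3].
num = np.array([[1.0, 10.0], [4.0, -1.0]])
den = np.array([[3.0, 1.0], [0.5, 1.0]])
best_t, best_val = branch_and_bound(0.0, 3.0, num, den)
print(best_t, best_val)
```

When the loop stops before `max_iter`, every discarded interval had a lower bound within `tol` of the incumbent, so the returned value is within `tol` of the global minimum on the interval. The multi-dimensional, constrained setting treated in the paper needs more careful bounding functions than this interval bound.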
Precisely because of these two challenges in the partial observability case, the recent literature has avoided further discussion of the observability of $P(\mathbf{W}|\mathbf{U})$. Instead, it introduces another auxiliary variable $Z$ and formalizes the problem as double negative control [16, 17, 4, 18, 6, 7, 19, 8]. However, as shown in Table 1, there is no free lunch. These works are also restricted by additional assumptions about $Z$, such as the completeness condition, the bridge function condition, etc. Importantly, these works are still all based on the reversibility of $P(\mathbf{W}|\mathbf{U})$, except for [8], which replaces it with a weaker bridge function condition. Hence, when $P(\mathbf{W}|\mathbf{U})$ is irreversible in the real world (i.e., there is multicollinearity among some rows or columns), these methods are easily invalidated. In fact, difficulties are already encountered in numerical computations if the condition number of $P(\mathbf{W}|\mathbf{U})$ is too large (see footnote 3).

Footnote 2: One of the boundary constraints of $\mathcal{F}$ can be expressed as $f(y, w_j, x) = \sum_{i=1}^{\dim(U)} \big( f(w_j|u_i) \sum_{k=1}^{\dim(W)} f(y, w_k, u_i, x) \big)$, where each $f(w|u)$ is bounded by $\underline{P}(\mathbf{W}|\mathbf{U})$ and $\overline{P}(\mathbf{W}|\mathbf{U})$. It is in the form of the first-kind Fredholm integral equation in the discrete case.

                                 Tools                               Assumptions
Literature                       Valid       Negative  Negative     Reversibility/  Bridge    Observability
                                 instrument  exposure  outcome      completeness    function  of P(W|U)
[34], [35]                       ✓           ✗         ✗            ✗               ✗         ✗
[3](1), [13], [36]               ✗           ✗         ✓            ✓               ✗         ✓ (a)
[3](2), [19]                     ✗           ✓         ✓            ✓               ✗         ✗
[16], [6], [7], [17], [4], [18]  ✗           ✓         ✓            ✓               ✓         ✗
[8]                              ✗           ✓         ✓            ✗               ✓         ✗
Our paper                        ✗           ✗         ✓            ✗               ✗         ✓ (b)

TABLE 1: Tools and assumptions of previous literature on partial identification. [3](1) is with external studies, while [3](2) is without external studies.
(a) P(W|U) is assumed to be reversible and explicitly, totally observed.
(b) In our paper, P(W|U) only needs to be partially bounded.
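The numerical difficulty mentioned above can be seen on small matrices. The following sketch, with made-up $2 \times 2$ column-stochastic matrices, compares the condition number of a well-separated proxy matrix with that of one whose columns are nearly identical, and checks that a matrix with identical columns is rank deficient.

```python
import numpy as np

# Hypothetical P(W|U) matrices; each column is a distribution f(W | u_i).
well_separated = np.array([[0.9, 0.1],
                           [0.1, 0.9]])
nearly_collinear = np.array([[0.60, 0.59],
                             [0.40, 0.41]])
collinear = np.array([[0.6, 0.6],
                      [0.4, 0.4]])

# np.linalg.cond uses the 2-norm by default: sigma_max / sigma_min.
kappa_good = np.linalg.cond(well_separated)
kappa_bad = np.linalg.cond(nearly_collinear)

# Identical columns: the matrix has rank 1 and no inverse.
rank_collinear = np.linalg.matrix_rank(collinear)

print(kappa_good, kappa_bad, rank_collinear)
```

The nearly collinear matrix has a condition number roughly two orders of magnitude larger than the well-separated one, so small errors in the observed $W$-quantities would be amplified accordingly when Eqn. (2) is applied.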
In conclusion, revisiting single-proxy control under the partial observability of $P(\mathbf{W}|\mathbf{U})$ is challenging but necessary. It corresponds to several common real-world scenarios that the double-proxy control methods discussed above leave uncovered. In this paper, we explore the estimation of causal effects under this assumption in more depth and propose an algorithm called PI-SFP. Our contributions are summarized as follows:
• We propose a novel analytical framework for seeking a valid bound on the causal effect $f(y|do(x))$ under the partial observability of $P(\mathbf{W}|\mathbf{U})$, and provide a necessary and sufficient condition for deciding whether the bound is tight.
• We develop a global optimization strategy called Partial Identification via Sum-of-ratios Fractional Programming (PI-SFP). We theoretically prove that the PI-SFP algorithm converges globally and attains the valid bound of $f(y|do(x))$ at an exponential rate.
• We analyze the rationality and generalizability of PI-SFP via extended discussions, such as 1) the motivation of the partial observability assumption, 2) the acceleration of PI-SFP, 3) graph structure extension, and 4) generalization to continuous confoundings.
3 A fractional programming framework for partial identification
3.1 Definitions and assumptions
Based on the preliminaries, we restate all definitions and assumptions as follows.
Definition 1. $Y, Y_0, Y_1 \in [Y^L, Y^U]$, $Z \in [Z^L, Z^U]$, $X \in [X^L, X^U]$, $W \in [W^L, W^U]$, $U \in [U^L, U^U]$. Moreover, we use $\dim(\cdot)$ to denote the dimension of variables, $d := \dim(U) < +\infty$, and the set of confoundings $U$ is $\{u_1, u_2, \ldots, u_d\}$.
Footnote 3: The condition number of a matrix $A$ is denoted as $\kappa(A) = \sigma_{\max}(A)/\sigma_{\min}(A)$, where $\sigma_{\max}(A)$ and $\sigma_{\min}(A)$ denote the maximal and minimal singular values of $A$. If some rows/columns of $A$ are similar (or equal), then $\kappa(A)$ is large (or $+\infty$), and $A^{-1}$ is computationally hard to obtain (or does not even exist).