
Partial Identification with Proxy of Latent Confoundings via Sum-of-ratios Fractional Programming
observability to the partial observability, and delete the guarantee for its reversibility (thus
P(W|U)−1
in Eqn.
(2)
may not exist). Specifically, we extend the identification region of
P(W|U)
from a fixed distribution to the family
P, such that:
P={P(W|U) : hP(W|U)−P(W|U), P (W|U)−P(W|U)iis non-negative},(3)
where
P(W|U)
and
P(W|U)
are two priori known matrices to bound
P(W|U)
. This is a common scenario in
the real-world. Although [
3
,
23
] have already generally corroborated that this partial observability is verifiable, it has
not been fully discussed in the recent literature. In our paper, we will reiterate the condition
P(W|U)∈P
as the
’partial observability assumption’ in our following text. Under this assumption, it is natural to set up our original goal -
seeking the lower bound of
f(y|do(x))
(upper bound is symmetric) via solving the following partial identification
problem:
f(y, X =x) + min
f(y,W,U,X)∈F \max
f(y,W,U,X)∈F
dim(U)
X
i=1
f(y, ui, X =x)f(ui, X 6=x)
f(ui, X =x).(4)
Here
f(y, W,U,X)
is a three-order (
dim(W)∗dim(U)∗dim(X)
) tensor indicating the joint probability distribution
of each
w∈W, u ∈U, x ∈X
together with
Y=y
. Then the set
F
= {
f(y, W,U,X) : f(y, W,U,X)
is
compatible with P(W|U)∈P}.
Achieving this goal faces with challenges. Firstly, its tight bound is hard to be achieved. It is due to the difficulty of
representing feasible region
F
in a closed form. Its boundary constraints contains the partial observable
P(W|U)
,
which can be seen as a well-known inverse problem called first-kind Fredholm integral equation
2
[
24
,
25
] in the discrete
case. It is ill-posed when
P(W|U)
is irreversible and the closed form expression of
F
can only be approximated
iteratively by complex numerical methods [
26
]. With this reason, we attempt to relax the feasible region from
F
to
e
F
(
F ⊆ e
F
), which contains a closed-form expression. Specifically, the relaxed condition
f(y, W,U,X)∈e
F
is to
keep the feasible region of
f(y, U, X =x), f(y, U, X =x), f(U, X 6=x)
in a calculable closed-form, which will be
denoted as IRF(y,U,X=x), IRF(U,X=x), IRF(U,X6=x)respectively in our final objective function.
Secondly, even if we retreat and seek its valid bound as above, it is still non-trivial. It is due to the difficulty of finding a
corresponding optimization method. As the causal effect is expressed as a form of fractional summation, we naturally
resort to techniques in sum-of-ratios fractional programming (SFP). The general form of SFP summarized in [
27
] is
represented as follows:
min{
M
X
i=1
g1i(φ)
g2i(φ)},φ∈S, g1i(φ)is convex, g2i(φ)is concave, g1i(φ)≥0, g2i(φ)>0,(5)
where
S
is a convex set, and
M≥2
is a integer. In order to ensure the global nature of optimal solutions,
g1i(Φ)
and
g2i(Φ) are assumed to be convex and concave respectively. In contrast with Formulation. 4, we should choose
M=dim(U),φ= ((f(y, ui, X =x), ...)T,(f(ui, X =x), ...)T,(f(ui, X 6=x), ...)T),
g1i(φ) = f(y, ui, X =x)f(ui, X 6=x), g2i(φ) = f(ui, X =x)i= 1,2, ...dim(U).(6)
However, our construction violates the traditional convex-concave assumption, since
g1i(φ)
is not convex. Thus these
previous SFP algorithms [27] do not work.
In order to handle this case, we design a algorithm called Partial Identification with Sum-of-ratios Fractional Program-
ming (PI-SFP). This algorithm is motivated by branch and bound strategy [
28
,
29
] and DC programming [
30
,
31
,
32
],
namely that we iteratively search the optimal bound by means of feasible region partition. We also provide the complete
convergence analysis. To our knowledge, our paper is a new attempt to estimating casual effect via this optimization
technique. Moreover, this algorithm also contributes to the existing literature on the convergence analysis in branch and
bound strategy [33, 28, 29, 32].
For recent literature, just because of these two challenges of solving the partial observability case, they avoided further
discussion on the observability of
P(W|U)
. Instead, they introduced another auxiliary variable
Z
and formalized the
problem as the double negative control [
16
,
17
,
4
,
18
,
6
,
7
,
19
,
8
]. However, as shown in Table. 1, there is no free lunch.
These work are also restricted by additional assumptions about
Z
, such as completeness condition, bridge function
condition, etc. Importantly, these work are still all based on the reversibility of
P(W|U)
just except for [
8
], who
2
One of the boundary constraints of
F
can be expressed as
f(y, w, x) = Pdim(U)
i=1 (f(wj|ui)Pdim(W)
j=1 f(y, wj, ui, x))
,
where each
f(w|u)
is bounded by
P(W|U)
and
P(W|U)
. It is in the form of the first-kind Fredholm integral equation in the
discrete case.
4