PARTIAL IDENTIFICATION WITH PROXY OF LATENT CONFOUNDINGS VIA SUM-OF-RATIOS FRACTIONAL PROGRAMMING



Zhiheng Zhang

Institute for Interdisciplinary Information Sciences, Tsinghua University

zhiheng-20@mails.tsinghua.edu.cn

ABSTRACT

Due to the unobservability of confoundings, there has been widespread concern about how to compute causality quantitatively. To address this challenge, proxy-based negative control approaches have been commonly adopted, where auxiliary outcome variables W are introduced as the proxy of the confoundings U. However, these approaches rely on strong assumptions such as reversibility, completeness or bridge functions. These assumptions lack an intuitive empirical interpretation and a solid verification technique, hence their application in the real world is limited. For instance, these approaches are inapplicable when the transition matrix P(W|U) is irreversible. In this paper, we focus on a weaker assumption called the partial observability of P(W|U). We develop a more general single-proxy negative control method called Partial Identification via Sum-of-ratios Fractional Programming (PI-SFP). It is a global optimization algorithm based on the branch-and-bound strategy, aiming to provide a valid bound of the causal effect. In simulation, PI-SFP provides promising numerical results and fills in blank spots that cannot be handled by the previous literature, such as the case where we have only partial information about P(W|U).

Keywords: Causality; Partial identification; Fractional programming; Branch-and-bound; Average causal effect

Contents

1 Introduction
2 Preliminaries
3 A fractional programming framework for partial identification
3.1 Definitions and assumptions
3.2 Objective function
3.3 Valid bound of ACE
4 Algorithm
4.1 Framework of PI-SFP
4.2 Internal functions of PI-SFP
5 Theoretical analysis
6 Simulations
7 Further discussions and extensions
7.1 Discussions on assumptions
7.2 Discussion on algorithms
7.3 Discussion on graphical structure
7.4 Discussions on extensions to the continuous confoundings
8 Conclusions
9 Acknowledgement
A.1 The proof of Proposition 1
A.2 The proof of Proposition 2
A.3 Further discussion on Ass. 2
A.4 The proof of Theorem 1
A.5 Extension to the ACE case
A.6 The proof of further discussions and extensions

arXiv:2210.09885v2 [math.OC] 8 Nov 2022

1 Introduction

Identifying causal effects from observational data is a fundamental question in economics, social science and epidemiology [ ]. Causal inference is usually challenging in the presence of latent confoundings, which prevent us from extracting useful causal information by directly applying statistical association studies [ ]. In order to adjust for the bias incurred by latent confoundings, people usually need to rely on auxiliary variables for confounding adjustment. These auxiliary variables mainly include instrumental variables (IV) [ ], proximal variables [ ], or both [ ]. In this paper, we are primarily interested in causal identification with proxies of latent confounders. Fig. 1(a) and Fig. 1(b) are examples of such methods: to identify the causal effect of X on Y, we assume that there are additional random variables, such as Z and W in Fig. 1, that are associated with the latent confounders, and we use these variables as "proxies" of the latent confounders for confounding adjustment.

Empirical studies on using proxies for confounding adjustment have a long history. The earliest can be traced back to [ ], in which the authors analyzed the potential benefit of using proxies as an alternative to latent confoundings in least squares estimation. This was further applied in observational studies such as [ ]. Other empirical studies include [ ]. On the theoretical side, existing research can be mainly split into two categories: the first is the "single-proxy scenario", where we assume there is only a single proxy variable (Figure 1(a)); the second is the "double-proxy scenario", where we have access to data from at least two proxy variables (Figures 1(b), 1(c)).

Figure 1(a) illustrates the causal diagram considered in the single-proxy scenario. When both W and U are discrete random variables with a finite number of values, the state-of-the-art research includes [ ]. More specifically, they prove that the true causal mechanism p(y|do(x)) is identifiable when the probability transition matrix P(W|U) is fully observable and invertible.

When P(W|U) is not observable, Pearl further considered the double-proxy case extended from [ ], where the exposure proxy control Z and the outcome proxy control W both exist. With the aid of Z, the reversibility of P(W|U) is strengthened to that of P(Z, W|x) and P(y, Z, W|x). This was further extended to a new topic, 'negative control' [ ]. In all, these works are restricted by certain reversibility or completeness assumptions or their weaker forms.

The double-proxy setting requires observing a new auxiliary variable, which may not be practical in real applications, so in this paper we revisit the single-proxy case. As discussed above, existing methods impose excessively strong conditions on P(W|U) in order to obtain the exact value of the ACE. In this paper, we propose a new algorithm to identify a bound of the causal effect when the probability transition matrix P(W|U) is only partially observable. Moreover, our method does not require P(W|U) to be invertible for the desired identification guarantee. Our algorithm is a fractional-programming-based approach that learns a bound of the causal effect by solving a constrained fractional program.


FIGURE 1: Estimating ACE with confoundings via (a) single control or (b)(c) double control. The nodes denote: Z, W: proxies; X: treatment; Y: outcome; U: unobserved confoundings.

With this motivation, we focus on the single-proxy case and attempt to weaken the condition of 'total precise observability' of P(W|U) to 'partial observability'. That is, for each dim(W)-dimensional vector P(W|U = u), we only assume that it is located in a particular subregion of the dim(W)-dimensional space instead of at a fixed point. More importantly, the completeness/reversibility conditions of previous proxy control methods are not required. On this basis, we formulate the task as a constrained fractional programming problem and develop new optimization approaches to solve it. This differs from traditional fractional programming methods [ ], since the original strong assumptions about concavity do not hold. To summarize, compared with the previous literature, we seek the partial identification of the ACE, rather than its unique closed form, under a weaker assumption.

The paper is organized as follows. In Section 1, we introduce the basic knowledge of partial identification. In Section 2, we review the construction of the ACE and the evolution of the relevant hypotheses in the previous literature. To address their shortcomings, we propose our new hypothesis, which is more intuitive, applicable, and verifiable. In Section 3, we formulate the estimation of the ACE as a sum-of-ratios fractional programming problem; in this formulation, we explicitly construct our objective function and the identification region of the solutions. In Section 4, we solve the problem in practice with a branch-and-bound strategy. In Section 5, we focus on the theoretical global convergence property of our algorithm. In Section 6, we run simulations to show the effectiveness of our algorithm. Finally, in Section 7, we provide several further topics and discussions to illustrate the generalizability and scalability of our approach.

2 Preliminaries

The causal effect is strongly related to the 'do' operator [ ], which can be seen as an external intervention. Specifically, the causal effect of treatment X on outcome Y is denoted as f(y|do(x)) in Fig. 1, where the symbol do(x) represents that the treatment X is forced to take a fixed value x, and f(·) denotes the probability mass/density function for discrete/continuous variables. According to the back-door criterion [ ], f(y|do(x)) is identified as follows:

$$f(y \mid do(x)) = \sum_{i=1}^{\dim(U)} f(y \mid u_i, x)\, f(u_i) = f(y, x) + \sum_{i=1}^{\dim(U)} \frac{f(y, u_i, X = x)\, f(u_i, X \neq x)}{f(u_i, X = x)}, \tag{1}$$

where dim(·) denotes the dimension of the variables. The decomposition in the second equality is due to $f(u_i) = f(u_i, X = x) + f(u_i, X \neq x)$. In [ ], the authors assumed that the transition matrix P(W|U) is totally observable and reversible. They then claimed that f(y|do(x)) is identifiable, namely that the value of each term in Eqn. (1) can be

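explicitly extracted as follows¹:

$$\begin{bmatrix} f(y, \mathbf{U}, X = x) & f(\mathbf{U}, X = x) & f(\mathbf{U}, X \neq x) \end{bmatrix} = P(\mathbf{W} \mid \mathbf{U})^{-1} \begin{bmatrix} f(y, \mathbf{W}, X = x) & f(\mathbf{W}, X = x) & f(\mathbf{W}, X \neq x) \end{bmatrix}. \tag{2}$$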
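As a numerical sanity check of Eqns. (1) and (2), the following is a minimal sketch on synthetic numbers, not code from the paper. All arrays are hypothetical, and the sketch assumes that W is independent of (X, Y) given U, which is the condition under which the observed quantities equal P(W|U) times the latent ones. The matrix is made diagonally dominant on purpose so that it is invertible.

```python
import numpy as np

rng = np.random.default_rng(0)
d_u = d_w = 3  # dim(U) = dim(W), so that P(W|U) is square

# Hypothetical latent quantities for one fixed pair (y, x).
f_ux = rng.uniform(0.05, 0.2, d_u)    # f(u_i, X = x)
f_unx = rng.uniform(0.05, 0.2, d_u)   # f(u_i, X != x)
total = f_ux.sum() + f_unx.sum()
f_ux, f_unx = f_ux / total, f_unx / total   # now sum_i f(u_i) = 1
f_yux = f_ux * rng.uniform(0.1, 0.9, d_u)   # f(y, u_i, X = x) <= f(u_i, X = x)

# Eqn. (1): back-door sum versus its decomposed form.
lhs = np.sum(f_yux / f_ux * (f_ux + f_unx))
rhs = f_yux.sum() + np.sum(f_yux * f_unx / f_ux)

# Column-stochastic P(W|U): column i holds f(W | u_i).
P = rng.uniform(0.1, 1.0, size=(d_w, d_u)) + 2.0 * np.eye(d_w)
P /= P.sum(axis=0, keepdims=True)

# Observed W-level quantities (assumption: W independent of (X, Y) given U).
latent = np.column_stack([f_yux, f_ux, f_unx])
observed = P @ latent

# Eqn. (2): recover the latent columns by solving the linear system.
recovered = np.linalg.solve(P, observed)

print(np.isclose(lhs, rhs), np.allclose(recovered, latent))
```

Solving the linear system with `np.linalg.solve` is used here instead of forming the inverse explicitly; both correspond to Eqn. (2) when P(W|U) is invertible.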

¹ For convenience, we use bold letters to denote the column vector formed by all the corresponding possible values. For instance, $f(y, \mathbf{U}, X = x) = [f(y, u_1, X = x), f(y, u_2, X = x), \ldots, f(y, u_{\dim(U)}, X = x)]^T$. Moreover, if there are two bold letters in a symbol such as $P(\mathbf{W} \mid \mathbf{U})$, it denotes the matrix $[f(\mathbf{W} \mid u_1), f(\mathbf{W} \mid u_2), \ldots, f(\mathbf{W} \mid u_{\dim(U)})]$, where $f(\mathbf{W} \mid u_i) = [f(w_1 \mid u_i), f(w_2 \mid u_i), \ldots, f(w_{\dim(W)} \mid u_i)]^T$, $i = 1, 2, \ldots, \dim(U)$.

Our paper generalizes this result. We consider the partial identification of f(y|do(x)) instead of the computation of its unique form. This is due to our weakening of the assumption on P(W|U). Compared with [ ], we relax its total observability to partial observability, and drop the guarantee of its reversibility (thus $P(\mathbf{W} \mid \mathbf{U})^{-1}$ in Eqn. (2) may not exist). Specifically, we extend the identification region of P(W|U) from a fixed distribution to the family $\mathcal{P}$, such that:

$$\mathcal{P} = \{ P(W \mid U) : \langle P(W \mid U) - \underline{P}(W \mid U),\ \overline{P}(W \mid U) - P(W \mid U) \rangle \text{ is non-negative} \}, \tag{3}$$

where $\underline{P}(W \mid U)$ and $\overline{P}(W \mid U)$ are two matrices, known a priori, that bound P(W|U). This is a common scenario in the real world. Although [ ] have already generally corroborated that this partial observability is verifiable, it has not been fully discussed in the recent literature. In the following text, we refer to the condition $P(W \mid U) \in \mathcal{P}$ as the 'partial observability assumption'. Under this assumption, it is natural to set up our original goal, namely seeking the lower bound of f(y|do(x)) (the upper bound is symmetric), by solving the following partial identification problem:

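$$f(y, X = x) + \min_{f(y, \mathbf{W}, \mathbf{U}, \mathbf{X}) \in \mathcal{F}} \Big/ \max_{f(y, \mathbf{W}, \mathbf{U}, \mathbf{X}) \in \mathcal{F}} \ \sum_{i=1}^{\dim(U)} \frac{f(y, u_i, X = x)\, f(u_i, X \neq x)}{f(u_i, X = x)}. \tag{4}$$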

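Here $f(y, \mathbf{W}, \mathbf{U}, \mathbf{X})$ is a third-order (dim(W) × dim(U) × dim(X)) tensor containing the joint probability of each w ∈ W, u ∈ U, x ∈ X together with Y = y. The feasible set is then $\mathcal{F} = \{ f(y, \mathbf{W}, \mathbf{U}, \mathbf{X}) : f(y, \mathbf{W}, \mathbf{U}, \mathbf{X}) \text{ is compatible with } P(W \mid U) \in \mathcal{P} \}$.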
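To make the partial observability assumption and the objective in (4) concrete, here is a small sketch. It reads the condition in (3) as entrywise lower and upper bounds on P(W|U), which is an interpretation rather than a statement from the paper, and it also checks that every column is a probability distribution. The function names `in_family` and `objective` and all numbers are hypothetical.

```python
import numpy as np

def in_family(P, P_lo, P_hi, atol=1e-12):
    """Check P_lo <= P <= P_hi entrywise and that each column f(W | u_i) sums to 1."""
    P = np.asarray(P, dtype=float)
    within_box = np.all(P >= np.asarray(P_lo) - atol) and np.all(P <= np.asarray(P_hi) + atol)
    columns_ok = np.allclose(P.sum(axis=0), 1.0)
    return bool(within_box and columns_ok)

def objective(f_yx, f_yux, f_ux, f_unx):
    """Value of the expression in (4) for one candidate set of latent quantities."""
    f_yux, f_ux, f_unx = map(np.asarray, (f_yux, f_ux, f_unx))
    return f_yx + np.sum(f_yux * f_unx / f_ux)

# Hypothetical a priori bounds on a 2 x 2 transition matrix.
P_lo = np.array([[0.6, 0.1], [0.2, 0.7]])
P_hi = np.array([[0.8, 0.3], [0.4, 0.9]])

inside = in_family([[0.7, 0.2], [0.3, 0.8]], P_lo, P_hi)
outside = in_family([[0.5, 0.2], [0.5, 0.8]], P_lo, P_hi)
value = objective(0.2, [0.1, 0.1], [0.25, 0.25], [0.25, 0.25])
print(inside, outside, value)
```

The partial identification problem then asks for the smallest and largest `objective` value over all candidates compatible with some matrix for which `in_family` holds.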

Achieving this goal faces challenges. Firstly, the tight bound is hard to achieve, because the feasible region $\mathcal{F}$ is difficult to represent in closed form. Its boundary constraints contain the partially observable P(W|U), which can be seen as an instance of a well-known inverse problem, the first-kind Fredholm integral equation² [ ], in the discrete case. The problem is ill-posed when P(W|U) is irreversible, and the closed-form expression of $\mathcal{F}$ can only be approximated iteratively by complex numerical methods [ ]. For this reason, we relax the feasible region from $\mathcal{F}$ to $\tilde{\mathcal{F}}$ ($\mathcal{F} \subseteq \tilde{\mathcal{F}}$), which admits a closed-form expression. Specifically, the relaxed condition $f(y, \mathbf{W}, \mathbf{U}, \mathbf{X}) \in \tilde{\mathcal{F}}$ keeps the feasible regions of $f(y, \mathbf{U}, X = x)$, $f(\mathbf{U}, X = x)$ and $f(\mathbf{U}, X \neq x)$ in a calculable closed form; these regions are denoted as $IR_{F(y, U, X = x)}$, $IR_{F(U, X = x)}$ and $IR_{F(U, X \neq x)}$ respectively in our final objective function.

Secondly, even if we retreat and seek a valid bound as above, the problem is still non-trivial, because it is difficult to find a suitable optimization method. As the causal effect is expressed as a sum of fractions, we naturally resort to techniques from sum-of-ratios fractional programming (SFP). The general form of SFP summarized in [ ] is as follows:

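$$\min \left\{ \sum_{i=1}^{M} \frac{g_{1i}(\phi)}{g_{2i}(\phi)} \right\}, \quad \phi \in S,\ g_{1i}(\phi) \text{ is convex},\ g_{2i}(\phi) \text{ is concave},\ g_{1i}(\phi) \ge 0,\ g_{2i}(\phi) > 0, \tag{5}$$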

where S is a convex set and M ≥ 2 is an integer. In order to ensure the global nature of optimal solutions, $g_{1i}(\phi)$ and $g_{2i}(\phi)$ are assumed to be convex and concave respectively. To match Formulation (4), we should choose

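$$M = \dim(U), \quad \phi = \big( (f(y, u_i, X = x), \ldots)^T,\ (f(u_i, X = x), \ldots)^T,\ (f(u_i, X \neq x), \ldots)^T \big),$$
$$g_{1i}(\phi) = f(y, u_i, X = x)\, f(u_i, X \neq x), \quad g_{2i}(\phi) = f(u_i, X = x), \quad i = 1, 2, \ldots, \dim(U). \tag{6}$$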

However, our construction violates the traditional convex-concave assumption, since $g_{1i}(\phi)$, being a product of two of the variables, is not convex. Thus the previous SFP algorithms [27] do not apply.
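The failure of convexity can be checked on the smallest possible case. The sketch below treats a numerator of the form in (6) as the bilinear function g(a, b) = a·b, with a and b standing in for f(y, u_i, X = x) and f(u_i, X ≠ x); the names and the two test points are illustrative choices only.

```python
import numpy as np

def g1(a, b):
    """Bilinear numerator: product of two of the optimization variables."""
    return a * b

# Midpoint test: a convex function never exceeds the chord value at the midpoint.
p, q = (0.0, 1.0), (1.0, 0.0)
mid = (0.5 * (p[0] + q[0]), 0.5 * (p[1] + q[1]))
chord = 0.5 * (g1(*p) + g1(*q))
violates_convexity = g1(*mid) > chord

# Equivalent view: the Hessian of a*b is [[0, 1], [1, 0]], which is indefinite.
hessian_eigenvalues = np.linalg.eigvalsh(np.array([[0.0, 1.0], [1.0, 0.0]]))

print(violates_convexity, hessian_eigenvalues)
```

Since the Hessian has one negative and one positive eigenvalue, the numerator is neither convex nor concave, which is why the convex-concave machinery behind (5) cannot be applied directly.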

In order to handle this case, we design an algorithm called Partial Identification via Sum-of-ratios Fractional Programming (PI-SFP). This algorithm is motivated by the branch-and-bound strategy [ ] and DC programming [ ]; namely, we iteratively search for the optimal bound by partitioning the feasible region. We also provide a complete convergence analysis. To our knowledge, our paper is a new attempt at estimating causal effects via this optimization technique. Moreover, this algorithm also contributes to the existing literature on the convergence analysis of the branch-and-bound strategy [33, 28, 29, 32].
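The idea of searching by feasible-region partition can be illustrated on a toy problem. The sketch below is not PI-SFP: it is a generic one-dimensional branch-and-bound on a hypothetical sum of two ratios, h(t) = t/(2 − t) + (1 − t)/(0.5 + t) on [0, 1], where a lower bound on each subinterval is obtained by minimizing the two ratios separately. The function, the bounding rule and the tolerance are all assumptions chosen for illustration.

```python
import heapq

def h(t):
    """Toy sum-of-ratios objective on [0, 1]."""
    return t / (2.0 - t) + (1.0 - t) / (0.5 + t)

def lower_bound(lo, hi):
    """Valid lower bound of h on [lo, hi]: the first ratio is increasing in t,
    the second is decreasing, so each is minimized at an endpoint."""
    return lo / (2.0 - lo) + (1.0 - hi) / (0.5 + hi)

def branch_and_bound(lo=0.0, hi=1.0, tol=1e-6):
    best_t = 0.5 * (lo + hi)
    best_val = h(best_t)                      # incumbent upper bound
    heap = [(lower_bound(lo, hi), lo, hi)]    # intervals ordered by lower bound
    while heap:
        lb, a, b = heapq.heappop(heap)
        if best_val - lb <= tol:
            break                             # no interval can improve by more than tol
        mid = 0.5 * (a + b)
        for c, d in ((a, mid), (mid, b)):     # branch: split the interval in two
            t = 0.5 * (c + d)
            val = h(t)
            if val < best_val:                # update the incumbent
                best_val, best_t = val, t
            child_lb = lower_bound(c, d)
            if child_lb < best_val - tol:     # bound: keep only promising intervals
                heapq.heappush(heap, (child_lb, c, d))
    return best_t, best_val

t_star, v_star = branch_and_bound()
print(t_star, v_star)
```

The returned value is within `tol` of the global minimum because every discarded interval had a lower bound no smaller than the final incumbent minus `tol`. In higher dimensions the same pattern applies with boxes in place of intervals, at a cost that grows with the number of boxes kept.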

Because of these two challenges in solving the partial observability case, the recent literature has avoided further discussion of the observability of P(W|U). Instead, it introduced another auxiliary variable Z and formalized the problem as double negative control [ ]. However, as shown in Table 1, there is no free lunch. These works are also restricted by additional assumptions about Z, such as the completeness condition, the bridge function condition, etc. Importantly, these works are still all based on the reversibility of P(W|U), except for [ ], who replaced it with a weaker bridge function condition. Hence when irreversibility (i.e., linear dependence among some rows or columns) of P(W|U) occurs in the real world, these methods are easily invalidated. In fact, difficulties are already encountered in numerical computations if the condition number of P(W|U) is too large³.

                                   Tools                                    Assumptions
Literature                         Valid        Negative    Negative    Reversibility/   Bridge      Observability
                                   instrument   exposure    outcome     completeness     function    of P(W|U)
[34], [35]                         ✓            ✗           ✗           ✗                ✗           ✗
[3](1), [13], [36]                 ✗            ✗           ✓           ✓                ✗           ✓ (a)
[3](2), [19]                       ✗            ✓           ✓           ✓                ✗           ✗
[16], [6], [7], [17], [4], [18]    ✗            ✓           ✓           ✓                ✓           ✗
[8]                                ✗            ✓           ✓           ✗                ✓           ✗
Our paper                          ✗            ✗           ✓           ✗                ✗           ✓ (b)

TABLE 1: Tools and assumptions of previous literature on partial identification. [3](1) is with external studies, while [3](2) is without external studies.
(a) P(W|U) is assumed to be reversible and explicitly, totally observed.
(b) In our paper, P(W|U) only needs to be partially bounded.

² One of the boundary constraints of $\mathcal{F}$ can be expressed as $f(y, w, x) = \sum_{i=1}^{\dim(U)} \big( f(w \mid u_i) \sum_{j=1}^{\dim(W)} f(y, w_j, u_i, x) \big)$, where each f(w|u) is bounded by $\underline{P}(W \mid U)$ and $\overline{P}(W \mid U)$. It is in the form of the first-kind Fredholm integral equation in the discrete case.

In conclusion, revisiting single-proxy control under the partial observability of P(W|U) is challenging but necessary. It corresponds to a few common real-world scenarios that are blank spots of the double-proxy control methods illustrated above. In our paper, we explore more deeply how to estimate the causal effect under this assumption. We propose an algorithm called PI-SFP. Our contributions are summarized as follows:

• We propose a novel analytical framework for seeking a valid bound of the causal effect f(y|do(x)) via the partial observability of P(W|U), and provide a necessary and sufficient condition to determine whether the bound is tight.

• We develop a global optimization strategy called Partial Identification via Sum-of-ratios Fractional Programming (PI-SFP). We theoretically prove that the PI-SFP algorithm converges globally and attains the valid bound of f(y|do(x)) at an exponential rate.

• We analyze the rationality and generalizability of PI-SFP via extended discussions, such as 1) the motivation of the partial observability assumption, 2) the acceleration of PI-SFP, 3) graph structure extension and 4) generalization to continuous confoundings.

3 A fractional programming framework for partial identification

3.1 Definitions and assumptions

Based on the preliminaries, we restate all definitions and assumptions as follows.

Definition 1 $Y, Y_0, Y_1 \in [Y^L, Y^U]$, $Z \in [Z^L, Z^U]$, $X \in [X^L, X^U]$, $W \in [W^L, W^U]$, $U \in [U^L, U^U]$. Moreover, we use dim(·) to denote the dimension of variables. $d := \dim(U) < +\infty$, and the set of confoundings U is $\{u_1, u_2, \ldots, u_d\}$.

³ The condition number of a matrix A is denoted as $\kappa(A) = \sigma_{\max}(A) / \sigma_{\min}(A)$, where $\sigma_{\max}(A)$ and $\sigma_{\min}(A)$ denote the maximal and minimal singular values of A. If some rows/columns of A are similar (or equal), then $\kappa(A)$ is large (or $+\infty$), and $A^{-1}$ is computationally hard to obtain (or does not even exist).
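The effect described in this footnote is easy to reproduce. The following sketch computes κ(A) from the singular values for two hypothetical 2 × 2 transition matrices, one with clearly distinct columns and one with nearly equal columns; the matrices and the helper name `condition_number` are illustrative assumptions.

```python
import numpy as np

def condition_number(A):
    """kappa(A) = sigma_max(A) / sigma_min(A), computed from the singular values."""
    s = np.linalg.svd(np.asarray(A, dtype=float), compute_uv=False)  # sorted, largest first
    return np.inf if s[-1] == 0.0 else s[0] / s[-1]

well_separated = np.array([[0.9, 0.1],
                           [0.1, 0.9]])     # columns are clearly different
nearly_equal = np.array([[0.51, 0.49],
                         [0.49, 0.51]])     # columns are almost the same

kappa_good = condition_number(well_separated)
kappa_bad = condition_number(nearly_equal)
print(kappa_good, kappa_bad)
```

NumPy's built-in `np.linalg.cond` uses the same 2-norm definition by default and can serve as a cross-check.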
