Disentangling Causal Effects from Sets of Interventions in the Presence of Unobserved Confounders

Olivier Jeunen
Amazon
Edinburgh, UK
Ciarán M. Gilligan-Lee
Spotify & UCL
London, UK
Rishabh Mehrotra
Sharechat
London, UK
Mounia Lalmas
Spotify
London, UK
Abstract
The ability to answer causal questions is crucial in many domains, as causal inference allows one to understand the impact of interventions. In many applications, only a single intervention is possible at a given time. However, in some important areas, multiple interventions are concurrently applied. Disentangling the effects of single interventions from jointly applied interventions is a challenging task, especially as simultaneously applied interventions can interact. This problem is made harder still by unobserved confounders, which influence both treatments and outcome. We address this challenge by aiming to learn the effect of a single intervention from both observational data and sets of interventions. We prove that this is not generally possible, but provide identification proofs demonstrating that it can be achieved under non-linear continuous structural causal models with additive, multivariate Gaussian noise, even when unobserved confounders are present. Importantly, we show how to incorporate observed covariates and learn heterogeneous treatment effects. Based on the identifiability proofs, we provide an algorithm that learns the causal model parameters by pooling data from different regimes and jointly maximizing the combined likelihood. The effectiveness of our method is empirically demonstrated on both synthetic and real-world data.
1 Introduction
The ability to answer causal questions is crucial in science, medicine, economics, and beyond; see [Gilligan-Lee, 2020] for a high-level overview. This is because causal inference allows one to understand the impact of interventions.²
In many applications, only a single intervention is possible at a given time, or interventions are applied sequentially. However, in some important areas, multiple interventions are concurrently applied. For instance, in medicine, patients with multiple comorbidities may need to be treated simultaneously with several prescriptions; in
computational advertising, people may be targeted by multiple concurrent campaigns; and in dietetics,
the nutritional content of meals can be considered a joint intervention from which we wish to learn
the effects of individual nutritional components.
Disentangling the effects of single interventions from jointly applied interventions is a challenging task, especially as simultaneously applied interventions can interact, leading to consequences not seen when considering single interventions separately. This problem is made harder still by the possible presence of unobserved confounders, which influence both treatments and outcome. This paper addresses this challenge by aiming to learn the effect of a single intervention from both observational data and sets of interventions. We prove that this is not generally possible, but provide identification proofs demonstrating it can be achieved in certain classes of non-linear continuous causal models with additive multivariate Gaussian noise, even in the presence of unobserved confounders. This reasonably weak additive noise assumption is prevalent in the causal inference and discovery literature [Rolland et al., 2022, Saengkyongam and Silva, 2020, Kilbertus et al., 2020]. Importantly, we show how to incorporate observed covariates, which can be high-dimensional, and hence learn heterogeneous treatment effects for single interventions. Our main contributions are:

1. A proof that without restrictions on the causal model, single-intervention effects cannot be identified from observations and joint interventions. (§3.1, 3.2)
2. Proofs that single interventions can be identified from observations and joint interventions when the causal model belongs to certain (but not all) classes of non-linear continuous structural causal models with additive, multivariate Gaussian noise. (§3.2, 3.3)
3. An algorithm that learns the parameters of the proposed causal model and disentangles single interventions from joint interventions. (§4)
4. An empirical validation of our method on both synthetic and real-world data.³ (§5)

Work done while the author was at Spotify.
² Causal inference also allows one to ask and answer counterfactual questions; see [Perov et al., 2020] and [Vlontzos et al., 2022].

36th Conference on Neural Information Processing Systems (NeurIPS 2022).
arXiv:2210.05446v1 [stat.ML] 11 Oct 2022
2 Related Work
Disentangling multiple concurrent interventions:
[Parbhoo et al., 2021] study the question of disentangling multiple, simultaneously applied interventions from observational data. They propose a specially designed neural network for the problem and show good empirical performance on some datasets. However, they address neither the formal identification problem nor the possible presence of unobserved confounders. By contrast, our work derives the conditions under which identifiability holds. We moreover propose an algorithm that can disentangle multiple interventions even in the presence of unobserved confounders, as long as both observational and interventional data are available. Related work by [Parbhoo et al., 2020] investigated the intervention-disentanglement problem from a reinforcement learning perspective, where each intervention combination constitutes a different action that a reinforcement learning agent can take. Unlike this approach, our work explicitly focuses on modelling the interactions between interventions to learn their individual effects. Closer to our work is [Saengkyongam and Silva, 2020], who investigate identifiability of joint effects from observations and single-intervention data. They prove this is not generally possible, but provide an identification proof for non-linear causal models with additive Gaussian noise. Our work addresses a complementary question: we want to learn the effect of a single intervention from observational data and sets of interventions. Additionally, unlike [Saengkyongam and Silva, 2020], we consider identification of individual-level causal effects given observed covariates. In a precursor to the work by [Saengkyongam and Silva, 2020], [Nandy et al., 2017] developed a method to estimate the effect of joint interventions from observational data when the causal structure is unknown. This approach assumed linear causal models with Gaussian noise, and only proved identifiability in this case under a sparsity constraint. Like [Saengkyongam and Silva, 2020], our result needs neither the linearity assumption nor any sparsity constraints in its identification proof. Finally, others, including [Schwab et al., 2020, Egami and Imai, 2018, Lopez and Gutman, 2017, Ghassami et al., 2021], explored how to estimate causal effects of a single categorical- or continuous-valued treatment, where different intervention values can produce different outcomes. Unlike our work, they do not consider multiple concurrent interventions that can interact.
Combining observations and interventions:
[Bareinboim and Pearl, 2016] have investigated non-parametric identifiability of causal effects using both observational and interventional data, in a paradigm they call “data fusion”. More general results were studied by [Lee et al., 2020], who provided necessary and sufficient graphical conditions for identifying causal effects from arbitrary combinations of observations and interventions. Recent work in [Correa et al., 2021] explored identification of counterfactual, as opposed to interventional, distributions from combinations of observational and interventional data. Finally, [Ilse et al., 2021] investigated the most efficient way to combine observational and interventional data to estimate causal effects. They demonstrated that they could significantly reduce the number of interventional samples required to achieve a certain fit when adding sufficient observational training samples. However, they only prove their method theoretically in the linear-Gaussian case. In the non-linear case, they parameterise their model using normalising flows and demonstrate their method empirically. They only consider estimating single interventions, and do not deal with multiple, interacting interventions.

³ The code to reproduce our results is available at github.com/olivierjeunen/disentangling-neurips-2022.
Additive noise models:
While certain causal quantities may not be generally identifiable from observational and interventional data, imposing restrictions on the structural functions underlying causal models can yield semi-parametric identifiability results. One of the most common weak restrictions used in the causal inference community is the additive noise model (ANM), first studied in the context of causal discovery by [Hoyer et al., 2009] and still widely used today [Rolland et al., 2022]. ANMs limit the form of the structural equations to be additive with respect to latent noise variables, but allow non-linear interactions between causes. [Janzing et al., 2009] used ANMs to devise a method for inferring a latent confounder between two observed variables, which is otherwise not possible without additional assumptions on the underlying causal model. See [Lee and Spekkens, 2017], [Lee et al., 2019], and [Dhir and Lee, 2020] for extensions of this approach beyond ANMs. ANMs have also been employed by [Kilbertus et al., 2020] to investigate the sensitivity of counterfactual notions of fairness to the presence of unobserved confounding. Our work proves that in certain classes of ANMs, the effect of a single intervention can be identified from observational data and sets of interventions, even in the presence of unobserved confounders. Moreover, we show how to incorporate observed covariates in these ANMs to learn the heterogeneous effects of single interventions.
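As a small illustration of the ANM restriction (the functional forms below are invented for the example, not taken from any of the cited works), the structural equations are non-linear in the causes but strictly additive in the noise, so the residual of each variable given its parents recovers its noise term:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Mutually independent exogenous noise terms (a Markovian SCM).
u1 = rng.normal(size=n)
u2 = rng.normal(size=n)
uy = rng.normal(size=n)

# ANM structural equations: non-linear in the causes, additive in the noise.
x1 = u1
x2 = np.tanh(x1) + u2          # X2 := f2(X1) + U2
y = x1 * x2 + np.sin(x2) + uy  # Y  := fY(X1, X2) + UY; treatments interact

# Additivity means subtracting the parent function recovers the noise term.
residual = y - (x1 * x2 + np.sin(x2))
print(np.allclose(residual, uy))  # → True
```

This additivity is what makes residual-based reasoning possible in ANMs, while the causes themselves may still interact arbitrarily inside fY.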
3 Model Identifiability
Identifiability is a fundamental concept in parametric statistics that concerns which quantities can, or cannot, be learned from data [Rothenberg, 1971]. An estimand is said to be identifiable from data if it is theoretically possible to learn it given infinite samples: any two causal models that coincide on the data must also coincide on the value of the estimand in question. Hence, if one finds two causal models that agree on the data but disagree on the estimand, then the estimand is not identifiable unless further restrictions are imposed. In this section, we provide identification proofs for single-variable interventional effects from observational data and joint interventions, for several model classes. Our theoretical analysis provides insight into the fundamental limitations of causal inference, and into the assumptions required for identification.
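This style of argument can be made concrete with the classic linear-Gaussian counterexample (parameter values chosen purely for illustration): two SCMs that induce exactly the same observational distribution over (X, Y) yet disagree on E[Y | do(X = x)], so the causal effect is not identifiable from observational data alone.

```python
import numpy as np

def obs_cov(b, c, v2):
    """Observational covariance of (X, Y) for the linear-Gaussian SCM
    X := U1, Y := b * X + U2, with Var(U1) = 1, Var(U2) = v2, and
    Cov(U1, U2) = c (an unobserved confounder whenever c != 0)."""
    return np.array([[1.0, b + c],
                     [b + c, b ** 2 + 2 * b * c + v2]])

# Model A: causal slope b = 1.0, confounded (c = 0.5).
cov_a = obs_cov(b=1.0, c=0.5, v2=1.0)
# Model B: causal slope b = 1.5, unconfounded (c = 0.0).
cov_b = obs_cov(b=1.5, c=0.0, v2=0.75)

# Identical observational distributions (zero-mean Gaussians with equal
# covariance), yet E[Y | do(X = x)] = b * x differs: 1.0 vs 1.5.
print(np.allclose(cov_a, cov_b))  # → True
```

Both models fit the observational data equally well, which is exactly the situation the definition above rules out for an identifiable estimand.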
Problem Definition.
We adopt the Structural Causal Model (SCM) framework as introduced by [Pearl, 2009]. An SCM M is defined by ⟨{C, X, Y}, U, f, P_U⟩, where {C, X, Y} are endogenous variables separated into covariates C, treatments X, and the outcome Y; U are exogenous variables (possibly confounders); f are structural equations; and P_U defines a joint probability distribution over the exogenous variables.

The SCM M also induces a causal graph, where vertices represent endogenous variables and edges represent structural equations. Vertices with outgoing edges to an endogenous variable X_i are denoted the parent set of this variable, PA(X_i). Typically, the observed covariates C causally influence the treatments as well as the outcome, and are part of this set. Every endogenous variable X_i (including Y) is then a function of its parents in the graph, PA(X_i), and a latent noise term U_i, denoting the influence of factors external to the model:

    X_i := f_i(PA(X_i), U_i).    (1)

In Markovian SCMs, these latent noise terms are all mutually independent. In general, however, distinct noise terms can be correlated according to some global distribution P_U; such correlation is due to the presence of unobserved confounders.
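As an illustrative sketch (the structural equations and the noise covariance below are made up, not taken from the paper), one can sample such a non-Markovian SCM by drawing the exogenous noise jointly from P_U, with non-zero off-diagonal covariance playing the role of an unobserved confounder:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

# Joint exogenous distribution P_U: off-diagonal entries correlate the
# treatment noise with the outcome noise, i.e. unobserved confounding.
cov_u = np.array([[1.0, 0.2, 0.4],
                  [0.2, 1.0, 0.3],
                  [0.4, 0.3, 1.0]])
u = rng.multivariate_normal(np.zeros(3), cov_u, size=n)
u1, u2, uy = u.T

# Structural equations X_i := f_i(PA(X_i), U_i), additive in the noise.
x1 = u1
x2 = 0.5 * x1 + u2
y = np.sin(x1) + x1 * x2 + uy

# The outcome noise is correlated with the treatment, so conditioning on
# X1 in the observational regime is confounded.
print(abs(np.corrcoef(x1, uy)[0, 1] - 0.4) < 0.02)  # → True
```

Setting all off-diagonal entries of cov_u to zero recovers the Markovian case, where observational conditioning and intervening coincide for this graph.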
An intervention on variable X_i is denoted by do(X_i = x_i); it corresponds to replacing the variable's structural equation with a constant, or equivalently removing all incoming edges in the causal graph. The core question we wish to answer in this work is under which conditions the treatment effect of a single intervention can be disentangled from joint interventions and observational data. That is, given samples from the data regimes that induce

    E[Y | X_i = x_i, X_j = x_j, C = c]    and    E[Y | do(X_i = x_i, X_j = x_j), C = c],

when can we learn conditional average causal effects

    E[Y | do(X_i = x_i), X_j = x_j, C = c]    or    E[Y | X_i = x_i, do(X_j = x_j), C = c]?
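A small simulation can illustrate the gap between these regimes in the simplest confounded setting (a single treatment, no covariates; the structural equations are invented for the example): do(X_1 = x_1) is implemented by overwriting the structural equation of X_1 with a constant.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 500_000

def sample(do_x1=None):
    # Correlated exogenous noise: an unobserved confounder of X1 and Y.
    u = rng.multivariate_normal([0.0, 0.0],
                                [[1.0, 0.8], [0.8, 1.0]], size=n)
    u1, uy = u.T
    # do(X1 = x1): replace the structural equation of X1 by a constant.
    x1 = u1 if do_x1 is None else np.full(n, do_x1)
    y = 2.0 * x1 + uy
    return x1, y

# Observational regime: E[Y | X1 ≈ 1] picks up the confounder,
# since E[UY | U1 = 1] = 0.8, giving roughly 2.8.
x1, y = sample()
obs = float(y[np.abs(x1 - 1.0) < 0.05].mean())

# Interventional regime: E[Y | do(X1 = 1)] = 2.0, unconfounded.
_, y_do = sample(do_x1=1.0)
interv = float(y_do.mean())

print(round(obs, 1), round(interv, 1))
```

The two estimates differ by roughly the confounding strength; the identification question above asks when the interventional quantity can be recovered without being able to intervene on X_1 alone.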