
data and sets of interventions. We prove that this is not generally possible, but provide identification proofs demonstrating it can be achieved in certain classes of non-linear continuous causal models with additive multivariate Gaussian noise (sketched schematically after the list of contributions below)—even in the presence of unobserved confounders. This reasonably weak additive noise assumption is prevalent in the causal inference and discovery literature [Rolland et al., 2022, Saengkyongam and Silva, 2020, Kilbertus et al., 2020]. Importantly,
we show how to incorporate observed covariates, which can be high-dimensional, and hence learn
heterogeneous treatment effects for single-interventions. Our main contributions are:
1. A proof that, without restrictions on the causal model, single-intervention effects cannot be identified from observations and joint-interventions. (§3.1, 3.2)
2. Proofs that single-intervention effects can be identified from observations and joint-interventions when the causal model belongs to certain (but not all) classes of non-linear continuous structural causal models with additive, multivariate Gaussian noise. (§3.2, 3.3)
3. An algorithm that learns the parameters of the proposed causal model and disentangles single interventions from joint interventions. (§4)
4. An empirical validation of our method on both synthetic and real data.³ (§5)
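For concreteness, the following display sketches, in our own illustrative notation, the kind of model class referred to above: non-linear structural functions with additive, jointly Gaussian noise. The precise classes for which identification holds are defined in §3.

    X_j = f_j(pa(X_j)) + ε_j,    j = 1, ..., d,    with (ε_1, ..., ε_d) ~ N(0, Σ),

where each structural function f_j may be non-linear, and a non-diagonal covariance Σ allows noise terms to be correlated across variables, which is one way to represent unobserved confounding.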
2 Related Work
Disentangling multiple concurrent interventions:
[Parbhoo et al., 2021] study the question of disentangling multiple, simultaneously applied interventions from observational data. They propose a specially designed neural network for the problem and show good empirical performance on some datasets. However, they do not address the formal identification problem, nor do they address the possible presence of unobserved confounders. By contrast, our work derives the conditions under which identifiability holds. We moreover propose an algorithm that can disentangle multiple interventions even in the presence of unobserved confounders—as long as both observational and interventional data are available. Related work by [Parbhoo et al., 2020] investigated the intervention-disentanglement problem from a reinforcement learning perspective, where each intervention combination constitutes a different action that a reinforcement learning agent can take. Unlike this approach, our work explicitly focuses on modelling the interactions between interventions to learn their individual effects. Closer to our work is [Saengkyongam and Silva, 2020], who investigate identifiability of joint effects from observations and single-intervention data. They prove this is not generally possible, but provide an identification proof for non-linear causal models with additive Gaussian noise. Our work addresses a complementary question: we want to learn the effect of a single intervention from observational data and sets of interventions. A further difference is that [Saengkyongam and Silva, 2020] do not consider identification of individual-level causal effects given observed covariates. In a precursor to the work by [Saengkyongam and Silva, 2020], [Nandy et al., 2017] developed a method to estimate the effect of joint interventions from observational data when the causal structure is unknown. This approach assumed linear causal models with Gaussian noise, and only proved identifiability in this case under a sparsity constraint. Like [Saengkyongam and Silva, 2020], however, our result does not need the linearity assumption, and no sparsity constraints are required in our identification proof. Finally, others, including [Schwab et al., 2020, Egami and Imai, 2018, Lopez and Gutman, 2017, Ghassami et al., 2021], explored how to estimate causal effects of a single categorical- or continuous-valued treatment, where different intervention values can produce different outcomes. Unlike our work, they do not consider multiple concurrent interventions that can interact.
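To make the contrast with [Saengkyongam and Silva, 2020] explicit, in our own illustrative notation with two treatments X_1, X_2 and observed variables V:

    [Saengkyongam and Silva, 2020]: given P(V) and single-intervention distributions such as P(V | do(X_1 = x_1)), identify the joint-intervention distribution P(V | do(X_1 = x_1, X_2 = x_2)).
    This work: given P(V) and the joint-intervention distribution P(V | do(X_1 = x_1, X_2 = x_2)), identify single-intervention distributions such as P(V | do(X_1 = x_1)).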
Combining observations and interventions:
[Bareinboim and Pearl, 2016] have investigated non-parametric identifiability of causal effects using both observational and interventional data, in a paradigm they call “data fusion.” More general results were studied by [Lee et al., 2020], who provided necessary and sufficient graphical conditions for identifying causal effects from arbitrary combinations of observations and interventions. Recent work in [Correa et al., 2021] explored identification of counterfactual—as opposed to interventional—distributions from combinations of observational and interventional data. Finally, [Ilse et al., 2021] investigated the most efficient way to combine observational and interventional data to estimate causal effects. They demonstrated they could significantly reduce the number of interventional samples required to achieve a certain fit when
³ The code to reproduce our results is available at github.com/olivierjeunen/disentangling-neurips-2022.