
Measurement error severely impairs causal discovery [28, 24, 36]. The measuring process can be
viewed as directed edges from the underlying variables of interest (unobservable) to the measured values
(observable), and the d-separation patterns that hold on the underlying variables typically do not hold on the measured
ones. Consider the causal effects from factory emissions to air quality and then to residents' lung
health, as shown in Figure 1, while we only have the corresponding measured quantities: chimney
statistics, PM$_{2.5}$, and hospital reported cases. Though factory emissions and lung health are independent given air quality,
chimney statistics and hospital reported cases are however dependent given PM$_{2.5}$. If measurement error is severe, chimney statistics and hospital reported cases even tend to
be marginally independent [53, 36], which makes PM$_{2.5}$ look like a collider (common child). One
might thus incorrectly infer that lung cancer causes air pollution. In fact, such measurement error is
always a serious threat in environmental epidemiologic studies [30, 10].
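The failure of conditional independence under measurement error can be verified numerically. The following sketch (illustrative only; the chain weights, uniform noise, and sample size are assumptions, and partial correlation is used as the independence measure, which suffices in this linear setting) simulates a latent chain and shows that conditioning on the noisy measurement no longer renders the endpoints independent:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

def partial_corr(a, b, c):
    """Correlation of a and b after linearly regressing out c."""
    ra = a - np.polyval(np.polyfit(c, a, 1), c)
    rb = b - np.polyval(np.polyfit(c, b, 1), c)
    return np.corrcoef(ra, rb)[0, 1]

# Latent chain  ~X1 -> ~X2 -> ~X3  with non-Gaussian (uniform) noise
t1 = rng.uniform(-1, 1, n)
t2 = 0.8 * t1 + rng.uniform(-1, 1, n)
t3 = 0.8 * t2 + rng.uniform(-1, 1, n)

# Observed versions contaminated by additive measurement error (Eq. 1 form)
e = rng.uniform(-1, 1, (3, n))
x1, x2, x3 = t1 + e[0], t2 + e[1], t3 + e[2]

print(partial_corr(t1, t3, t2))  # ~0: independence given the latent middle node holds
print(partial_corr(x1, x3, x2))  # clearly nonzero: conditioning on the noisy measurement fails
```

Regressing out the noisy middle measurement leaves a residual dependence because the measurement error in $X_2$ prevents it from fully blocking the path, exactly the bias the example above describes.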
Denote by $\tilde{X}=\{\tilde{X}_i\}_{i=1}^n$ the latent measurement-error-free variables and $X=\{X_i\}_{i=1}^n$ the observed
ones. While there are different models for the measuring process [12, 6, 49], in this paper we consider
the random measurement error model [36], where the observed variables are generated from the
latent measurement-error-free variables $\tilde{X}_i$ with additive random measurement errors $E=\{E_i\}_{i=1}^n$:
$$X_i = \tilde{X}_i + E_i. \qquad (1)$$
Measurement errors $E$ are assumed to be mutually independent and independent of $\tilde{X}$. We
assume causal sufficiency relative to $\tilde{X}$ (i.e., there is no confounder of $\tilde{X}$ that does not have a respective
measurement), and focus on the case where $\tilde{X}$ is generated by a linear, non-Gaussian, acyclic
model (LiNGAM [38], see §3.1). Note that here, w.l.o.g., the linear weights of $\{\tilde{X}_i \rightarrow X_i\}_{i=1}^n$ are
assumed to be one (since we do not care about scaling). Generally, if observations are measured by
$X_i = c_i\tilde{X}_i + E_i$ with weights $\{c_i\}_{i=1}^n$ not necessarily being one, all results in this paper still hold.
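The generating process just described can be sketched as follows (a minimal simulation, assuming a three-variable chain, Laplace noise, and specific weights purely for illustration): a LiNGAM over the latents, followed by the additive contamination of Eq. (1).

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10_000

# LiNGAM over latents: ~X = B ~X + N with non-Gaussian exogenous noise N.
# Here B encodes the chain ~X1 -> ~X2 -> ~X3; strictly lower-triangular => acyclic.
B = np.array([[0.0, 0.0, 0.0],
              [0.7, 0.0, 0.0],
              [0.0, 0.5, 0.0]])
N = rng.laplace(size=(3, n))                   # non-Gaussian noise
X_tilde = np.linalg.solve(np.eye(3) - B, N)    # ~X = (I - B)^{-1} N

# Random measurement error model (Eq. 1): X_i = ~X_i + E_i,
# with E mutually independent and independent of ~X.
E = rng.laplace(scale=0.5, size=(3, n))
X = X_tilde + E
```

Only `X` is available to the algorithm; `X_tilde`, `B`, and `E` play the role of the unobserved quantities whose structure we wish to recover.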
The objective of causal discovery under measurement error is to recover the causal structure among the
latent variables $\tilde{X}$, denoted as $\tilde{G}$, a directed acyclic graph (DAG), from the contaminated observations
$X$. As illustrated by Figure 1, causal discovery methods that utilize (conditional) independence
produce biased estimates (see Proposition 1 for details). SEM-based methods also typically fail to
find correct directions, since the SEM for $\tilde{X}$ usually does not hold on $X$. The unobserved $\tilde{X}$ are actually
confounders of $X$, and there exist approaches to deal with confounders, such as Fast Causal Inference
(FCI [44]). However, they focus on the structure among observed variables instead of the unobserved
ones, which is what we aim to recover here. With the interest in the latter, another line of research,
called causal discovery with latent variables, has been developed, to which this paper also belongs.
However, existing methods [41, 42, 47, 46, 1, 23, 50, 35] cannot be adopted either, since they
typically require at least two measurements (indicators) for each latent variable, while we only have
one for each here (which makes ours a more difficult task). Specifically on the measurement error problem,
[16] proposes anchored causal inference in the binary setting. In the linear Gaussian setting, [53]
presents identifiability conditions via factor analysis. A main difficulty here is the unknown variances
of the measurement errors $E$; otherwise the covariance matrix of $\tilde{X}$ could be obtained and readily used.
To this end, [2] provides an upper bound on $E$ and [34] develops a consistent partial correlation
estimator. In the linear non-Gaussian setting (i.e., the setting of this paper), [54] shows that the ordered
group decomposition of $\tilde{G}$, which contains major causal information, is identifiable. However, the
corresponding method relies on over-complete independent component analysis (OICA [19]), which
is notorious for suffering from local optima and high computational complexity [39, 18]. Hence, the
identifiability results in [54], despite their theoretical correctness, are far from practically achievable.
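To see why OICA enters the picture: combining the LiNGAM over $\tilde{X}$ with Eq. (1) gives $X = (I-B)^{-1}N + E$, a linear mixture of $2n$ independent non-Gaussian sources observed through only $n$ variables, i.e., an over-complete ICA model. A tiny sketch of the mixing matrix (the chain `B` is an illustrative assumption):

```python
import numpy as np

# X = (I - B)^{-1} N + E = A s,  with sources s = [N; E] stacked.
# n observed variables, 2n independent sources -> the mixing is overcomplete,
# which is why a method built on this representation must resort to OICA.
n_vars = 3
B = np.array([[0.0, 0.0, 0.0],
              [0.7, 0.0, 0.0],
              [0.0, 0.5, 0.0]])
A = np.hstack([np.linalg.inv(np.eye(n_vars) - B), np.eye(n_vars)])
print(A.shape)  # n rows, 2n columns
```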
The main contributions of this paper are as follows:
1) We define the Transformed Independent Noise (TIN) condition, which finds and checks for independence between a specific linear transformation
(combination) of some variables and others. The existing Independent Noise (IN [40]) and Generalized Independent Noise (GIN [50]) conditions are special cases of TIN.
2) We provide graphical
criteria of TIN, which may further improve the identifiability of causal discovery with latent variables.
3) We exploit TIN on a specific task, causal discovery under measurement error and LiNGAM, and
identify the ordered group decomposition. This identifiability result once required the computationally
and statistically ineffective OICA to achieve, while we achieve it merely by conducting independence
tests. Evaluations on both synthetic and real-world data demonstrate the effectiveness of our method.
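To convey the flavor of such independence conditions (without anticipating the formal definition of TIN in later sections), the following sketch shows how a suitable linear transformation of observed variables can cancel the latent components and expose an independence relation; the chain weights are assumed known here purely for illustration, and plain correlation is used as a crude proxy for a full independence test:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100_000

# Latent chain ~X1 -> ~X2 -> ~X3 (weights 0.7, 0.5) with Laplace noise,
# observed through Eq. (1) with additive measurement error.
n1, n2, n3 = rng.laplace(size=(3, n))
t1 = n1
t2 = 0.7 * t1 + n2
t3 = 0.5 * t2 + n3
e1, e2, e3 = rng.laplace(scale=0.5, size=(3, n))
x1, x2, x3 = t1 + e1, t2 + e2, t3 + e3

# The transformation x3 - 0.5*x2 = n3 + e3 - 0.5*e2 involves only noise
# terms downstream of ~X2, so it is independent of x1, even though x3
# itself is dependent on x1.  Finding such transformations from data is
# the kind of check the TIN condition formalizes.
z = x3 - 0.5 * x2
print(np.corrcoef(z, x1)[0, 1])   # near 0
print(np.corrcoef(x3, x1)[0, 1])  # clearly nonzero
```

Crucially, this relation is detectable from the contaminated observations alone via independence tests, with no over-complete ICA estimation involved.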
2 Motivation: Independence Condition and Structural Information
The example in Figure 1 illustrates how the (conditional) (in)dependence relations differ between the
observed $X$ and the latent $\tilde{X}$, and thus lead to biased discovery results. To put it generally, we have,