
we factorize unknown distribution shifts into transition distribution changes in stationary processes,
time-varying latent causal relations, and global changes in observation by constructing partitioned
latent subspaces, and propose provable conditions under which nonparametric latent causal processes
can be identified from their nonlinear invertible mixtures. We demonstrate through a number of
real-world datasets, including video and motion capture data, that time-delayed latent causal influences are
reliably identified from observed variables under stationary environments and unknown distribution
shifts. Through experiments, we show that our approach considerably outperforms existing baselines
that do not correctly leverage this modular representation of changes.
2 Related Work
Causal Discovery from Time Series
Inferring the causal structure from time-series data is critical to many fields including machine
learning [1], econometrics [2], and neuroscience [3]. Most existing work focuses on estimating the
temporal causal relations between observed variables. For this task, constraint-based methods [18]
apply conditional independence tests to recover the causal structures, while score-based methods
[19, 20] define score functions to guide a search process. Furthermore, [21, 22] propose to combine
conditional independence tests with score-based methods. Granger causality [23] and its nonlinear
variations [24, 25] are also widely used.
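As a rough illustration of the linear Granger idea mentioned above, the sketch below compares the residual variance of an autoregression of one series with and without lags of another series. It is a minimal regression comparison, not the formal F-test and not the procedure of any cited work; the function names (`lagged_design`, `residual_variance`, `granger_gain`) and the toy data are assumptions made for this example.

```python
import numpy as np

def lagged_design(series_list, lag):
    """Stack lags 1..lag of each series as columns, plus an intercept column."""
    T = len(series_list[0])
    cols = [s[lag - 1 - k : T - 1 - k] for s in series_list for k in range(lag)]
    return np.column_stack(cols + [np.ones(T - lag)])

def residual_variance(target, predictors, lag):
    """Mean squared residual of a least-squares fit of target[t] on lagged predictors."""
    X = lagged_design(predictors, lag)
    y = target[lag:]
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return np.mean((y - X @ beta) ** 2)

def granger_gain(cause, effect, lag=1):
    """Log ratio of restricted to full residual variance.

    Values clearly above zero suggest that lags of `cause` help predict `effect`
    beyond the own past of `effect`.
    """
    restricted = residual_variance(effect, [effect], lag)
    full = residual_variance(effect, [effect, cause], lag)
    return np.log(restricted / full)

# Toy data (hypothetical): x drives y with a one-step delay, but not vice versa.
rng = np.random.default_rng(0)
T = 2000
x = np.zeros(T)
y = np.zeros(T)
for t in range(1, T):
    x[t] = 0.5 * x[t - 1] + rng.normal()
    y[t] = 0.5 * y[t - 1] + 0.8 * x[t - 1] + rng.normal()

gain_x_to_y = granger_gain(x, y)  # expected to be clearly positive
gain_y_to_x = granger_gain(y, x)  # expected to be close to zero
```

Note that this comparison operates on observed variables only, which is the setting of the methods in this paragraph rather than the latent setting studied in this paper.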
Nonlinear ICA for Time Series
Temporal structure and nonstationarities were recently used to achieve identifiability in nonlinear
ICA. Time-contrastive learning (TCL [6]) used the independent sources assumption and leveraged
sufficient variability in the variance terms of different data segments. Permutation-based
contrastive learning (PCL [7]) proposed a learning framework that discriminates between true
independent sources and permuted ones, and is identifiable under the uniformly dependent assumption.
HM-NLICA [11] combined nonlinear ICA with a Hidden Markov Model (HMM) to automatically model
nonstationarity without manual data segmentation. i-VAE [9] introduced VAEs to approximate the true
joint distribution over observed and auxiliary nonstationary regimes; it assumes that the conditional
distribution belongs to the exponential family to achieve identifiability of the latent space. The
most recent literature on nonlinear ICA for time series includes LEAP [14] and (i-)CITRIS [26, 27].
LEAP proposed a nonparametric condition leveraging the nonstationary noise terms. However, in LEAP
all latent processes change across contexts, the distribution changes must be modeled by
nonstationary noise, and the stationary nonparametric components are not exploited for
identifiability. Alternatively, CITRIS proposed to use intervention target information to identify
scalar and multidimensional latent causal factors. This approach does not suffer from functional or
distributional form constraints, but needs access to active interventions.
3 Problem Formulation
3.1 Time Series Generative Model
Stationary Model
As a fundamental case, we first present a regular, stationary time-series generative process where
the observations $x_t$ come from a nonlinear (but invertible) mixing function $g$ that maps the
time-delayed causally-related latent variables $z_t$ to $x_t$. The latent variables or processes
$z_t$ have stationary, nonparametric time-delayed causal relations. Let $\tau$ be the time lag:
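$$
\underbrace{x_t = g(z_t)}_{\text{Nonlinear mixing}}, \quad
\underbrace{z_{it} = f_i\left(\{z_{j,t-\tau} \mid z_{j,t-\tau} \in \mathrm{Pa}(z_{it})\}, \epsilon_{it}\right)}_{\text{Stationary nonparametric transition}}
\quad \text{with} \quad
\underbrace{\epsilon_{it} \sim p_{\epsilon_i}}_{\text{Stationary noise}}.
$$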
Note that with nonparametric causal transitions, the noise term $\epsilon_{it} \sim p_{\epsilon_i}$
(where $p_{\epsilon_i}$ denotes the distribution of $\epsilon_{it}$) and the time-delayed parents
$\mathrm{Pa}(z_{it})$ of $z_{it}$ (i.e., the set of latent factors that directly cause $z_{it}$)
interact and are transformed in an arbitrarily nonlinear way to generate $z_{it}$. Under
stationarity assumptions, the mixing function $g$, the transition functions $f_i$, and the noise
distributions $p_{\epsilon_i}$ are invariant. Finally, we assume that the noise terms are mutually
independent (i.e., spatially and temporally independent), which implies that instantaneous causal
influence between latent causal processes is not allowed by the formulation. The stationary
time-series model in the fundamental case is used to establish the identifiability results under
fixed causal dynamics in Section 4.1.
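To make the stationary generative process concrete, the following sketch simulates one possible instance of it. All specific choices are assumptions made for illustration and are not taken from the paper: the random parent mask `A`, the heteroscedastic transition used for $f_i$ (noise enters multiplicatively rather than additively), Laplace noise, and a mixing function $g$ built from orthogonal matrices and a leaky ReLU so that it is invertible by construction.

```python
import numpy as np

def leaky_relu(x, slope=0.2):
    return np.where(x > 0, x, slope * x)

def leaky_relu_inv(y, slope=0.2):
    # Elementwise inverse of leaky_relu for slope > 0.
    return np.where(y > 0, y, y / slope)

def random_orthogonal(n, rng):
    # Orthogonal matrix from a QR decomposition; its inverse is its transpose.
    q, _ = np.linalg.qr(rng.normal(size=(n, n)))
    return q

def simulate_stationary(T=500, n=3, lag=1, seed=0):
    """Simulate latents z (T x n) and observations x = g(z) (T x n)."""
    rng = np.random.default_rng(seed)

    # Hypothetical time-delayed parent structure: A[i, j] != 0 means that
    # z_{j, t-lag} is a parent of z_{i, t}. Each process keeps its own past.
    mask = rng.random((n, n)) < 0.5
    np.fill_diagonal(mask, True)
    A = mask * rng.uniform(0.5, 1.0, size=(n, n))

    # Invertible nonlinear mixing g and its inverse (assumed form).
    W1, W2 = random_orthogonal(n, rng), random_orthogonal(n, rng)

    def g(z):
        return leaky_relu(z @ W1) @ W2

    def g_inv(x):
        return leaky_relu_inv(x @ W2.T) @ W1.T

    z = np.zeros((T, n))
    z[:lag] = rng.normal(size=(lag, n))
    for t in range(lag, T):
        # Noise is drawn independently across components i and time steps t,
        # so there is no instantaneous influence between latent processes.
        eps = rng.laplace(size=n)
        h = np.tanh(A @ z[t - lag])
        # The same transition is reused at every t (stationarity); parents and
        # noise interact non-additively through the scale term exp(0.5 * h).
        z[t] = h + 0.3 * eps * np.exp(0.5 * h)

    return z, g(z), g_inv

z, x, g_inv = simulate_stationary(T=200, n=4, lag=2, seed=1)
```

In this sketch, `g_inv(x)` recovers `z` only because the mixing is known; in the problem studied here, $g$ is unknown and must be estimated from the observations $x_t$ alone.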
Nonstationary Model
We further consider two violations of the stationarity assumptions in the fundamental case, which
lead to two nonstationary time-series models. Let $u$ denote the domain