
RHINO: DEEP CAUSAL TEMPORAL RELATIONSHIP
LEARNING WITH HISTORY-DEPENDENT NOISE
Wenbo Gong, Joel Jennings, Cheng Zhang & Nick Pawlowski
Microsoft Research
Cambridge, UK
{t-gongwenbo, joeljennings, cheng.zhang, nick.pawlowski}@microsoft.com
ABSTRACT
Discovering causal relationships between different variables from time series data
has been a long-standing challenge for many domains such as climate science,
finance and healthcare. Given the complexity of real-world relationships and
the nature of observations in discrete time, causal discovery methods need to
consider non-linear relations between variables, instantaneous effects and
history-dependent noise (the change of noise distribution due to past actions). However,
previous works do not offer a solution addressing all these problems together. In
this paper, we propose a novel causal relationship learning framework for
time-series data, called Rhino, which combines vector auto-regression, deep learning
and variational inference to model non-linear relationships with instantaneous
effects while allowing the noise distribution to be modulated by historical
observations. Theoretically, we prove the structural identifiability of Rhino. Our
empirical results from extensive synthetic experiments and two real-world benchmarks
demonstrate better discovery performance compared to relevant baselines, with
ablation studies revealing its robustness under model misspecification.
1 INTRODUCTION
Time series data is a collection of data points recorded at different timestamps describing a pattern
of chronological change. Identifying the causal relations between different variables and their
interactions through time (Spirtes et al., 2000; Berzuini et al., 2012; Guo et al., 2020; Peters et al.,
2017) is essential for many applications, e.g. climate science and health care. Randomized controlled
trials are the gold standard for discovering such relationships, but may be unavailable due to cost
and ethical constraints. Therefore, causal discovery with just observational data is important and
fundamental to many real-world applications (Löwe et al., 2022; Bussmann et al., 2021; Moraffah
et al., 2021; Wu et al., 2020; Runge, 2018; Tank et al., 2018; Hyvärinen et al., 2010; Pamfil et al.,
2020).
The task of temporal causal discovery can be challenging for several reasons: (1) relations between
variables can be non-linear in the real world; (2) with a slow sampling interval, everything that happens
between two samples is aggregated into the same timestamp, i.e. instantaneous effects; (3) the noise may
be non-stationary (its distribution depends on past observations), i.e. history-dependent noise.
For example, in stock markets, a decision announced by a leading company after the market closes
may have complex effects (i.e. non-linearity) on its stock price immediately after the market opens
(i.e. slow sampling interval and instantaneous effect), and its price volatility may also change
(i.e. history-dependent noise). Similarly, in education, students who recently earned good marks on
algebra tests should also score well on an upcoming algebra exam with little variation
(i.e. history-dependent noise).
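The three challenges above can be made concrete with a small simulation. The following is a minimal sketch and not the Rhino model: the function name `simulate_toy_series`, the two variables `x` and `y`, and all functional forms and coefficients are hypothetical choices made purely for illustration.

```python
import numpy as np


def simulate_toy_series(T=500, seed=0):
    """Simulate a toy 2-variable time series (illustrative assumption only).

    The series exhibits (1) a non-linear lagged effect, (2) an
    instantaneous effect, and (3) history-dependent noise.
    """
    rng = np.random.default_rng(seed)
    x = np.zeros(T)
    y = np.zeros(T)
    for t in range(1, T):
        # (3) History-dependent noise: the noise scale of x at time t
        # is a function of the previous observation x[t-1].
        scale_x = 0.1 + 0.5 * abs(np.tanh(x[t - 1]))
        # (1) Non-linear lagged effect of y[t-1] on x[t].
        x[t] = np.sin(y[t - 1]) + scale_x * rng.standard_normal()
        # (2) Instantaneous effect: y[t] depends on x[t] within the
        # same time step, plus a linear lagged term and fixed-scale noise.
        y[t] = 0.5 * np.tanh(x[t]) + 0.3 * y[t - 1] + 0.1 * rng.standard_normal()
    return x, y


x, y = simulate_toy_series(T=2000, seed=0)
# Residual of x after removing its (known) mean function.
resid = x[1:] - np.sin(y[:-1])
prev = np.abs(x[:-1])
high = resid[prev > np.median(prev)]
low = resid[prev <= np.median(prev)]
print("residual std after large |x[t-1]|:", high.std())
print("residual std after small |x[t-1]|:", low.std())
```

In this sketch the residual spread of `x` is larger after time steps where `|x[t-1]|` was large, which is the kind of dependence a model with a fixed noise distribution cannot represent.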
To the best of our knowledge, the performance of existing frameworks suffers in many real-world
scenarios because they cannot address these aspects satisfactorily. In particular, history-dependent
noise has rarely been considered in the past. A large category of the preceding works, called Granger
causality (Granger, 1969), is based on the fact that cause-effect relationships can never go against
time. Despite many recent advances (Wu et al., 2020; Shojaie & Michailidis, 2010; Siggiridou &
arXiv:2210.14706v1 [cs.LG] 26 Oct 2022