Granger Causality for Predictability in Dynamic Mode Decomposition
G. Revati, Syed Shadab, K. Sonam, S. R. Wagh, and N. M. Singh
Abstract— The dynamic mode decomposition (DMD) tech-
nique extracts the dominant modes characterizing the innate
dynamical behavior of the system within the measurement data.
For the dominant modes to be identified correctly, the DMD
algorithm requires input measurement data sequences of adequate
quality. To validate the usability of a dataset for the DMD
algorithm, the paper proposes two conditions: persistence of
excitation (PE) and the Granger causality test (GCT). Virtual
data sequences are constructed with the Hankel matrix
representation so that the dimension of the subspace spanning
the essential system modes increases with the addition of
new state variables. The PE condition provides a lower bound
on the trajectory length, and the GCT provides the order
of the model. Satisfying the PE condition enables estimation
of an approximate linear model, but predictability with the
identified model is assured only when the GCT establishes
temporal causation among the data. The proposed methodology
is validated by applying it to coherency identification
(CI) in a multi-machine power system (MMPS), an essential
phenomenon in transient stability analysis. The significance of
the PE condition and the GCT is demonstrated through various
case studies implemented on a 22-bus, six-generator system.
Index Terms— Coherency Identification, Dynamic Mode De-
composition (DMD), Granger causality, Hankel, Persistence of
Excitation (PE).
I. INTRODUCTION
With the growing emphasis on data-driven modeling,
understanding the interactions and connections among time
series drawn from observational data has become a field of
active interest. Causality lies at the intersection of
philosophy and the sciences [1]: generalizations and theories
are derived from specific observations by analyzing cause and
effect in the observational data. A basic approach to
understanding the information flow between two time series is
to compute their cross-correlation [2] and look for a peak in
the correlation at some non-zero lag. Causal inferences drawn
from correlation can be misleading, however, since correlation
reveals only whether the two variables are statistically
linked. The relationship between two variables can be direct,
or indirect owing to a confounding effect [3], i.e., an
additional unobserved variable that is correlated with both
variables under study. Furthermore, correlation, being a
symmetric measure, provides no information about the direction
of causality.
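The lagged cross-correlation described above can be illustrated with a minimal sketch. It is not taken from the paper: the helper `lagged_correlation`, the synthetic signals, and the delay of 3 samples are all assumptions chosen for illustration.

```python
import numpy as np

def lagged_correlation(x, y, max_lag):
    """Pearson correlation between x[t] and y[t + lag] for each lag
    in [-max_lag, max_lag] (hypothetical helper, for illustration)."""
    lags = np.arange(-max_lag, max_lag + 1)
    corr = np.empty(lags.size)
    for i, lag in enumerate(lags):
        if lag >= 0:
            a, b = x[:x.size - lag], y[lag:]
        else:
            a, b = x[-lag:], y[:y.size + lag]
        corr[i] = np.corrcoef(a, b)[0, 1]
    return lags, corr

# Synthetic data (assumed): y is x delayed by 3 samples plus small noise.
rng = np.random.default_rng(0)
x = rng.standard_normal(500)
y = np.roll(x, 3) + 0.1 * rng.standard_normal(500)

lags, corr = lagged_correlation(x, y, max_lag=10)
peak_lag = int(lags[np.argmax(np.abs(corr))])  # lag at which |correlation| peaks
```

The peak locates the delay between the two series, but the correlation value itself would be the same if the roles of the two series were exchanged and the sign of the lag flipped, and it cannot rule out a hidden common driver.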
G. Revati, Syed Shadab, S. R. Wagh, and N. M. Singh are with Control
and Decision Research Centre (CDRC), EED, Veermata Jijabai Tech-
nological Institute, Mumbai 400019, India cdrc@ee.vjti.ac.in.
K. Sonam is with the Computer Science and Engineering Department,
University of South Carolina, USA
In accordance with the principle of time asymmetry of
causation, classical physics assumes that causes precede their
effects. Dynamical model identification is another approach,
one that accounts for the direction of causation. It rests on
the premise that laws govern the system and make the same
state evolve in a similar manner [4], i.e., the same causes
produce similar effects, in keeping with physical determinism.
The laws defining the system dynamics are identified by
regression on the observational data, which relies on
evaluating correlations. For detecting and quantifying
temporal causality among time series, a powerful statistical
test known as Granger causality (GC) was first proposed in
[5]. GC has been applied widely in neuroscience [6], economics
[7], and climate modelling [8]. The fundamental notion behind
GC is that one variable is said to cause another if including
its past information, alongside the past information of the
predicted variable itself, improves the prediction of that
variable.
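The GC notion of improved prediction can be sketched as a comparison of two least-squares autoregressions, one restricted to the past of the predicted variable and one that also includes the past of the candidate cause. This is a minimal illustration under stated assumptions, not the test procedure used in the paper: the function names, the model order of 2, and the synthetic data are hypothetical, and only the F statistic is computed (no p-value).

```python
import numpy as np

def lagged_design(series, order):
    """Columns are series[t-1], ..., series[t-order] for t = order, ..., T-1."""
    T = series.size
    return np.column_stack([series[order - k:T - k] for k in range(1, order + 1)])

def granger_f_statistic(x, y, order):
    """F statistic for the null hypothesis 'x does not Granger-cause y'."""
    target = y[order:]
    ones = np.ones((target.size, 1))
    restricted = np.hstack([ones, lagged_design(y, order)])  # past of y only
    full = np.hstack([restricted, lagged_design(x, order)])  # past of y and x
    rss = []
    for design in (restricted, full):
        coef, *_ = np.linalg.lstsq(design, target, rcond=None)
        resid = target - design @ coef
        rss.append(float(resid @ resid))
    rss_r, rss_f = rss
    dof = target.size - full.shape[1]
    return ((rss_r - rss_f) / order) / (rss_f / dof)

# Synthetic data (assumed): x drives y with a one-step delay, not vice versa.
rng = np.random.default_rng(1)
T = 400
x = rng.standard_normal(T)
y = np.zeros(T)
for t in range(1, T):
    y[t] = 0.5 * y[t - 1] + 0.8 * x[t - 1] + 0.1 * rng.standard_normal()

f_xy = granger_f_statistic(x, y, order=2)  # large: past x helps predict y
f_yx = granger_f_statistic(y, x, order=2)  # small: past y does not help predict x
```

Unlike the correlation coefficient, the two statistics are not symmetric, which is what allows a direction to be assigned to the dependence.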
Conventionally, control applications have relied extensively
on system identification methods that fit the data to a model
parameterized a priori [9]. The growing complexity and sheer
volume of available system data have challenged these standard
strategies for learning a dynamical system. As a result, a
paradigm shift has occurred towards identifying a dynamical
system directly from its raw measurement data. Data-driven
modeling approaches generally depend on finding an accurate
combination of known trajectories that yields a reliable
prediction, which is usually an ill-conditioned problem [10].
For such problems, the Moore-Penrose pseudoinverse [11], which
solves the least-norm problem, is preferred for its
computational simplicity. One such data-driven subspace
prediction strategy is dynamic mode decomposition (DMD) [12],
which decomposes high-dimensional data into spatiotemporally
coherent modes.
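The role of the pseudoinverse in DMD can be sketched as follows. This is a minimal illustration of the unreduced least-squares fit between consecutive snapshots, under stated assumptions: the 2-state matrix `A_true`, the initial condition, and the trajectory length are hypothetical, and practical DMD implementations usually add a truncated SVD step that is omitted here.

```python
import numpy as np

# Hypothetical 2-state linear system, used only to generate snapshot data.
A_true = np.array([[0.9, -0.2],
                   [0.2,  0.9]])
snapshots = np.empty((2, 50))
snapshots[:, 0] = [1.0, 0.0]
for k in range(1, 50):
    snapshots[:, k] = A_true @ snapshots[:, k - 1]

# Pair each snapshot with its successor: X_next ≈ A X.
X, X_next = snapshots[:, :-1], snapshots[:, 1:]

# Least-squares estimate of the linear operator via the Moore-Penrose pseudoinverse.
A_dmd = X_next @ np.linalg.pinv(X)

# Eigenvalues give the temporal behavior; eigenvectors are the DMD modes.
eigvals, modes = np.linalg.eig(A_dmd)
```

In this noise-free setting the snapshot matrix has full row rank, so the fitted operator coincides with the generating matrix up to numerical precision; with noisy or rank-deficient data the fit is only approximate.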
DMD is a dimensionality reduction technique [13], pioneered
in the fluid dynamics community by Peter Schmid, for
identifying from data a linear approximation comprising the
dominant modes that describe the dynamical behavior of the
system [14]. How well the dominant modes capturing the
dynamics are identified depends on the quality of the
measurement data used. To capture the modes of the system, the
number of measurement sequences used in DMD should be greater
than or equal to the number of underlying system modes. Hence
the dimension of the subspace spanning the essential dynamical
modes is increased with the Hankel matrix [15] introducing the
arXiv:2210.12737v1 [eess.SY] 23 Oct 2022