a version including information about QCD to see if we
can improve QHD performance. We also test the hypothesis
that we can improve QHD test performance by including
the pre-change time-evolution in correlation, with two goals:
to tackle the state-of-the-art limitation of uncorrelated pre-change
variables, and to see whether time-evolution in correlation
can improve QHD and QHD+QCD performance.
II. RELATED LITERATURE
In S. Aminikhanghahi and D. Cook [8], we can see there
are multiple methods and approaches for time series CPD.
The authors mention supervised methods, such as Decision Trees,
Naive Bayes, Bayesian Networks, SVMs, Nearest Neighbors, Hidden
Markov Models, and Gaussian Mixture Models. Another group
includes unsupervised methods, such as Likelihood Ratio (LR)
methods, Subspace Model methods, Probabilistic methods,
Kernel-Based methods, Graph-Based methods, and Clustering
methods. Supervised, Probabilistic, and Clustering methods do
not require extra data beyond the fitting window, whereas
LR, Subspace Model, and Kernel-Based methods require
post-change data [8]. All these methods focus on univariate
time series, but our focus is on CPD in correlation
structures.
From the CPD literature, sequential analysis solutions are
the best suited for our problem. QCD tries to identify the
change point as quickly as possible in a sequential setting.
For detection to be quick, we need to handle high-dimensional
settings with p >> n, where n is the number of timestamps and
p the number of variables. A. G. Tartakovsky [9] gives a
detailed description of sequential multi-decision procedures
for CPD and QCD. Changes in the statistical properties of
previously identical population distributions are monitored
to detect a change, subject to constraints on false alarms and
detection delays. The methodologies rely on two standard
procedures and their variants, the Page cumulative sum
(CUSUM) and the Shiryaev-Roberts procedure, both described
in detail in [9]. The problem with this setup, as noted in [4],
is that independence of the streams is assumed, whereas we
want to detect changes in the level of the streams' dependence,
and the setting is not high-dimensional. Alternatives in the
sequential analysis literature that can tackle dependence are
based on sub-optimal tests: a performance analysis of the test
is derived and then used to design the test by choosing
thresholds, and the efficiency of these tests is verified by
simulations [4]. However, these solutions are non-optimal, not
high-dimensional, and not focused on correlation structures.
To address these three aspects, we focus on the work developed
by T. Banerjee and A. Hero [5], T. Banerjee et al. [4], and
A. Hero and B. Rajaratnam [6], [7].
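To make the sequential mechanics concrete, here is a minimal sketch of Page's CUSUM stopping rule, the parametric procedure referenced above. It assumes the pre- and post-change densities are known, which is exactly the assumption our setting violates; the Gaussian mean-shift stream, the threshold value, and the helper name cusum_stopping_time are illustrative choices, not details from [9].

```python
import numpy as np

def cusum_stopping_time(stream, log_lr, threshold):
    """Page's CUSUM: raise an alarm the first time the statistic
    W_t = max(0, W_{t-1} + log f1(x_t)/f0(x_t)) crosses a threshold.
    Assumes known pre-/post-change densities via log_lr."""
    w = 0.0
    for t, x in enumerate(stream, start=1):
        w = max(0.0, w + log_lr(x))
        if w >= threshold:
            return t  # first alarm time (declared change)
    return None  # no alarm raised on this stream

# Illustrative stream: Gaussian mean shift from 0 to 1 at t = 200.
rng = np.random.default_rng(0)
stream = np.concatenate([rng.normal(0, 1, 200), rng.normal(1, 1, 200)])
log_lr = lambda x: x - 0.5  # log N(x; 1, 1) / N(x; 0, 1)
print(cusum_stopping_time(stream, log_lr, threshold=10.0))
```

The Shiryaev-Roberts procedure replaces the max(0, ·) recursion with R_t = (1 + R_{t-1}) · L(x_t), where L is the likelihood ratio, but it shares the same known-density requirement, which is why neither applies directly to our nonparametric, dependence-focused problem.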
As a preamble, V. Veeravalli and T. Banerjee [10] focused
first on CPD in a parametric setting, on Bayesian CPD, trying
to minimize the Average Detection Delay (ADD) subject to a
constraint on the Probability of False Alarm (PFA). They
also focused on a second solution, minimax CPD, in which
Lorden's test from G. Lorden [11] is applied. Lorden
developed the first minimax theory for delays in CPD, "in
which he proposed a measure of detection delay obtained
by taking the supremum (over all possible change points)
of a worst-case delay over all possible realizations of the
observations, conditioned on the change point" [10]. Lorden's
test is important for understanding their later work.
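Written out, the delay criterion described in that quote is commonly formalized as follows (our transcription of the standard minimax formulation, with stopping time τ and change point ν, not a formula reproduced verbatim from [10] or [11]):

```latex
\operatorname{WADD}(\tau) \;=\; \sup_{\nu \ge 1}\;
\operatorname{ess\,sup}\;
\mathbb{E}_{\nu}\!\left[(\tau - \nu + 1)^{+} \,\middle|\, X_1,\dots,X_{\nu-1}\right],
\qquad \text{subject to } \mathbb{E}_{\infty}[\tau] \ge \gamma .
```

The supremum over ν and the essential supremum over the pre-change observations give the worst-case delay, while the constraint lower-bounds the mean time between false alarms.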
In [6], a discovery is a correlation above a given threshold,
and the authors derive an asymptotic expression for the mean
number of discoveries as a function of the number of samples.
It is shown that the mean number of discoveries depends on the
population covariance matrix through the Bhattacharyya measure
of the average pairwise dependency of the p multivariate
U-scores defined on the (n−2)-dimensional hypersphere [6].
Under weak dependency assumptions, the number of discoveries
is asymptotically represented by a Poisson distribution. For
auto-correlation and cross-correlation discoveries, this
Poisson distribution is characterized by the number of positive
vertex degrees in the associated sample correlation graph [6].
In [7], the authors focus on hub discovery in partial
correlation graphs, with an extension to variables with a
specific degree of connectivity. A hub is defined broadly as
any variable that is correlated with at least δ other variables
with a correlation magnitude exceeding ρ. Their setup is the
first high-dimensional one of its kind. They show that the
count N_{δ,ρ_p} of the number of groups of δ mutually
coincident edges in the correlation graph (and its
partial-correlation counterpart) with correlation threshold
ρ_p converges to a Poisson variable:
P(N_{δ,ρ_p} > 0) → 1 − exp(−Λ/ϕ(δ))    (1)
This result had implications for the subsequent work developed in [4], [5].
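As a concrete illustration of the counting statistic behind Eq. (1), the sketch below thresholds a sample correlation matrix and counts vertices with degree at least δ. This is a simplification of [6], [7], whose theory is built on U-scores and also covers partial correlations; the function name hub_count, the toy dimensions, and the parameter values are our assumptions.

```python
import numpy as np

def hub_count(X, rho, delta):
    """Count vertices of the sample correlation graph with degree
    >= delta, where edges join variable pairs whose absolute sample
    correlation exceeds rho. Simplified analogue of N_{delta, rho}."""
    R = np.corrcoef(X, rowvar=False)   # p x p sample correlations
    np.fill_diagonal(R, 0.0)           # ignore self-correlation
    adjacency = np.abs(R) > rho        # edges of the correlation graph
    degrees = adjacency.sum(axis=1)
    return int(np.sum(degrees >= delta))

# High-dimensional toy data with p >> n (n = 20 samples, p = 500 variables).
rng = np.random.default_rng(1)
X = rng.normal(size=(20, 500))
print(hub_count(X, rho=0.8, delta=2))
```

Under the weak-dependence regime of [6], counts of this kind are asymptotically Poisson, which is what yields the closed-form limit in Eq. (1).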
In [5], the authors introduce a nonparametric QCD test for
large-scale random matrices, based on a global summary
statistic derived from asymptotic properties of random matrix
theory (RMT). It is assumed that the pre- and post-change
distributions of the i.i.d. random matrix rows are unknown or
belong to an elliptically contoured family [5]. If the pre-
and post-change densities f_X^0 and f_X^1 are known, and the
mean µ_m is constant before and after the change, algorithms
such as CUSUM or Shiryaev-Roberts (SR) can be used
efficiently, as both have optimality properties with respect
to Lorden's formulation. In this case, the problem is a
parametric CPD problem, solved asymptotically optimally by
Generalized Likelihood Ratio (GLR) tests. In [5], however,
the pre- and post-change densities are unknown, and the
authors present a nonparametric solution that is
asymptotically optimal for minimax CPD in the random matrix
setup using large-scale RMT. The framework in [5] is suited
to high-dimensional settings. In summary, we focus on
nonparametric methods with RMT, such as [4], [5], because:
• Pre- and post-change densities f_X^0 and f_X^1 are NOT
known, and µ_m is NOT constant before and after the
change.
• We need to handle high-dimensional settings with p >> n
for QCD and QHD in correlation structures.
• Other CPD/QHD methods are also rejected because they
cannot tackle dependence.