Data-Adaptive Symmetric CUSUM for Sequential Change Detection

2025-08-18 3 0 2.96MB 24 页 10玖币
侵权投诉
Data-Adaptive Symmetric CUSUM for
Sequential Change Detection
Nauman Ahad, Mark A. Davenport, and Yao Xie
November 1, 2022
Abstract
Detecting change points sequentially in a streaming setting, especially when both the mean and the
variance of the signal can change, is often a challenging task. A key difficulty in this context often
involves setting an appropriate detection threshold, which for many standard change statistics may need
to be tuned depending on the pre-change and post-change distributions. This presents a challenge in a
sequential change detection setting when a signal switches between multiple distributions. For example,
consider a signal where change points are indicated by increases/decreases in the mean and variance of
the signal. In this context, we would like to be able to compare our change statistic to a fixed threshold
that will be symmetric to either increases or decreases in the mean and variance. Unfortunately, change
point detection schemes that use the log-likelihood ratio, such as CUSUM and GLR, are quick to react to
changes but are not symmetric when both the mean and the variance of the signal change. This makes it
difficult to set a single threshold to detect multiple change points sequentially in a streaming setting. We
propose a modified version of CUSUM that we call Data-Adaptive Symmetric CUSUM (DAS-CUSUM).
The DAS-CUSUM change point detection procedure is symmetric for changes between distributions,
making it suitable to set a single threshold to detect multiple change points sequentially in a streaming
setting. We provide results that relate to the expected detection delay and average run length for our
proposed procedure. Extensive simulations are used to validate these results. Experiments on real-world
data further show the utility of using DAS-CUSUM over both CUSUM and GLR.
1 Introduction
For a sequence of observations x1, . . . , xt, the goal of change point detection is to detect whether there exists
an instance ncsuch that x1, . . . , xnc1are generated according to a different distribution than xnc, . . . , xt,
and if so, estimating nc. This is typically accomplished by computing a simple change statistic based on
the log-likelihood ratio, which can be compared to a threshold to detect changes or optimized to estimate
nc. Sequential change point detection involves sequentially detecting multiple changes in streaming data.
Many real-world world applications require sequential detection of change points within streaming signals.
Healthcare, communication, and finance are just a few areas where sequential change detection is widely
used [27,16,1]. An extended discussion of applications of change point detection can be found in [3].
Despite being devised more than half a century ago, the CUSUM statistic is still one of the most popular
methods for detecting change points [20]. This is chiefly due to two reasons. First, it has a simple recursive
implementation which makes it computationally efficient to apply. Second, it has been shown to be optimal
in minimizing the detection delay for a given false alarm rate [18]. However, computing the CUSUM statistic
requires complete knowledge of both the pre-change and post-change distributions. This is not feasible in
many real-world scenarios where the post-change distribution can be unknown. In such settings, a more
N.Ahad and M.A.Davenport are with the School of Electrical and Computer Engineering, Georgia Tech, Atlanta, GA, 30302,
USA. Y.Xie is with the School of Industrial and Systems Engineering, Georgia Tech, Atlanta, GA, 30302, USA. The work of N.
Ahad and M. Davenport was supported, in part, by NSF grants CCF-2107455 and DMS-2134037, NIH grant R01AG056255,
and gifts from the Alfred P. Sloan Foundation and Coulter Foundation. The work of Y. Xie was supported, in part, by an NSF
CAREER grant CCF-1650913, and NSF grants DMS-2134037, CMMI-2015787, DMS-1938106, and DMS-1830210.
E-mails: nahad3@gatech.edu, mdav@gatech.edu, yao.xie@isye.gatech.edu
1
arXiv:2210.17353v1 [stat.ME] 31 Oct 2022
common approach is to use the GLR statistic, which involves estimating the post-change distribution for all
possible change points [21]. Both the CUSUM and GLR statistics leverage the log-likelihood ratio for the
known/estimated pre- and post-change distributions.
Most work on change point detection has focused on identifying a single change point in the quickest
possible manner. Though this has been useful for some applications, especially those that monitor a process
for abnormal behavior such as machine fault detection and network intrusion detection, many modern appli-
cations require the detection of multiple change points sequentially in streaming data. In sequential change
point detection, the detection procedure must be restarted and continued after each change point is detected,
resulting in multiple change points being detected. Examples of such settings include segmentation of signals
for activity recognition where change points are used to identify transitions from one activity to another in a
streaming setting [4]. In such settings, the pre-change and post-change distributions themselves change after
each change point and cannot be assumed to be known a priori. This presents a significant challenge to most
standard change detection approaches because the detection threshold must be set without any knowledge
of these distributions (with the threshold typically being fixed in advance and held constant throughout the
procedure).
The machine learning community has been addressing this problem of identifying multiple change points
in data streams [17]. Such works show that procedures employed to detect change points should be symmetric.
This means the magnitude of a change from a distribution θ0to a distribution θ1should be the same for a
change from θ1to θ0. Using a procedure that has a similar power in detecting such changes makes it easy to
select a threshold for detecting multiple changes sequentially. Statistics such as the GLR and CUSUM are
not symmetric when distribution changes involve a change in variance. This makes it difficult to use these
in detecting multiple changes.
In this work, we present an adaptive symmetric version of CUSUM that we call Data-Adaptive Symmetric
CUSUM (DAS-CUSUM). DAS-CUSUM uses a window to estimate the post-change distribution and employs
a symmetric change statistic to make it easier to select a fixed threshold to detect multiple change points in
streaming data. We provide theoretical results for our proposed method that relate the expected detection
delay (EDD) (average delay in detecting true changes) to the average run length (ARL) (average time until
a false alarm occurs).
The rest of the paper is organized as follows. After reviewing related literature in Section 2, we formalize
the change detection problem in Section 3and further motivate the need to have a symmetric change statistic
for detecting multiple changes. Section 4provides a description of the proposed procedure. Theoretical
results that relate EDD versus ARL are described in Section 5, where a sketch of the related proofs is also
given. Section 6contains simulations that empirically validate the theoretical results in a practical setting.
Experiments on real-world data are summarized in Section 7.
2 Related work
The CUSUM statistic is known for being asymptotically optimal in minimizing the maximum average de-
tection delay as the average time to false alarm reaches infinity [18]. CUSUM was later shown to be optimal
in minimizing the expected detection delay for a provided (non-asymptotic) expected time to false alarm
[19]. There has been extensive work done to further investigate and generalize the optimality property of
CUSUM. These results, however hold when both pre-change and post-change distribution are completely
known. A summary of such work can be found in [23]. A two-sided CUSUM test can be used to detect either
an increase or decrease in mean [12], but this approach still assumes a fixed and known variance. When the
post-change distribution is unknown, the generalized log-likelihood ratio test (GLR) can be used by esti-
mating both the change location and the post-change distribution through maximum likelihood estimation.
However, CUSUM, GLR, and their variants are often used to detect only a single change point [22]. The few
works that do use these methods to detect multiple changes do so by only detecting changes in the mean of
normally distributed data [8,11]. It is more challenging to detect multiple changes when both the mean and
the variance of a signal change. There is limited prior work that detects joint changes in both the mean and
the variance of the signal [14], however, this has not been considered in the context of detecting multiple
changes.
Recently, there has been increasing interest in the machine learning community to detect multiple change
2
points sequentially within streaming data [15,17,2,10]. Most of these methods use non-parametric change
statistics, which are symmetrical. This means that the magnitude of the change statistic for a change from
θ0to θ1is equivalent in magnitude for a change from θ1to θ0. The need for this symmetrical statistic
was noted by [17], who use a symmetric KL-divergence to detect multiple changes within streaming data
where both the mean and variance of the normally distributed signal are changing. The symmetric statistic
makes it easy to set a single detection threshold before the procedure is started to detect multiple changes
within streaming data. At each time instance, a pre-change distribution is estimated using a “past window,”
and the post-change distribution is estimated using a “future window.” These methods, however, do not
incorporate data samples directly. These samples are incorporated through estimates of the distribution,
which makes these methods slow to react to changes. None of these methods characterize the relationship
between detection delay and false alarm rate.
The need to use symmetric statistics for change detection was also earlier noticed in [6,5,13], where
the authors noted the asymmetry in change statistics when there are changes in both mean and variance.
These works used a log-likelihood ratio with a drift term to make the expected value of the change statistic
symmetric under the post-change distribution. However, this drift term meant that the expected value of the
statistic is zero under the pre-change distribution, which can lead to more false positives. A slightly modified
version of this technique was mentioned in [7], where false alarm rates were reduced by adding a fixed drift
term which made the expected value of the statistic negative under the pre-change distribution. However,
no details were provided about setting this drift term. These methods also provided no characterization of
the relationship between detection delay and false alarm rate.
In this work, we investigate a suitable choice for this fixed drift to make the statistic symmetric under
the post-change distribution while also ensuring the expectation is negative under the pre-change distribu-
tion. Our proposed change detection procedure provides a symmetric change statistic for different families
of probability distributions, however, the theoretical results relating detection delay and false alarm rate
consider the more restricted setting of i.i.d. univariate normally distributed data.
3 Problem statement
Change points are instances in a signal where the underlying distribution of data changes, e.g., the parameters
of the signal generating distribution change from θ0to θ1. Most change point detection methods rely on
hypothesis tests based on the log-likelihood ratio. Specifically, suppose we are given observations x0, . . . , xt
of a time series X. We will assume that each element xiis drawn independently from a distribution fθwhere
θrepresents some (possibly changing) parameters. To detect a change we compare the null hypothesis (H0)
that all xiare drawn according to fθ0for some (known) θ0to the alternate hypothesis (H1) that the time
series distribution changes from fθ0to fθ1, at time nc, for some θ16=θ0.
The likelihood of Xunder these two hypotheses is given by:
L(H0|X) =
t
Y
i=1
fθ0(xi)
L(H1|X) =
nc1
Y
i=1
fθ0(xi)
t
Y
i=nc
fθ1(xi).
By computing the likelihood ratio and taking the logarithm, we obtain the likelihood-ratio statistic at instance
tfor a change at nc:
`t
nc=
t
X
i=nc
log fθ1(xi)
fθ0(xi).
Since the location of the change point ncis unknown, the maximum over all possible change point
locations is taken to compute the change statistic at instance t:
`t= max
1<nc<t `t
nc.(1)
3
A change point is detected the first time the change statistic `tis greater than a specified threshold b. For
a sequence of i.i.d. random variables, the sum of the log-likelihood probability ratio between distributions
θ1and θ0satisfies an intuitive property: if a sequence is generated through a post-change distribution,
the expected value of this sum should be positive. If the sequence is generated through the pre-change
distribution, this sum should be negative. Concretely speaking, if we represent his log-likelihood ratio as
`t
1=
t
X
i=1
log fθ1(xi)
fθ0(xi),
then Eθ1(`t
1)>0 and Eθ0(`t
1)<0. In (1), we are maximizing over ncto find the maximum log-likelihood ratio.
Instead of maximizing (1) with respect to nc, we can also maximize the log-likelihood ratio by minimizing,
over nc, the expression:
`t=
t
X
i=1
log fθ1(xi)
fθ0(xi)min
1<nc<t
nc
X
i=1
log fθ1(xi)
fθ0(xi).(2)
The CUSUM statistic [20] provides a computationally attractive recursive implementation of the test
in (2). It assumes both pre-change parameters θ0and post-change θ1distribution parameters are known. In
such a setting, a recursive implementation of (2) can be obtained as shown below in (3) :
St=St1+ log fθ1(xt)
fθ0(xt)+
,(3)
where (St1)+= max(0, St1) and S0= 0. A change is detected at the first instance, nc, where the cor-
responding change statistic Stis greater than a set threshold b. This is made concrete in the equation
below:
nc= inf(t > 0 : St> b).
The post-change distribution is often unknown in real-world settings. In such cases, the GLR [21] can be
used to obtain the change statistic `t. GLR maximizes the change statistic in (4) over both the post-change
distribution, θtat instance t, as well as the change instance nc. Let
`t
nc:= max
θt
t
X
i=nc
log fθt(xi)
fθ0(xi).
Define
`t= max
1<nc<t `nc
t.(4)
This is done by first choosing a possible change instance, nc, and finding the maximum likelihood estimate
(MLE) ˆ
θnc
tfor (4). This MLE estimate is used to obtain a possible change statistic, `t
nc, corresponding to a
change at nc. This is repeated for all possible change instances before t, and the maximum of these is taken
as the change statistic `tat time t.
Once the change statistic, `t, crosses the threshold b, a change is detected, and the corresponding post-
change estimate ˆ
θnc
tis used as the new pre-change estimate θ0and the sequential change point detection
procedure is repeated to detect the next change. This way multiple change points are detected. It is important
to point out that the GLR procedure is non-recursive and can be computationally expensive to run.
3.1 Asymmetry of log-likelihood ratio
The log-likelihood ratio statistic, employed by both GLR and CUSUM, is quick to react to changes but is an
asymmetric statistic for detecting joint changes in mean and variance. Figure 1illustrates this asymmetry.
This difference gets more pronounced when one of the two distributions has a much smaller variance.
Figure 2shows a real-world example where this asymmetry makes it difficult for GLR to detect multiple
change points. The log-likelihood ratio for the first change point is much larger than the log-likelihood ratio
for the second change point. This makes it difficult to set a detection threshold a priori to detect multiple
change points in a streaming data setting. In the first figure, the fixed detection threshold results in missing
the second change point, which has a much smaller statistic. A reduction in the detection threshold leads to
many false change point detections, which can be seen in the second figure.
4
(a) N(1,1) N (10,3) (b) N(10,3) N (1,1)
Figure 1: Joint changes in mean variance lead to asymmetric Likelihood Ratios. In Figure 1(a), the pre-
change likelihood is in the tail, leading to a large likelihood ratio. In Figure 1(b), the post-change likelihood
is higher than it is in Figure 1(a), leading to a relatively smaller likelihood ratio
0 500 1000 1500 2000
0
50
100
150
200
250
Signal
Signal
Detected Change
True Change
0 500 1000 1500 2000
0
5000
10000
15000
GLR statistic
GLR statistic
Threshold
True Change
0 500 1000 1500 2000
0
50
100
150
200
250
Signal
Signal
Detected Change
True Change
0 500 1000 1500 2000
0
500
1000
1500
2000
GLR statistic
GLR statistic
Threshold
True Change
Lowering
threshold
(a) Missed change point
0 500 1000 1500 2000
0
50
100
150
200
250
Signal
Signal
Detected Change
True Change
0 500 1000 1500 2000
0
500
1000
1500
2000
GLR statistic
GLR statistic
Threshold
True Change
(b) False change point
Figure 2: Joint changes in mean variance lead to asymmetric likelihood ratio. In Figure 2(a), the likelihood
ratio (in GLR) for the second change is much smaller than the likelihood for the first change. This can lead
to a missed change point when the detection threshold is set to be large. When the detection threshold is
lowered to detect this missed change point, many false change points are detected, which can be seen in
Figure 2(b).
5
摘要:

Data-AdaptiveSymmetricCUSUMforSequentialChangeDetectionNaumanAhad,MarkA.Davenport,andYaoXieNovember1,2022AbstractDetectingchangepointssequentiallyinastreamingsetting,especiallywhenboththemeanandthevarianceofthesignalcanchange,isoftenachallengingtask.Akeydicultyinthiscontextofteninvolvessettinganapp...

展开>> 收起<<
Data-Adaptive Symmetric CUSUM for Sequential Change Detection.pdf

共24页,预览5页

还剩页未读, 继续阅读

声明:本站为文档C2C交易模式,即用户上传的文档直接被用户下载,本站只是中间服务平台,本站所有文档下载所得的收益归上传人(含作者)所有。玖贝云文库仅提供信息存储空间,仅对用户上传内容的表现方式做保护处理,对上载内容本身不做任何修改或编辑。若文档所含内容侵犯了您的版权或隐私,请立即通知玖贝云文库,我们立即给予删除!
分类:图书资源 价格:10玖币 属性:24 页 大小:2.96MB 格式:PDF 时间:2025-08-18

开通VIP享超值会员特权

  • 多端同步记录
  • 高速下载文档
  • 免费文档工具
  • 分享文档赚钱
  • 每日登录抽奖
  • 优质衍生服务
/ 24
客服
关注