Data-Adaptive Symmetric CUSUM for Sequential Change Detection

2025-08-18 4 0 2.96MB 24 页 10玖币

侵权投诉

Data-Adaptive Symmetric CUSUM for

Sequential Change Detection

Nauman Ahad, Mark A. Davenport, and Yao Xie

November 1, 2022

Abstract

Detecting change points sequentially in a streaming setting, especially when both the mean and the

variance of the signal can change, is often a challenging task. A key diﬃculty in this context often

involves setting an appropriate detection threshold, which for many standard change statistics may need

to be tuned depending on the pre-change and post-change distributions. This presents a challenge in a

sequential change detection setting when a signal switches between multiple distributions. For example,

consider a signal where change points are indicated by increases/decreases in the mean and variance of

the signal. In this context, we would like to be able to compare our change statistic to a ﬁxed threshold

that will be symmetric to either increases or decreases in the mean and variance. Unfortunately, change

point detection schemes that use the log-likelihood ratio, such as CUSUM and GLR, are quick to react to

changes but are not symmetric when both the mean and the variance of the signal change. This makes it

diﬃcult to set a single threshold to detect multiple change points sequentially in a streaming setting. We

propose a modiﬁed version of CUSUM that we call Data-Adaptive Symmetric CUSUM (DAS-CUSUM).

The DAS-CUSUM change point detection procedure is symmetric for changes between distributions,

making it suitable to set a single threshold to detect multiple change points sequentially in a streaming

setting. We provide results that relate to the expected detection delay and average run length for our

proposed procedure. Extensive simulations are used to validate these results. Experiments on real-world

data further show the utility of using DAS-CUSUM over both CUSUM and GLR.

1 Introduction

For a sequence of observations x1, . . . , xt, the goal of change point detection is to detect whether there exists

an instance ncsuch that x1, . . . , xnc−1are generated according to a diﬀerent distribution than xnc, . . . , xt,

and if so, estimating nc. This is typically accomplished by computing a simple change statistic based on

the log-likelihood ratio, which can be compared to a threshold to detect changes or optimized to estimate

nc. Sequential change point detection involves sequentially detecting multiple changes in streaming data.

Many real-world world applications require sequential detection of change points within streaming signals.

Healthcare, communication, and ﬁnance are just a few areas where sequential change detection is widely

used [27,16,1]. An extended discussion of applications of change point detection can be found in [3].

Despite being devised more than half a century ago, the CUSUM statistic is still one of the most popular

methods for detecting change points [20]. This is chieﬂy due to two reasons. First, it has a simple recursive

implementation which makes it computationally eﬃcient to apply. Second, it has been shown to be optimal

in minimizing the detection delay for a given false alarm rate [18]. However, computing the CUSUM statistic

requires complete knowledge of both the pre-change and post-change distributions. This is not feasible in

many real-world scenarios where the post-change distribution can be unknown. In such settings, a more

N.Ahad and M.A.Davenport are with the School of Electrical and Computer Engineering, Georgia Tech, Atlanta, GA, 30302,

USA. Y.Xie is with the School of Industrial and Systems Engineering, Georgia Tech, Atlanta, GA, 30302, USA. The work of N.

Ahad and M. Davenport was supported, in part, by NSF grants CCF-2107455 and DMS-2134037, NIH grant R01AG056255,

and gifts from the Alfred P. Sloan Foundation and Coulter Foundation. The work of Y. Xie was supported, in part, by an NSF

CAREER grant CCF-1650913, and NSF grants DMS-2134037, CMMI-2015787, DMS-1938106, and DMS-1830210.

E-mails: nahad3@gatech.edu, mdav@gatech.edu, yao.xie@isye.gatech.edu

arXiv:2210.17353v1 [stat.ME] 31 Oct 2022

common approach is to use the GLR statistic, which involves estimating the post-change distribution for all

possible change points [21]. Both the CUSUM and GLR statistics leverage the log-likelihood ratio for the

known/estimated pre- and post-change distributions.

Most work on change point detection has focused on identifying a single change point in the quickest

possible manner. Though this has been useful for some applications, especially those that monitor a process

for abnormal behavior such as machine fault detection and network intrusion detection, many modern appli-

cations require the detection of multiple change points sequentially in streaming data. In sequential change

point detection, the detection procedure must be restarted and continued after each change point is detected,

resulting in multiple change points being detected. Examples of such settings include segmentation of signals

for activity recognition where change points are used to identify transitions from one activity to another in a

streaming setting [4]. In such settings, the pre-change and post-change distributions themselves change after

each change point and cannot be assumed to be known a priori. This presents a signiﬁcant challenge to most

standard change detection approaches because the detection threshold must be set without any knowledge

of these distributions (with the threshold typically being ﬁxed in advance and held constant throughout the

procedure).

The machine learning community has been addressing this problem of identifying multiple change points

in data streams [17]. Such works show that procedures employed to detect change points should be symmetric.

This means the magnitude of a change from a distribution θ0to a distribution θ1should be the same for a

change from θ1to θ0. Using a procedure that has a similar power in detecting such changes makes it easy to

select a threshold for detecting multiple changes sequentially. Statistics such as the GLR and CUSUM are

not symmetric when distribution changes involve a change in variance. This makes it diﬃcult to use these

in detecting multiple changes.

In this work, we present an adaptive symmetric version of CUSUM that we call Data-Adaptive Symmetric

CUSUM (DAS-CUSUM). DAS-CUSUM uses a window to estimate the post-change distribution and employs

a symmetric change statistic to make it easier to select a ﬁxed threshold to detect multiple change points in

streaming data. We provide theoretical results for our proposed method that relate the expected detection

delay (EDD) (average delay in detecting true changes) to the average run length (ARL) (average time until

a false alarm occurs).

The rest of the paper is organized as follows. After reviewing related literature in Section 2, we formalize

the change detection problem in Section 3and further motivate the need to have a symmetric change statistic

for detecting multiple changes. Section 4provides a description of the proposed procedure. Theoretical

results that relate EDD versus ARL are described in Section 5, where a sketch of the related proofs is also

given. Section 6contains simulations that empirically validate the theoretical results in a practical setting.

Experiments on real-world data are summarized in Section 7.

2 Related work

The CUSUM statistic is known for being asymptotically optimal in minimizing the maximum average de-

tection delay as the average time to false alarm reaches inﬁnity [18]. CUSUM was later shown to be optimal

in minimizing the expected detection delay for a provided (non-asymptotic) expected time to false alarm

[19]. There has been extensive work done to further investigate and generalize the optimality property of

CUSUM. These results, however hold when both pre-change and post-change distribution are completely

known. A summary of such work can be found in [23]. A two-sided CUSUM test can be used to detect either

an increase or decrease in mean [12], but this approach still assumes a ﬁxed and known variance. When the

post-change distribution is unknown, the generalized log-likelihood ratio test (GLR) can be used by esti-

mating both the change location and the post-change distribution through maximum likelihood estimation.

However, CUSUM, GLR, and their variants are often used to detect only a single change point [22]. The few

works that do use these methods to detect multiple changes do so by only detecting changes in the mean of

normally distributed data [8,11]. It is more challenging to detect multiple changes when both the mean and

the variance of a signal change. There is limited prior work that detects joint changes in both the mean and

the variance of the signal [14], however, this has not been considered in the context of detecting multiple

changes.

Recently, there has been increasing interest in the machine learning community to detect multiple change

points sequentially within streaming data [15,17,2,10]. Most of these methods use non-parametric change

statistics, which are symmetrical. This means that the magnitude of the change statistic for a change from

θ0to θ1is equivalent in magnitude for a change from θ1to θ0. The need for this symmetrical statistic

was noted by [17], who use a symmetric KL-divergence to detect multiple changes within streaming data

where both the mean and variance of the normally distributed signal are changing. The symmetric statistic

makes it easy to set a single detection threshold before the procedure is started to detect multiple changes

within streaming data. At each time instance, a pre-change distribution is estimated using a “past window,”

and the post-change distribution is estimated using a “future window.” These methods, however, do not

incorporate data samples directly. These samples are incorporated through estimates of the distribution,

which makes these methods slow to react to changes. None of these methods characterize the relationship

between detection delay and false alarm rate.

The need to use symmetric statistics for change detection was also earlier noticed in [6,5,13], where

the authors noted the asymmetry in change statistics when there are changes in both mean and variance.

These works used a log-likelihood ratio with a drift term to make the expected value of the change statistic

symmetric under the post-change distribution. However, this drift term meant that the expected value of the

statistic is zero under the pre-change distribution, which can lead to more false positives. A slightly modiﬁed

version of this technique was mentioned in [7], where false alarm rates were reduced by adding a ﬁxed drift

term which made the expected value of the statistic negative under the pre-change distribution. However,

no details were provided about setting this drift term. These methods also provided no characterization of

the relationship between detection delay and false alarm rate.

In this work, we investigate a suitable choice for this ﬁxed drift to make the statistic symmetric under

the post-change distribution while also ensuring the expectation is negative under the pre-change distribu-

tion. Our proposed change detection procedure provides a symmetric change statistic for diﬀerent families

of probability distributions, however, the theoretical results relating detection delay and false alarm rate

consider the more restricted setting of i.i.d. univariate normally distributed data.

3 Problem statement

Change points are instances in a signal where the underlying distribution of data changes, e.g., the parameters

of the signal generating distribution change from θ0to θ1. Most change point detection methods rely on

hypothesis tests based on the log-likelihood ratio. Speciﬁcally, suppose we are given observations x0, . . . , xt

of a time series X. We will assume that each element xiis drawn independently from a distribution fθwhere

θrepresents some (possibly changing) parameters. To detect a change we compare the null hypothesis (H0)

that all xiare drawn according to fθ0for some (known) θ0to the alternate hypothesis (H1) that the time

series distribution changes from fθ0to fθ1, at time nc, for some θ16=θ0.

The likelihood of Xunder these two hypotheses is given by:

L(H0|X) =

i=1

fθ0(xi)

L(H1|X) =

nc−1

i=1

fθ0(xi)

i=nc

fθ1(xi).

By computing the likelihood ratio and taking the logarithm, we obtain the likelihood-ratio statistic at instance

tfor a change at nc:

nc=

i=nc

log fθ1(xi)

fθ0(xi).

Since the location of the change point ncis unknown, the maximum over all possible change point

locations is taken to compute the change statistic at instance t:

`t= max

1<nc<t `t

nc.(1)

A change point is detected the ﬁrst time the change statistic `tis greater than a speciﬁed threshold b. For

a sequence of i.i.d. random variables, the sum of the log-likelihood probability ratio between distributions

θ1and θ0satisﬁes an intuitive property: if a sequence is generated through a post-change distribution,

the expected value of this sum should be positive. If the sequence is generated through the pre-change

distribution, this sum should be negative. Concretely speaking, if we represent his log-likelihood ratio as

i=1

log fθ1(xi)

fθ0(xi),

then Eθ1(`t

1)>0 and Eθ0(`t

1)<0. In (1), we are maximizing over ncto ﬁnd the maximum log-likelihood ratio.

Instead of maximizing (1) with respect to nc, we can also maximize the log-likelihood ratio by minimizing,

over nc, the expression:

`t=

i=1

log fθ1(xi)

fθ0(xi)−min

1<nc<t

i=1

log fθ1(xi)

fθ0(xi).(2)

The CUSUM statistic [20] provides a computationally attractive recursive implementation of the test

in (2). It assumes both pre-change parameters θ0and post-change θ1distribution parameters are known. In

such a setting, a recursive implementation of (2) can be obtained as shown below in (3) :

St=St−1+ log fθ1(xt)

fθ0(xt)+

,(3)

where (St−1)+= max(0, St−1) and S0= 0. A change is detected at the ﬁrst instance, nc, where the cor-

responding change statistic Stis greater than a set threshold b. This is made concrete in the equation

below:

nc= inf(t > 0 : St> b).

The post-change distribution is often unknown in real-world settings. In such cases, the GLR [21] can be

used to obtain the change statistic `t. GLR maximizes the change statistic in (4) over both the post-change

distribution, θtat instance t, as well as the change instance nc. Let

nc:= max

θt

i=nc

log fθt(xi)

fθ0(xi).

Deﬁne

`t= max

1<nc<t `nc

t.(4)

This is done by ﬁrst choosing a possible change instance, nc, and ﬁnding the maximum likelihood estimate

(MLE) ˆ

θnc

tfor (4). This MLE estimate is used to obtain a possible change statistic, `t

nc, corresponding to a

change at nc. This is repeated for all possible change instances before t, and the maximum of these is taken

as the change statistic `tat time t.

Once the change statistic, `t, crosses the threshold b, a change is detected, and the corresponding post-

change estimate ˆ

θnc

tis used as the new pre-change estimate θ0and the sequential change point detection

procedure is repeated to detect the next change. This way multiple change points are detected. It is important

to point out that the GLR procedure is non-recursive and can be computationally expensive to run.

3.1 Asymmetry of log-likelihood ratio

The log-likelihood ratio statistic, employed by both GLR and CUSUM, is quick to react to changes but is an

asymmetric statistic for detecting joint changes in mean and variance. Figure 1illustrates this asymmetry.

This diﬀerence gets more pronounced when one of the two distributions has a much smaller variance.

Figure 2shows a real-world example where this asymmetry makes it diﬃcult for GLR to detect multiple

change points. The log-likelihood ratio for the ﬁrst change point is much larger than the log-likelihood ratio

for the second change point. This makes it diﬃcult to set a detection threshold a priori to detect multiple

change points in a streaming data setting. In the ﬁrst ﬁgure, the ﬁxed detection threshold results in missing

the second change point, which has a much smaller statistic. A reduction in the detection threshold leads to

many false change point detections, which can be seen in the second ﬁgure.

(a) N(1,1) → N (10,3) (b) N(10,3) → N (1,1)

Figure 1: Joint changes in mean variance lead to asymmetric Likelihood Ratios. In Figure 1(a), the pre-

change likelihood is in the tail, leading to a large likelihood ratio. In Figure 1(b), the post-change likelihood

is higher than it is in Figure 1(a), leading to a relatively smaller likelihood ratio

0 500 1000 1500 2000

100

150

200

250

Signal

Detected Change

True Change

0 500 1000 1500 2000

5000

10000

15000

GLR statistic

Threshold

True Change

0 500 1000 1500 2000

100

150

200

250

Signal

Detected Change

True Change

0 500 1000 1500 2000

500

1000

1500

2000

GLR statistic

Threshold

True Change

Lowering

threshold

(a) Missed change point

0 500 1000 1500 2000

100

150

200

250

Signal

Detected Change

True Change

0 500 1000 1500 2000

500

1000

1500

2000

GLR statistic

Threshold

True Change

(b) False change point

Figure 2: Joint changes in mean variance lead to asymmetric likelihood ratio. In Figure 2(a), the likelihood

ratio (in GLR) for the second change is much smaller than the likelihood for the ﬁrst change. This can lead

to a missed change point when the detection threshold is set to be large. When the detection threshold is

lowered to detect this missed change point, many false change points are detected, which can be seen in

Figure 2(b).

文档加载中……请稍候！
如果长时间未打开，您也可以点击刷新试试。

下载文档到电脑，查找使用更方便

10 玖币 0人已下载

立即下载

摘要：

Data-AdaptiveSymmetricCUSUMforSequentialChangeDetectionNaumanAhad,MarkA.Davenport,andYaoXieNovember1,2022AbstractDetectingchangepointssequentiallyinastreamingsetting,especiallywhenboththemeanandthevarianceofthesignalcanchange,isoftenachallengingtask.Akeydicultyinthiscontextofteninvolvessettinganapp...

展开>> 收起<<

Data-Adaptive Symmetric CUSUM for Sequential Change Detection.pdf

共24页,预览5页

还剩页未读，继续阅读

声明：本站为文档C2C交易模式，即用户上传的文档直接被用户下载，本站只是中间服务平台，本站所有文档下载所得的收益归上传人(含作者)所有。玖贝云文库仅提供信息存储空间，仅对用户上传内容的表现方式做保护处理，对上载内容本身不做任何修改或编辑。若文档所含内容侵犯了您的版权或隐私，请立即通知玖贝云文库，我们立即给予删除！

Data-Adaptive Symmetric CUSUM for Sequential Change Detection

相关推荐

开通VIP享超值会员特权

作者详情

相关内容

热门标签

举报选择: