OPTIMAL SUB-SAMPLING TO BOOST POWER OF KERNEL SEQUENTIAL CHANGE-POINT DETECTION
Song Wei, Chaofan Huang
School of Industrial and Systems Engineering, Georgia Institute of Technology
ABSTRACT
We present a novel scheme to boost the detection power of kernel maximum mean discrepancy (MMD) based sequential change-point detection procedures. The proposed scheme performs an optimal sub-sampling of the history data before the detection procedure, in order to mitigate the power loss incurred by using a random sub-sample of the enormous history data. We apply the proposed scheme to both the Scan-B and Kernel Cumulative Sum (KCUSUM) procedures, and improved performance is observed in extensive numerical experiments.
Index Terms— Kernel methods; maximum mean discrepancy (MMD); online decision making; optimal sub-sampling; sequential change-point detection.
1. INTRODUCTION
Sequential change-point detection is a classic and fundamental problem in signal processing, information theory, and statistics, and it has recently attracted great interest from the machine learning and data mining communities. The goal is to raise an alarm as soon as possible once a change in distribution occurs in streaming data, while keeping the frequency of false alarms as low as possible. Classic parametric methods typically handle changes in distribution parameters, but their success relies heavily on exact specification of the probability distribution. As a result, distribution-free non-parametric methods are much more favorable nowadays.
Kernel maximum mean discrepancy (MMD) [1–4] is one of the most popular and powerful non-parametric methods for testing whether two samples are drawn from the same distribution. It can be naturally applied to the change-point detection problem by viewing the history observations and the streaming data as the two samples in the test [5]. However, the quadratic complexity of computing the kernel MMD is the key limitation for its practical use in the online detection setting. Existing methods that tackle this challenge include [6], which proposed the Scan-B procedure by leveraging the B-test statistic [7] to achieve linear complexity, and [8], which proposed a Kernel Cumulative Sum (KCUSUM) procedure by replacing the likelihood ratio in the classic CUSUM with the linear-time MMD statistic. Both methods use a random sub-sample instead of the complete history observations to maintain computational and memory efficiency. However, such a random sub-sample leads to a power loss due to its poor representation of the pre-change distribution.
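To make the complexity consideration concrete, the following minimal sketch (an illustrative assumption, not the implementation used in [6] or [8]) computes the standard unbiased quadratic-time MMD^2 estimate with a Gaussian kernel; the function names and the bandwidth parameter are hypothetical choices.

    import numpy as np

    def gaussian_kernel(X, Y, bandwidth=1.0):
        # Pairwise Gaussian (RBF) kernel matrix: k(x, y) = exp(-||x - y||^2 / (2 * bandwidth^2)).
        sq_dists = np.sum(X**2, 1)[:, None] + np.sum(Y**2, 1)[None, :] - 2 * X @ Y.T
        return np.exp(-sq_dists / (2 * bandwidth**2))

    def mmd2_unbiased(X, Y, bandwidth=1.0):
        # Unbiased estimate of MMD^2(P, Q) from samples X ~ P and Y ~ Q; the
        # diagonal terms are excluded, and the overall cost is O((n + m)^2).
        n, m = len(X), len(Y)
        Kxx = gaussian_kernel(X, X, bandwidth)
        Kyy = gaussian_kernel(Y, Y, bandwidth)
        Kxy = gaussian_kernel(X, Y, bandwidth)
        term_xx = (Kxx.sum() - np.trace(Kxx)) / (n * (n - 1))
        term_yy = (Kyy.sum() - np.trace(Kyy)) / (m * (m - 1))
        return term_xx + term_yy - 2 * Kxy.mean()

The quadratic cost of this estimator is exactly what motivates the linear-time statistics of [6–8] in the online setting.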
Instead, one would like to identify the optimal sub-sample that is most representative of the pre-change distribution, i.e., of the large history data, so as to minimize the information loss. Kernel herding [9] and kernel thinning [10] are two optimal sub-sampling methods based on the kernel MMD statistic: the sub-sample is chosen such that the kernel MMD between the sub-sample and the full data is minimized. With minimal information loss in the sub-sample selection, one should expect an improvement in the detection power over the use of random sub-samples.
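As a rough illustration of this idea (a sketch under our own assumptions, not the kernel thinning algorithm of [10]), greedy kernel herding restricted to the history points as the candidate pool can be written as follows; the function name kernel_herding and its parameters are hypothetical.

    import numpy as np

    def kernel_herding(X, n_select, bandwidth=1.0):
        # X: (M, d) history data; greedily picks n_select points whose empirical
        # kernel mean embedding stays close to that of the full data, i.e.,
        # approximately minimizing the MMD between the sub-sample and X.
        sq = np.sum(X**2, axis=1)
        K = np.exp(-(sq[:, None] + sq[None, :] - 2 * X @ X.T) / (2 * bandwidth**2))
        mean_embedding = K.mean(axis=1)        # mean_j k(x_i, X_j) for each candidate
        selected, running_sum = [], np.zeros(len(X))
        for t in range(n_select):
            # Prefer points close to the data mean embedding but far from the
            # points already selected (the herding trade-off).
            scores = mean_embedding - running_sum / (t + 1)
            idx = int(np.argmax(scores))
            selected.append(idx)
            running_sum += K[:, idx]
        return X[selected]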
In this work, we propose a scheme in which we perform optimal sub-sampling, namely kernel thinning, on the history observations as a pre-processing step before the detection procedure, in order to tackle the poor representation of random sub-samples. We apply the proposed scheme to both the Scan-B and KCUSUM procedures. Extensive numerical experiments are conducted to demonstrate that the proposed scheme can boost the detection power of kernel MMD-based sequential change-point detection procedures.
2. SETUP
Consider a sufficiently large amount of history data following the pre-change distribution $P$ on a domain $\mathcal{X}$ (typically $\mathcal{X} = \mathbb{R}^d$):
$$X_1, \ldots, X_M \sim P, \quad \text{i.i.d.},$$
and streaming data $Y_1, \ldots, Y_t$ up to the current time $t$. In online change-point detection, the goal is to test
$$H_0: X_1, \ldots, X_M, Y_1, \ldots, Y_t \sim P,$$
$$H_1: \exists\, r < t,\ X_1, \ldots, X_M, Y_1, \ldots, Y_r \sim P, \quad Y_{r+1}, \ldots, Y_t \sim Q,$$
for some unknown distribution $Q \neq P$.
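For concreteness, a minimal sketch of how such a test might be run online is given below (our own illustrative assumption, not the Scan-B or KCUSUM statistic of later sections); it reuses the mmd2_unbiased sketch above, and the window size, threshold, and function names are hypothetical.

    import numpy as np

    def detect_change(history_sub, stream, block_size=20, b=0.05, bandwidth=1.0):
        # history_sub: (n, d) sub-sample of the history data X_1, ..., X_M.
        # stream: iterable of d-dimensional observations Y_1, Y_2, ...
        block = []
        for t, y in enumerate(stream, start=1):
            block.append(y)
            if len(block) > block_size:
                block.pop(0)                  # keep only the most recent block
            if len(block) == block_size:
                stat = mmd2_unbiased(history_sub, np.asarray(block), bandwidth)
                if stat > b:
                    return t                  # raise an alarm at time t
        return None                           # no alarm within the observed stream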
In sequential change-point detection, we aim to detect the change-point as soon as possible subject to a constraint on the false alarm rate (i.e., the type I error). A common surrogate for the false alarm rate is the average run length (ARL). Typically, we raise an alarm once the detection statistic exceeds a pre-selected threshold $b$, which