
II. METHOD
A. Existing strategies for enhanced bump hunts
According to the Neyman-Pearson lemma [15], the
provably optimal anomaly score for any model-agnostic
search would be:
R(x) = pdata(x) / pbg(x)    (1)
where pdata(x) and pbg(x) are the probability densities
of the data and the background respectively. Of course,
in practice we never have access to this likelihood ratio,
since the probability densities of data and background are
in general intractable. At best, one could hope for a large
number of samples drawn from the data and true background
distributions; then one could approximate R(x)
with a classifier trained on these samples. We will refer
to this approximation of (1) as the “idealized anomaly
detector” throughout.
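To make the idealized anomaly detector concrete, here is a minimal sketch (not from the paper; the toy distributions, the 10% signal fraction, and the hand-rolled logistic regression are all illustrative assumptions). A classifier trained to separate "data" from "background" samples, with calibrated output f(x), recovers the likelihood ratio via R(x) = f(x)/(1 - f(x)) when the training classes are balanced:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1D feature: background is N(0,1); the "data" are 90% background plus a
# 10% signal component at x ~ N(3, 0.5). (Assumed numbers, for illustration.)
n = 20000
bg = rng.normal(0.0, 1.0, n)
data = np.concatenate([rng.normal(0.0, 1.0, int(0.9 * n)),
                       rng.normal(3.0, 0.5, n - int(0.9 * n))])

# Features [1, x, x^2]: sufficient here since a log ratio of Gaussians is
# quadratic in x.
def feats(x):
    return np.stack([np.ones_like(x), x, x**2], axis=1)

X = np.concatenate([feats(data), feats(bg)])
y = np.concatenate([np.ones(len(data)), np.zeros(len(bg))])

# Plain logistic regression by gradient descent (data = label 1, bg = label 0).
w = np.zeros(3)
for _ in range(2000):
    p = 1.0 / (1.0 + np.exp(-X @ w))
    w -= 0.1 * (X.T @ (p - y)) / len(y)

# With balanced classes, R(x) = f(x) / (1 - f(x)) approximates pdata/pbg.
def R(x):
    f = 1.0 / (1.0 + np.exp(-feats(np.atleast_1d(x)) @ w))
    return f / (1.0 - f)

print(float(R(3.0)[0]), float(R(0.0)[0]))
```

The signal-like point x = 3 receives a much larger score than the bulk of the background at x = 0, as expected from (1).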
Since it is generally not possible to draw samples from
the true pbg(x) in a realistic anomaly search scenario, we
can at best approximate this idealized case either with
simulations or in a data-driven way. The focus here will
be on the latter strategy.
The challenge then is to obtain a high-quality estimate
for pbg(x) from data, e.g. by interpolating from sidebands
(SB) in m into a signal region (SR), and use weak supervision
to obtain an anomaly score R(x). As long as a
cut R(x) > Rc does not sculpt the m distribution,
one can combine this cut with the 1D bump hunt in m
to greatly enhance the significance of the signal over the
background.
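The gain from such a cut can be made quantitative with a standard back-of-the-envelope calculation (the event counts and efficiencies below are illustrative assumptions, not numbers from the paper): if the cut keeps a fraction eps_s of signal and eps_b of background, the naive significance S/sqrt(B) scales by eps_s/sqrt(eps_b), provided the cut leaves the m distribution unsculpted so that B can still be estimated from the sidebands:

```python
import math

# Assumed event counts in the signal region and assumed cut efficiencies.
S, B = 100.0, 100_000.0
eps_s, eps_b = 0.50, 0.01

before = S / math.sqrt(B)                      # naive significance, no cut
after = (eps_s * S) / math.sqrt(eps_b * B)     # after the R(x) > Rc cut

print(before, after)  # the cut improves significance by eps_s / sqrt(eps_b) = 5
```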
In the original enhanced bump hunt method, called
CWoLa-Hunting [8], R(x) comes from a SR vs SB classifier.
This works as long as the features x and m are
statistically independent in the background (i.e. the x
features are distributed identically in the SR and the SB
for the background). This also ensures that R(x) > Rc
will not sculpt the m distribution. Using these properties,
the full enhanced bump hunt search strategy using
CWoLa-Hunting was successfully demonstrated on toy
simulation data [8, 16], and then implemented on actual
data by the ATLAS Collaboration in [17].
However, it can be challenging to ensure that x and m
are independent in the background. Even a small correlation
can degrade or destroy the sensitivity of CWoLa-Hunting
to anomalies. This has motivated the development
of alternative approaches that are more robust to
correlations.
• In Anode [9], one learns pdata(x) and pbg(x) using
conditional density estimators trained on the data
with m ∈ SR and with m ∈ SB; the latter are
automatically interpolated in m into the SR, which
alleviates the problem with correlations between x
and m. It was shown in [12] that in the presence of
correlations between x and m, the signal sensitivity
of Anode is robust while that of CWoLa-Hunting
collapses.
• In Cathode [12], one learns pbg(x) using the SB
density estimator just as in Anode. However, instead
of training the second SR density estimator (which
is more difficult to learn, as it must also capture
the tiny deviations from the smooth pbg(x) due to
a small localized signal), one samples from pbg(x)
in the SR, and trains a classifier (as in CWoLa-Hunting)
between the data and the synthetic background
samples. Cathode thereby captures the
best of both Anode and CWoLa-Hunting, achieving
a signal sensitivity that is nearly optimal and
yet robust to correlations between x and m.
• Finally, the Curtains [13] protocol operates similarly
to Cathode, with the main difference that conditional
invertible neural networks (cINNs) are used
to map background examples from the SB into the
SR.
B. The problem of background sculpting
So far, apart from CWoLa-Hunting, the majority of
the effort has been invested in exploring data-driven approaches
to learn R(x) as accurately as possible from
sidebands, while much less attention has been paid to
the issue of background sculpting. However, signal sensitivity
is not the only component of a successful new
physics search; background estimation is also essential.
In the presence of correlations between x and m in the
background events, one must also show that R(x), even
if ideal, does not sculpt the background m distribution
around the signal region, which would prevent background
estimation via the 1D bump hunt. See Fig. 1
for an illustration of such correlated input features.
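The sculpting mechanism itself is easy to reproduce in a toy (the numbers below are assumptions for illustration, not Fig. 1 of the paper): if x is correlated with m, then any score built from x inherits that correlation, and cutting on it distorts the background m spectrum:

```python
import numpy as np

rng = np.random.default_rng(2)

# Background-only toy: m is smooth (uniform), but the feature x is
# linearly correlated with m (assumed slope 0.3, unit Gaussian noise).
n = 100_000
m = rng.uniform(0.0, 10.0, n)
x = 0.3 * m + rng.normal(0.0, 1.0, n)

# Any smooth anomaly score built from x inherits the correlation with m;
# here we take the simplest case, score = x.
score = x
cut = score > np.quantile(score, 0.9)  # keep the 10% most "anomalous" events

print(m.mean(), m[cut].mean())  # the cut pulls the background m spectrum upward
```

Even though no signal is present, the selected background is concentrated at high m, i.e. the cut has sculpted the m distribution and a sideband fit would no longer describe the SR.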
Note that, in any complete enhanced bump hunt strategy,
two data-driven background estimations must take
place:
1. An interpolation of the learned pbg(x) from SB to
SR in order to construct R(x).
2. After cutting on R(x) > Rc, we proceed with the
usual 1D bump hunt: an interpolation in the m
distribution from SB to SR (e.g. by fitting a suitable
functional form to the data excluding the SR).
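The second estimation can be sketched as follows (an illustrative toy, not the paper's analysis: the exponential background shape, the bump location, and the straight-line fit to log counts are all assumptions):

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy m spectrum after the anomaly-score cut: a falling exponential
# background plus a small bump at m = 3.
n_bg, n_sig = 50_000, 400
m = np.concatenate([rng.exponential(2.0, n_bg),
                    rng.normal(3.0, 0.1, n_sig)])
counts, edges = np.histogram(m, bins=np.linspace(0.5, 6.0, 23))
centers = 0.5 * (edges[:-1] + edges[1:])

# Signal-region bins are excluded from the fit; the sideband fit is then
# interpolated into the SR.
sr = (centers > 2.7) & (centers < 3.3)
coef = np.polyfit(centers[~sr], np.log(counts[~sr]), 1)  # line in log-counts
expected = np.exp(np.polyval(coef, centers[sr]))          # interpolated bg

excess = counts[sr].sum() - expected.sum()
signif = excess / np.sqrt(expected.sum())
print(excess, signif)
```

This only works because the fitted functional form, constrained in the sidebands, still describes the background inside the SR; a cut that sculpts m would invalidate exactly this step.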
This work is concerned with ensuring the robustness
of the second estimation. We will demonstrate, using
both a simple analytic toy model and examples drawn
from the LHC Olympics 2020 R&D dataset [14], that
in the presence of correlations between x and m, cutting
on the learned R(x) can result in significant sculpting of
the m distribution. This can be understood from the fact
that R(x) must be a more-or-less smooth function of x,
so any correlations of m with x will be inherited by R(x).
Furthermore, R(x) was learned using events in the SR, so