a driving force behind biology. The task also has significant economic potential: anomaly detection methods are used to detect credit card fraud, faults on production lines, and unusual patterns in network communications.
Detecting anomalies is an essentially unsupervised task, as only “normal” data, but no anomalies, are seen during training. While the field has been intensely researched
for decades, the most successful recent approaches use a very simple two-stage
paradigm: (i) each data point is transformed to a representation, often learned in a self-supervised manner; (ii) a density estimation model, often as simple as a k-nearest-neighbor estimator, is fitted to the normal data provided in a training set. To classify a new sample as normal or anomalous, its estimated probability density is computed; low-likelihood samples are denoted as anomalies.
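For concreteness, the following is a minimal sketch of this two-stage pipeline. It assumes stage (i) has already produced feature vectors from some (self-supervised) encoder; the function names and the choice of k are illustrative, not a specific published configuration.

import numpy as np
from sklearn.neighbors import NearestNeighbors


def fit_knn_scorer(train_features: np.ndarray, k: int = 2) -> NearestNeighbors:
    """Stage (ii): fit a k-nearest-neighbor estimator on normal-only features."""
    return NearestNeighbors(n_neighbors=k).fit(train_features)


def anomaly_scores(knn: NearestNeighbors, test_features: np.ndarray) -> np.ndarray:
    """Mean distance to the k nearest normal samples.

    A large distance corresponds to low estimated density, i.e. an anomaly.
    """
    distances, _ = knn.kneighbors(test_features)
    return distances.mean(axis=1)


# Usage with placeholder features (in practice, outputs of an SSRL encoder).
rng = np.random.default_rng(0)
train_feats = rng.standard_normal((1000, 512)).astype(np.float32)
test_feats = rng.standard_normal((16, 512)).astype(np.float32)
scores = anomaly_scores(fit_knn_scorer(train_feats), test_feats)

Thresholding these scores (e.g., at a percentile of the training-set scores) yields the final normal/anomalous decision.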
In this position paper, we first explain that advances in representation learn-
ing are the main explanatory factor for the performance of recent anomaly de-
tection (AD) algorithms. We show that this paradigm essentially “solves” the most commonly reported image anomaly detection benchmark (Sec. 4). While
this is encouraging, we argue that existing self-supervised representations are
unable to solve the next generation of AD tasks (Sec. 5). In particular, we high-
light the following issues: (i) masked autoencoders are much worse for AD than earlier self-supervised representation learning (SSRL) methods; (ii) current approaches perform poorly on datasets with multiple objects per image, complex backgrounds, or fine-grained anomalies; (iii) in some cases, SSRL performs worse than handcrafted representations; (iv) for “tabular” datasets, no representation performed better than the original representation of the data (i.e., the data itself); (v) in the presence of nuisance factors of variation, it is unclear whether SSRL can, in principle, identify the optimal representation for effective AD.
Anomaly detection presents both rich rewards and significant challenges for representation learning. Overcoming these issues will require significant progress, both technical and conceptual. We expect that increasing the in-
volvement of the self-supervised representation learning community in anomaly
detection will mutually benefit both fields.
2 Related Work
Classical AD approaches were typically based on either density estimation [9,20]
or reconstruction [15]. With the advent of deep learning, classical methods were
augmented by deep representations [23,38,19,24]. A prevalent way to learn these
representations was to use self-supervised methods, e.g., autoencoders [30], rotation classification [10,13], and contrastive methods [36,35]. An alternative ap-
proach is to combine pretrained representations with anomaly scoring functions
[25,32,27,28]. The best-performing methods [27,28] combine pretraining on auxiliary datasets with a second fine-tuning stage on the normal samples provided in the training set. It was recently established [27] that, given sufficiently powerful representations (e.g., from ImageNet classification), a simple criterion based on the kNN distance to the normal training data achieves strong performance. We therefore limit the discussion of AD in this paper to this simple technique.
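As a rough illustration of this criterion, the sketch below obtains such features from an off-the-shelf ImageNet-pretrained backbone; the specific backbone (ResNet-18), preprocessing, and head removal are our assumptions, not necessarily the exact setup of [27].

import torch
import torchvision.models as models
import torchvision.transforms as T

# ImageNet-pretrained classifier with the classification head removed,
# used purely as a fixed feature extractor.
backbone = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
backbone.fc = torch.nn.Identity()
backbone.eval()

# Standard ImageNet preprocessing.
preprocess = T.Compose([
    T.Resize(256), T.CenterCrop(224), T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

@torch.no_grad()
def embed(pil_images):
    """Map a list of PIL images to pretrained feature vectors."""
    batch = torch.stack([preprocess(im) for im in pil_images])
    return backbone(batch)

The resulting features can then be scored exactly as in the kNN sketch of Sec. 1: a test sample's anomaly score is its mean distance to the k nearest normal training features.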