The Eyecandies Dataset for Unsupervised
Multimodal Anomaly Detection and Localization
Luca Bonfiglioli*[0000-0002-7323-0662], Marco Toschi*[0000-0002-6573-2857],
Davide Silvestri[0000-0003-0727-7785], Nicola Fioraio[0000-0001-9969-0555], and
Daniele De Gregorio[0000-0001-8203-9176]
Eyecan.ai {luca.bonfiglioli, marco.toschi, davide.silvestri,
nicola.fioraio, daniele.degregorio}@eyecan.ai https://www.eyecan.ai
Abstract. We present Eyecandies, a novel synthetic dataset for unsu-
pervised anomaly detection and localization. Photo-realistic images of
procedurally generated candies are rendered in a controlled environment
under multiple lighting conditions, also providing depth and normal
maps in an industrial conveyor scenario. We make available anomaly-
free samples for model training and validation, while anomalous instances
with precise ground-truth annotations are provided only in the test set.
The dataset comprises ten classes of candies, each showing different chal-
lenges, such as complex textures, self-occlusions and specularities. Fur-
thermore, we achieve large intra-class variation by randomly drawing
key parameters of a procedural rendering pipeline, which enables the
creation of an arbitrary number of instances with photo-realistic appear-
ance. Likewise, anomalies are injected into the rendering graph and pixel-
wise annotations are automatically generated, overcoming human-biases
and possible inconsistencies.
We believe this dataset may encourage the exploration of original ap-
proaches to solve the anomaly detection task, e.g. by combining color,
depth and normal maps, as they are not provided by most of the exist-
ing datasets. Indeed, in order to demonstrate how exploiting additional
information may actually lead to higher detection performance, we show
the results obtained by training a deep convolutional autoencoder to
reconstruct different combinations of inputs.
Keywords: Synthetic Dataset · Anomaly Detection · Deep Learning.
1 Introduction
Recent years have seen an increasing interest in visual unsupervised anomaly
detection [34], the task of determining whether an example never seen before
presents any aspects that deviate from a defect-free domain, which was learned
during training. Similar to one-class classification [23,18,27], in unsupervised
anomaly detection the model has absolutely no knowledge of the appearance of
anomalous structures and must learn to detect them solely by looking at good
* Joint first authorship.
arXiv:2210.04570v1 [cs.CV] 10 Oct 2022
2 Bonfiglioli et al.
examples. There is a practical reason behind this apparent limitation: being
anomalies rare by definition, collecting real-world data with enough examples
of each possible deviation from a target domain may prove to be unreasonably
expensive. Furthermore, the nature of all possible anomalies might even be un-
known, so treating anomaly detection as a supervised classification task may
hinder the ability of the model to generalize to new unseen types of defects.
Historically, a common evaluation practice for proposed AD methods was to
exploit existing multi-class classification datasets, such as MNIST [21] and CI-
FAR [20], re-labeling a subset of related classes as inliers and the remaining as
outliers [25]. The major drawback of this practice is that clean and anomalous
domains are often completely unrelated, whereas in real-world scenarios, such as
industrial quality assurance or autonomous driving, anomalies usually appear as
subtle changes within a common scene, as for the anomalies presented in [6]. In
recent years this adaptation of classification datasets was discouraged in favor
of using new datasets specifically designed for visual anomaly detection and
localization, such as [9], which focuses on industrial inspection. However, most
of the available datasets provide only color images with ground-truth annota-
tions and very few add 3D information [8]. Moreover, all of them have to face
the problem of manual labelling, which can be human-biased and error-prone,
especially in the 3D domain.
The Eyecandies dataset is our main contribution to tackle these issues and
provide a new and challenging benchmark for unsupervised anomaly detec-
tion, including a total of 90000 photo-realistic shots of procedurally generated
synthetic objects, spanning across 10 classes of candies, cookies and sweets
(cf. Fig. 1). Different classes present entirely different shapes, color patterns
and materials, while intra-class variance is given by randomly altering param-
eters of the same model. The Eyecandies dataset comprises defect-free samples
for training, as well as anomalous ones used for testing, each of them with au-
tomatically generated per-pixel ground-truth labels, thus removing the need for
expensive (and often biased) manual annotation procedures. For each sample, we
also provide six renderings with different controlled lighting conditions, together
with ground-truth depth and normal maps, encouraging the exploration and
comparison of many alternative approaches.
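The six light conditions, depth and normal maps provided per sample lend themselves to early fusion into a single multi-channel input. A minimal sketch of such a fusion step is shown below; the function name and the per-modality min-max normalization are our own assumptions, not part of the dataset tooling.

```python
import numpy as np

def stack_modalities(rgb: np.ndarray, depth: np.ndarray,
                     normals: np.ndarray) -> np.ndarray:
    """Concatenate an RGB image (H, W, 3), a depth map (H, W) and a normal
    map (H, W, 3) into one (H, W, 7) float tensor, each modality rescaled
    to [0, 1] so that no channel dominates the reconstruction loss."""
    def norm(x: np.ndarray) -> np.ndarray:
        x = x.astype(np.float32)
        span = x.max() - x.min()
        return (x - x.min()) / span if span > 0 else np.zeros_like(x)

    depth = depth[..., None]  # add a channel axis to the depth map
    return np.concatenate([norm(rgb), norm(depth), norm(normals)], axis=-1)

# Example with random stand-in data (real samples would be loaded from disk):
sample = stack_modalities(
    np.random.randint(0, 256, (256, 256, 3), dtype=np.uint8),  # color image
    np.random.rand(256, 256),                                  # depth map
    np.random.rand(256, 256, 3),                               # normal map
)
print(sample.shape)  # (256, 256, 7)
```

Other fusion strategies (e.g. separate encoders per modality) are of course possible; channel concatenation is simply the most direct baseline.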
We found that the performance of existing methods on synthetic data is in line
with the results obtained on real data, such as [9], though our dataset appears
to be more challenging. Moreover, since the use of 3D data is not common in the
AD field, we deployed a deep convolutional autoencoder trained to reconstruct
different combinations of inputs, showing that the inclusion of 3D data results in
better anomaly detection and localization performance.
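The autoencoder baseline can be sketched as follows. This is a generic convolutional autoencoder with a configurable input channel count, not the exact architecture used in our experiments; layer sizes are illustrative.

```python
import torch
import torch.nn as nn

class ConvAE(nn.Module):
    """Convolutional autoencoder over an arbitrary number of input channels,
    so color, depth and normal maps can be reconstructed jointly or alone."""
    def __init__(self, channels: int = 7):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(channels, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, channels, 4, stride=2, padding=1),
            nn.Sigmoid(),  # inputs are normalized to [0, 1]
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.decoder(self.encoder(x))

# The per-pixel reconstruction error serves as the anomaly map:
model = ConvAE(channels=7)
x = torch.rand(1, 7, 256, 256)  # RGB + depth + normals stacked on channels
out = model(x)
anomaly_map = (out - x).abs().mean(dim=1)  # shape (1, 256, 256)
```

Training only on anomaly-free samples, the network learns to reconstruct the clean domain, so defective regions tend to yield large residuals at test time.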
To explore the data and evaluate a method, please go to https://eyecan-ai.github.io/eyecandies.
Please refer to https://github.com/eyecan-ai/eyecandies for examples and tutorials
on how to use the Eyecandies dataset.
Fig. 1: Examples from the Eyecandies dataset. Each row shows good and bad
samples from the same object category (best viewed in color).
2 Related Work
Anomaly detection and localization on images (hereinafter AD) is a ubiquitous
theme in many fields, from autonomous driving [12] to visual industrial inspec-
tion [33,30,9]. Likewise, the use of synthetic datasets to evaluate the performance
of proposed methods has been already explored in many contexts [32,13,29,17].
However, very few works investigate how synthetic data can be effectively ex-
ploited to advance the AD field, which is indeed the focus of the dataset we are
presenting. In the next sections we will first review the publicly available datasets
for AD, then we will briefly analyze the most successful methods proposed to
solve the AD task, showing how our work may help the research community.
2.1 Anomaly Detection Datasets
Several public AD datasets exist, some designed for industrial inspection of
a very specialized type of object, while others try to be more generic. An
example of the former group is the Magnetic Tile Dataset [33], composed of
952 anomaly-free images and 5 types of anomalies, for a total of 1501 manually
annotated images of various resolutions. Despite being a reference in the field,
this dataset comprises only a single texture category and it is limited to grayscale
images. Though much larger, [30] is another similar dataset presented on Kaggle,
focused on a single object class.
The NanoTWICE dataset [14] provides high resolution images (1024x696),
although of little interest for deep learning approaches, since it is composed of
only 5 anomaly-free images and 40 images with anomalies of different sizes.
In [22] the authors generate a synthetic dataset of 1000 good images and 150
anomalous images, with ground-truth labels approximated by ellipses. The test
set comprises 2000 non defective images and 300 defective ones as 8-bit grayscale
with a resolution of 512x512. Though larger than usual, it shows low texture
variation and the ground-truth is very coarse. Instead, our synthetic pipeline
aims at photo-realistic images with large intra-class variation and pixel-precise
ground-truth masks.
MVTec AD [9], which is focused on the industrial inspection scenario, fea-
tures a total of 5354 real-world images, spanning across 5 texture and 10 object
categories. The test set includes 73 distinct types of anomalies (on average 5 per
category) with a total of 1725 images. Anomalous regions have been manually
annotated, though introducing small inconsistencies and unclear resolution of
missing object parts. In our work we purposely avoid these undefined situations,
while providing pixel-precise annotations in an automated way.
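Pixel-precise binary masks make localization performance unambiguous to measure. Assuming the common pixel-wise AUROC protocol (our choice of illustration, not a claim about the evaluation used by any specific benchmark), scores and masks are simply flattened across the test set:

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def pixel_auroc(score_maps, gt_masks):
    """Pixel-wise AUROC: flatten per-pixel anomaly scores and binary
    ground-truth masks over all test images, then compute ROC AUC."""
    scores = np.concatenate([s.ravel() for s in score_maps])
    labels = np.concatenate([m.ravel().astype(int) for m in gt_masks])
    return roc_auc_score(labels, scores)

# Toy example: a detector that scores every anomalous pixel strictly
# higher than every clean pixel achieves a perfect score.
mask = np.zeros((8, 8), dtype=bool)
mask[2:4, 2:4] = True          # hypothetical defect region
perfect_scores = mask.astype(float)
print(pixel_auroc([perfect_scores], [mask]))  # 1.0
```

Threshold-free metrics such as AUROC avoid picking an arbitrary operating point, which matters when comparing methods whose score distributions differ widely.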
MVTec LOCO AD [6] introduces the concept of “structural” and “logical”
anomalies: the former being local irregularities like scratches or dents, and the
latter being violations of underlying logical constraints that require a deeper
understanding of the scene. The dataset consists of 3644 images, distributed
across 6 categories. Though interesting and challenging, the detection of logical
anomalies is out of the scope of this work, where we focus on localized defects
only. Moreover, such defects are usually specific for a particular object class,
while we aim at automated and consistent defect generation. Finally, since the
subject is fairly new, there is no clear consensus on how to annotate the images
and evaluate the performance of a method.
MVTec 3D-AD [8] was the first 3D dataset for AD. The authors believe
that the use of 3D data is not common in the AD field due to the lack of
suitable datasets. They provide 4147 point clouds, acquired by an industrial 3D
sensor, and a complementary RGB image for 10 object categories. The test set
comprises 948 anomalous objects and 41 types of defects, all manually annotated.
The objects are captured on a black background, useful for data augmentation,
but not very common in real-world scenarios. Moreover, the use of a 3D device
caused the presence of occlusions, reflections and inaccuracies, introducing a
source of noise that may hinder a fair comparison of different AD proposals. Of
course, our synthetic generation does not suffer from such nuisances.
Synthetic generation of defective samples is introduced in [13] to enhance the
performance of an AD classifier. As in our work, they use Blender [11] to create
the new data, though they focus on combining real and synthetic images, while
we aim at providing a comprehensive dataset for evaluation and comparison.
Also, the authors of [13] did not release their dataset.
In [29] another non-publicly available dataset is presented. They render 2D
images from 3D models in a procedural way, where randomized parameters con-
trol defects, illumination, camera poses and texture. Their rendering pipeline is