The Eyecandies Dataset for Unsupervised
Multimodal Anomaly Detection and Localization
Luca Bonfiglioli*[0000-0002-7323-0662], Marco Toschi*[0000-0002-6573-2857],
Davide Silvestri[0000-0003-0727-7785], Nicola Fioraio[0000-0001-9969-0555], and
Daniele De Gregorio[0000-0001-8203-9176]
Eyecan.ai {luca.bonfiglioli, marco.toschi, davide.silvestri,
nicola.fioraio, daniele.degregorio}@eyecan.ai https://www.eyecan.ai
Abstract. We present Eyecandies, a novel synthetic dataset for unsu-
pervised anomaly detection and localization. Photo-realistic images of
procedurally generated candies are rendered in a controlled environment
under multiple lighting conditions, also providing depth and normal
maps in an industrial conveyor scenario. We make available anomaly-
free samples for model training and validation, while anomalous instances
with precise ground-truth annotations are provided only in the test set.
The dataset comprises ten classes of candies, each showing different chal-
lenges, such as complex textures, self-occlusions and specularities. Fur-
thermore, we achieve large intra-class variation by randomly drawing
key parameters of a procedural rendering pipeline, which enables the
creation of an arbitrary number of instances with photo-realistic appear-
ance. Likewise, anomalies are injected into the rendering graph and pixel-
wise annotations are automatically generated, overcoming human-biases
and possible inconsistencies.
We believe this dataset may encourage the exploration of original ap-
proaches to solve the anomaly detection task, e.g. by combining color,
depth and normal maps, as they are not provided by most of the exist-
ing datasets. Indeed, in order to demonstrate how exploiting additional
information may actually lead to higher detection performance, we show
the results obtained by training a deep convolutional autoencoder to
reconstruct different combinations of inputs.
Keywords: Synthetic Dataset · Anomaly Detection · Deep Learning.
1 Introduction
Recent years have seen an increasing interest in visual unsupervised anomaly
detection [34], the task of determining whether an example never seen before
presents any aspects that deviate from a defect-free domain, which was learned
during training. Similar to one-class classification [23,18,27], in unsupervised
anomaly detection the model has absolutely no knowledge of the appearance of
anomalous structures and must learn to detect them solely by looking at good
* Joint first authorship.
arXiv:2210.04570v1 [cs.CV] 10 Oct 2022
2 Bonfiglioli et al.
examples. There is a practical reason behind this apparent limitation: being
anomalies rare by definition, collecting real-world data with enough examples
of each possible deviation from a target domain may prove to be unreasonably
expensive. Furthermore, the nature of all possible anomalies might even be un-
known, so treating anomaly detection as a supervised classification task may
hinder the ability of the model to generalize to new unseen types of defects.
Historically, a common evaluation practice for proposed AD methods was to
exploit existing multi-class classification datasets, such as MNIST [21] and CI-
FAR [20], re-labeling a subset of related classes as inliers and the remaining as
outliers [25]. The major drawback of this practice is that clean and anomalous
domains are often completely unrelated, whereas in real-world scenarios, such as
industrial quality assurance or autonomous driving, anomalies usually appear as
subtle changes within a common scene, as for the anomalies presented in [6]. In
recent years this adaptation of classification datasets was discouraged in favor
of using new datasets specifically designed for visual anomaly detection and
localization, such as [9], which focuses on industrial inspection. However, most
of the available datasets provide only color images with ground-truth annota-
tions and very few add 3D information [8]. Moreover, all of them have to face
the problem of manual labelling, which can be human-biased and error-prone,
especially in the 3D domain.
The Eyecandies dataset is our main contribution to tackle these issues and
provide a new and challenging benchmark for unsupervised anomaly detec-
tion, including a total of 90000 photo-realistic shots of procedurally generated
synthetic objects, spanning across 10 classes of candies, cookies and sweets
(cf. Fig. 1). Different classes present entirely different shapes, color patterns
and materials, while intra-class variance is given by randomly altering param-
eters of the same model. The Eyecandies dataset comprises defect-free samples
for training, as well as anomalous ones used for testing, each of them with au-
tomatically generated per-pixel ground-truth labels, thus removing the need for
expensive (and often biased) manual annotation procedures. For each sample, we
also provide six renderings with different controlled lighting conditions, together
with ground-truth depth and normal maps, encouraging the exploration and
comparison of many alternative approaches.
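The six light conditions, depth and normal maps provided per sample lend themselves to early fusion into a single multi-channel input. A minimal sketch of such a fusion step is shown below; the function name and the per-modality min-max normalization are our own assumptions, not part of the dataset tooling.

```python
import numpy as np

def stack_modalities(rgb: np.ndarray, depth: np.ndarray,
                     normals: np.ndarray) -> np.ndarray:
    """Concatenate an RGB image (H, W, 3), a depth map (H, W) and a normal
    map (H, W, 3) into one (H, W, 7) float tensor, each modality rescaled
    to [0, 1] so that no channel dominates the reconstruction loss."""
    def norm(x: np.ndarray) -> np.ndarray:
        x = x.astype(np.float32)
        span = x.max() - x.min()
        return (x - x.min()) / span if span > 0 else np.zeros_like(x)

    depth = depth[..., None]  # add a channel axis to the depth map
    return np.concatenate([norm(rgb), norm(depth), norm(normals)], axis=-1)

# Example with random stand-in data (real samples would be loaded from disk):
sample = stack_modalities(
    np.random.randint(0, 256, (256, 256, 3), dtype=np.uint8),  # color image
    np.random.rand(256, 256),                                  # depth map
    np.random.rand(256, 256, 3),                               # normal map
)
print(sample.shape)  # (256, 256, 7)
```

Other fusion strategies (e.g. separate encoders per modality) are of course possible; channel concatenation is simply the most direct baseline.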
We found that the performance of existing methods on synthetic data is in line
with the results obtained on real data, such as [9], though our dataset appears
to be more challenging. Moreover, since the use of 3D data is not common in the
AD field, we deployed a deep convolutional autoencoder trained to reconstruct
different combinations of inputs, showing that the inclusion of 3D data results in
better anomaly detection and localization performance.
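The autoencoder baseline can be sketched as follows. This is a generic convolutional autoencoder with a configurable input channel count, not the exact architecture used in our experiments; layer sizes are illustrative.

```python
import torch
import torch.nn as nn

class ConvAE(nn.Module):
    """Convolutional autoencoder over an arbitrary number of input channels,
    so color, depth and normal maps can be reconstructed jointly or alone."""
    def __init__(self, channels: int = 7):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(channels, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, channels, 4, stride=2, padding=1),
            nn.Sigmoid(),  # inputs are normalized to [0, 1]
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.decoder(self.encoder(x))

# The per-pixel reconstruction error serves as the anomaly map:
model = ConvAE(channels=7)
x = torch.rand(1, 7, 256, 256)  # RGB + depth + normals stacked on channels
out = model(x)
anomaly_map = (out - x).abs().mean(dim=1)  # shape (1, 256, 256)
```

Training only on anomaly-free samples, the network learns to reconstruct the clean domain, so defective regions tend to yield large residuals at test time.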
To explore the data and evaluate a method, please go to https://eyecan-ai.github.io/eyecandies.
Please refer to https://github.com/eyecan-ai/eyecandies for examples and tutorials
on how to use the Eyecandies dataset.
Fig. 1: Examples from the Eyecandies dataset. Each row shows good and bad
samples from the same object category (best viewed in color).
2 Related Work
Anomaly detection and localization on images (hereinafter AD) is a ubiquitous
theme in many fields, from autonomous driving [12] to visual industrial inspec-
tion [33,30,9]. Likewise, the use of synthetic datasets to evaluate the performance
of proposed methods has been already explored in many contexts [32,13,29,17].
However, very few works investigate how synthetic data can be effectively ex-
ploited to advance the AD field, which is indeed the focus of the dataset we are
presenting. In the next sections we will first review the publicly available datasets
for AD, then we will briefly analyze the most successful methods proposed to
solve the AD task, showing how our work may help the research community.
2.1 Anomaly Detection Datasets
Several public AD datasets exist, some designed for industrial inspection of
a very specialized type of object, while others try to be more generic. An
example of the former group is the Magnetic Tile Dataset [33], composed of
952 anomaly-free images and 5 types of anomalies, for a total of 1501 manually
annotated images of various resolutions. Despite being a reference in the field,
this dataset comprises only a single texture category and it is limited to grayscale
images. Though much larger, [30] is another similar dataset presented on Kaggle,
focused on a single object class.
The NanoTWICE dataset [14] provides high resolution images (1024x696),
although of little interest for deep learning approaches, since it is composed of
only 5 anomaly-free images and 40 images with anomalies of different sizes.
In [22] the authors generate a synthetic dataset of 1000 good images and 150
anomalous images, with ground-truth labels approximated by ellipses. The test
set comprises 2000 non defective images and 300 defective ones as 8-bit grayscale
with a resolution of 512x512. Though larger than usual, it shows low texture
variation and the ground-truth is very coarse. Instead, our synthetic pipeline
aims at photo-realistic images with large intra-class variation and pixel-precise
ground-truth masks.
MVTec AD [9], which is focused on the industrial inspection scenario, fea-
tures a total of 5354 real-world images, spanning across 5 texture and 10 object
categories. The test set includes 73 distinct types of anomalies (on average 5 per
category) with a total of 1725 images. Anomalous regions have been manually
annotated, though introducing small inconsistencies and unclear resolution of
missing object parts. In our work we purposely avoid these undefined situations,
while providing pixel-precise annotations in an automated way.
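Pixel-precise binary masks make localization performance unambiguous to measure. Assuming the common pixel-wise AUROC protocol (our choice of illustration, not a claim about the evaluation used by any specific benchmark), scores and masks are simply flattened across the test set:

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def pixel_auroc(score_maps, gt_masks):
    """Pixel-wise AUROC: flatten per-pixel anomaly scores and binary
    ground-truth masks over all test images, then compute ROC AUC."""
    scores = np.concatenate([s.ravel() for s in score_maps])
    labels = np.concatenate([m.ravel().astype(int) for m in gt_masks])
    return roc_auc_score(labels, scores)

# Toy example: a detector that scores every anomalous pixel strictly
# higher than every clean pixel achieves a perfect score.
mask = np.zeros((8, 8), dtype=bool)
mask[2:4, 2:4] = True          # hypothetical defect region
perfect_scores = mask.astype(float)
print(pixel_auroc([perfect_scores], [mask]))  # 1.0
```

Threshold-free metrics such as AUROC avoid picking an arbitrary operating point, which matters when comparing methods whose score distributions differ widely.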
MVTec LOCO AD [6] introduces the concept of “structural” and “logical”
anomalies: the former being local irregularities like scratches or dents, and the
latter being violations of underlying logical constraints that require a deeper
understanding of the scene. The dataset consists of 3644 images, distributed
across 6 categories. Though interesting and challenging, the detection of logical
anomalies is out of the scope of this work, where we focus on localized defects
only. Moreover, such defects are usually specific for a particular object class,
while we aim at automated and consistent defect generation. Finally, since the
subject is fairly new, there is no clear consensus on how to annotate the images
and evaluate the performance of a method.
MVTec 3D-AD [8] was the first 3D dataset for AD. The authors believe
that the use of 3D data is not common in the AD field due to the lack of
suitable datasets. They provide 4147 point clouds, acquired by an industrial 3D
sensor, and a complementary RGB image for 10 object categories. The test set
comprises 948 anomalous objects and 41 types of defects, all manually annotated.
The objects are captured on a black background, useful for data augmentation,
but not very common in real-world scenarios. Moreover, the use of a 3D device
caused the presence of occlusions, reflections and inaccuracies, introducing a
source of noise that may hinder a fair comparison of different AD proposals. Of
course, our synthetic generation does not suffer from such nuisances.
Synthetic generation of defective samples is introduced in [13] to enhance the
performance of an AD classifier. As in our work, they use Blender [11] to create
the new data, though they focus on combining real and synthetic images, while
we aim at providing a comprehensive dataset for evaluation and comparison.
Also, the authors of [13] did not release their dataset.
In [29] another non-publicly available dataset is presented. They render 2D
images from 3D models in a procedural way, where randomized parameters con-
trol defects, illumination, camera poses and texture. Their rendering pipeline is