Reconstruction from edge image combined with color and gradient difference for industrial surface anomaly detection

Tongkun Liu (a), Bing Li (a,b), Zhuo Zhao (a,*), Xiao Du (a), Bingke Jiang (a), Leqi Geng (a)
(a) State Key Laboratory for Manufacturing System Engineering, Xi'an Jiaotong University, No.99 Yanxiang Road, Yanta District, 710054, Xi'an, Shaanxi, China
(b) International Joint Research Laboratory for Micro/Nano Manufacturing and Measurement Technologies, Xi'an Jiaotong University, No.99 Yanxiang Road, Yanta District, 710054, Xi'an, Shaanxi, China
Abstract
Reconstruction-based methods are widely explored in industrial visual anomaly detection. Such methods commonly require the model to reconstruct normal patterns well but fail on anomalies, so that anomalies can be detected by evaluating the reconstruction errors. In practice, however, it is usually difficult to control the generalization boundary of the model. A model with overly strong generalization capability can reconstruct even the abnormal regions well, making them less distinguishable, while a model with poor generalization capability cannot reconstruct the variable high-frequency components in normal regions, which ultimately leads to false positives. To tackle this issue, we propose a new reconstruction network that reconstructs the original RGB image from its gray value edges (EdgRec). Specifically, this is achieved by a UNet-type denoising autoencoder with skip connections. The input edge and skip connections preserve the high-frequency information in the original image. Meanwhile, the proposed restoration task forces the network to memorize the normal low-frequency and color information. In addition, the denoising design prevents the model from directly copying the original high-frequency components. To evaluate the anomalies, we further propose a new interpretable hand-crafted evaluation function that considers both color and gradient differences. Our method achieves competitive results on the challenging benchmark MVTec AD (97.8% for detection and 97.7% for localization, AUROC). In addition, we conduct experiments on the MVTec 3D-AD dataset and show convincing results using RGB images only. Our code will be available at https://github.com/liutongkun/EdgRec.
Keywords: Anomaly detection, Surface defect detection, Denoising autoencoder, MVTec AD, MVTec 3D
1. Introduction
Industrial surface anomaly detection aims at identifying all types of visible defects that may occur during manufacturing. It plays an important role in manufacturing quality control and has received increasing attention in recent years. Although existing supervised models have achieved good performance in many vision tasks, they are not widely adopted in industrial surface defect detection. The main reason is that a qualified production line cannot produce enough defective samples to train such supervised models. More importantly, supervised models are likely to fail on unseen types of defects that are not included in the training set. These issues can be mitigated by unsupervised anomaly detection models. For one thing, there are always adequate normal samples in the production line, which are sufficient to train an anomaly detection model without any defective samples. For another, a model trained on normal samples captures only the distribution of the normal data. Therefore, ideally, it is able to detect any unknown defects whose distributions deviate from the normal distribution.
Corresponding author at: School of Mechanical Engineering, Xi'an Jiaotong University, Xi'an, Shaanxi, China.
Existing methods for industrial visual anomaly detection can be mainly divided into feature-based methods and reconstruction-based methods. Feature-based methods project the original image into a more distinguishable feature space through ImageNet pre-trained networks or self-supervised tasks. Generally, these methods achieve higher performance than reconstruction-based methods, but they are less interpretable and adjustable, since it is difficult for engineers to interpret the abstract feature vectors. Reconstruction-based methods assume that a model trained on normal samples can reconstruct normal patterns well but fails on anomalies, so that anomalies can be detected by comparing the original and reconstructed images through an anomaly evaluation function. Compared to feature-based methods, reconstruction-based methods are easier to understand visually because one can directly observe the differences between the original and reconstructed images. By designing specific comparing functions, such methods can be easily adjusted for specific situations with human prior knowledge (e.g., if we only need to detect color anomalies, we can design just the color comparing function and ignore other differences).
arXiv:2210.14485v1 [cs.CV] 26 Oct 2022
Currently, most reconstruction-based methods underperform because it is hard to control the boundary of the model's generalization capability. Concretely, a model that cannot reconstruct the anomalies is also likely to fail in reconstructing variable normal patterns; conversely, an overly generalizable model may generalize to the anomalies, i.e., reconstruct them well and thus make them less distinguishable. Besides, the image will inevitably be degraded during the reconstruction process (e.g., blurred results in variable regions), which brings challenges to the design of the anomaly evaluation function. Directly using the pixel-level l2 distance to compare the original and reconstructed images is usually unfavorable, since it may cause many false alarms in those degraded normal regions.
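To make the false-alarm argument concrete, the following is a minimal sketch rather than the authors' code. It assumes that reconstruction degradation can be imitated by a 3x3 mean filter (the hypothetical helper `box_blur`) and applies a plain per-pixel squared l2 distance to a defect-free synthetic image whose right half carries a fine checkerboard texture.

```python
import numpy as np

def l2_anomaly_map(original: np.ndarray, reconstructed: np.ndarray) -> np.ndarray:
    """Per-pixel squared l2 distance between two H x W x 3 images in [0, 1]."""
    diff = original.astype(np.float64) - reconstructed.astype(np.float64)
    return np.sum(diff ** 2, axis=-1)

def box_blur(image: np.ndarray) -> np.ndarray:
    """3x3 mean filter with edge padding (assumed stand-in for reconstruction blur)."""
    padded = np.pad(image, ((1, 1), (1, 1), (0, 0)), mode="edge")
    h, w = image.shape[:2]
    acc = np.zeros_like(image, dtype=np.float64)
    for dy in range(3):
        for dx in range(3):
            acc += padded[dy:dy + h, dx:dx + w]
    return acc / 9.0

# Defect-free image: smooth left half, fine checkerboard texture on the right half.
yy, xx = np.mgrid[0:32, 0:32]
checker = ((yy + xx) % 2).astype(np.float64)
gray = np.where(xx < 16, 0.5, checker)
normal = np.repeat(gray[..., None], 3, axis=-1)

recon = box_blur(normal)              # imitates a slightly blurred reconstruction
l2_map = l2_anomaly_map(normal, recon)
smooth_err = l2_map[:, :12].mean()    # smooth normal region
texture_err = l2_map[:, 20:].mean()   # textured normal region
```

Although the image contains no defect, the textured half receives a large l2 score while the smooth half scores zero, which is the kind of false alarm a thresholded l2 map would raise in degraded normal regions.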
Considering the above issues, this paper proposes to boost the performance of reconstruction-based methods from two aspects. First, we propose to reconstruct the image from its gray value edge, which is motivated by [1]. Since the edge retains only part of the content of the original image, the network needs to generate the removed normal low-frequency and color contents during training. When testing abnormal images, the model is less likely to generate accurate abnormal patterns, as it only sees part of the content (the abnormal edge) of the image. Meanwhile, the input edge preserves the important original high-frequency components, which are usually the hardest parts to reconstruct [2]. We further use skip connections to reduce the loss of those components in the down-sampling process. Consequently, the edge and skip connections help to better reconstruct complex high-frequency normal regions and therefore yield fewer false alarms. On the other hand, the above operations may cause the model to directly copy the edge from the input. Therefore, we use a denoising autoencoder design that corrupts the original edge with multi-scale pseudo anomalies to avoid an identity mapping on the edge region. Fig. 1 illustrates the main structure of our reconstruction network.
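The corruption step can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the hypothetical `blob_mask` uses coarse random noise upsampled by pixel repetition as a simple stand-in for a thresholded Perlin-noise mask, the texture is a random array rather than an image from an external texture dataset, and the blending weight `beta` and the set of scales `cells` are arbitrary choices.

```python
import numpy as np

def blob_mask(shape, cell, threshold, rng):
    """Binary mask from coarse random noise upsampled by pixel repetition.

    Stand-in for a thresholded Perlin-noise mask; `cell` sets the blob scale.
    """
    h, w = shape
    coarse = rng.random((-(-h // cell), -(-w // cell)))   # ceil division
    fine = np.kron(coarse, np.ones((cell, cell)))[:h, :w]
    return (fine > threshold).astype(np.float64)

def corrupt(image, texture, rng, cells=(4, 8, 16), threshold=0.8, beta=0.7):
    """Blend `texture` into `image` inside a blob mask drawn at a random scale."""
    cell = int(rng.choice(cells))                          # multi-scale: pick one scale
    mask = blob_mask(image.shape[:2], cell, threshold, rng)[..., None]
    blended = (1.0 - beta) * image + beta * texture
    corrupted = (1.0 - mask) * image + mask * blended
    return corrupted, mask[..., 0]

rng = np.random.default_rng(0)
image = np.full((64, 64, 3), 0.5)      # placeholder for a normal training image
texture = rng.random((64, 64, 3))      # placeholder for an external texture
corrupted, mask = corrupt(image, texture, rng)
```

During training, the edge would be extracted from `corrupted` while the reconstruction target stays the clean `image`, so copying the input edge no longer minimizes the loss inside the masked regions.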
Second, we propose a new color-based evaluation function and combine it with the existing gradient-based function [3,4] as our anomaly evaluation function. This function can effectively detect anomalies while reducing false alarms caused by image degradation in normal areas.
Finally, our contributions are summarized as follows:
(1) We propose a new reconstruction method for industrial visual anomaly detection where we reconstruct the images from their edges. Our specific design can effectively control the generalization capability of the model between anomalies and normal regions.
(2) We propose a new color-based anomaly evaluation function to detect color anomalies for reconstruction-based methods. Our function can effectively detect color anomalies and is insensitive to light changes.
(3) We achieve comparable results on the challenging benchmark MVTec AD and the MVTec 3D-AD dataset (using RGB images only) for both anomaly detection and segmentation. Specifically, our anomaly evaluation function is totally hand-crafted and therefore more interpretable and adjustable compared to latent feature-based evaluation functions (e.g., a separate discriminative network or the perceptual loss).
[Figure 1: two panels. Training: encoder, decoder, loss, back propagation. Testing: encoder, decoder, comparing function.]
Fig. 1. In the training phase, we first corrupt the original image I with certain noise to get I_A; we then convert it to the grayscale image I_A^g and extract the edge I_A^e. Our training goal is to make the reconstructed image I_R as close as possible to I. In the testing phase, we extract the grayscale edge I^e of the original test image I and reconstruct it to the RGB image I_R. The anomaly map A is obtained by comparing the original and the reconstructed images via the comparing function.
2. Related Work
Visual anomaly detection is a widely studied topic with applications in medical diagnosis [5], surveillance [6], industrial inspection [7], etc., where there are usually adequate normal samples while abnormal samples are rare and diverse. In this paper, we focus on its application in industrial surface defect detection. This task may be more challenging, since it requires the model not only to identify whether anomalies exist in the image (anomaly detection) but also to accurately locate the abnormal areas (anomaly segmentation). Bergmann et al. (2019) [8] propose MVTec AD (detailed in Sec. 4.1), a comprehensive dataset for industrial visual anomaly detection including 15 different industrial products. This dataset quickly became the most convincing benchmark in industrial visual anomaly detection and has sparked much research. Here we divide the existing research into two main groups: feature-based and reconstruction-based.
2.1. Feature-based methods
Feature-based methods aim to find a feature space where the normal and abnormal features are fully distinguishable. Since no abnormal samples exist during training, it is preferable to leverage ImageNet pre-trained networks [9-15] or models obtained through self-supervised tasks [16-18] as feature extractors. For pre-trained networks, several studies [10,15] have found that it is important to select appropriate hierarchy levels of features, because low-level features lack global awareness, while extremely high-level features may be biased toward the pre-training task itself. The pre-trained network can also be used as a teacher network to detect anomalies by knowledge distillation [9]. For self-supervised methods, the key is to design suitable auxiliary tasks. Li et al. [16] propose to use CutPaste augmentation to train a one-class classifier. Other auxiliary tasks include position prediction [17], geometric transformation prediction [18], etc.
Overall, benefiting from the powerful representation capabilities of deep features, feature-based methods can achieve better performance than existing reconstruction-based methods. In particular, [10] achieves state-of-the-art performance on MVTec AD. However, these methods are hard to optimize for a specific situation, since deep features are too abstract to incorporate prior knowledge.
2.2. Reconstruction-based methods
Reconstruction-based methods commonly leverage generative models such as autoencoders [4,19,20], VAEs [21], GANs [5,22], etc., to detect anomalies in the image space. Generally, these methods contain two steps: 1. Reconstruct the image; 2. Compare the original and reconstructed images to get anomaly maps.
Reconstruct the image. Early works mainly leverage denoising autoencoders [20,23,24] to help the network better capture the normal distribution and avoid learning an identity mapping. In the training phase, these methods corrupt the original image with certain noise and make the network eliminate it. In addition to low-level noise such as Gaussian noise, cutout, stain, etc., the image can also be corrupted by semantic transformations, such as geometric transformations [25-27], color transformations [1], inpainting masks [4,28,29], etc., which are summarized into an attribute removal-and-restoration framework by Ye et al. [1]. They argue that the network can learn more robust features during the process of restoring the previously removed attributes. Following this paradigm, we propose a specific attribute removal-and-restoration task where the low-frequency and color attributes are the main attributes to be restored.
Compare the images. After the reconstruction, the anomalies can be detected by comparing the original and reconstructed images. Early comparing functions include the l2 distance, structural similarity (SSIM) [30], etc. Furthermore, Zavrtanik et al. [4] introduce a multi-scale gradient map (MSGMS) anomaly evaluation function which significantly boosts the performance. However, MSGMS performs poorly on low-frequency color anomalies. Later, Zavrtanik et al. [31] further propose to use a separate discriminative network (DRAEM) which takes the concatenation of the original and reconstructed images as input and detects the anomalies via image segmentation. While DRAEM achieves remarkable performance on MVTec AD, the additional discriminative network introduces extra latent features and therefore makes the segmentation results less interpretable. Similarly, the current state-of-the-art reconstruction-based method OCR-GAN [22] also leverages latent space features and combines them with the l1 distance to detect anomalies. In contrast, in this paper we still focus on hand-crafted anomaly score functions, which are more interpretable and adjustable. Concretely, we propose a new color comparing function and combine it with the existing MSGMS function. The proposed function can effectively detect various anomalies.
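The weakness of gradient-based comparison on low-frequency color anomalies can be illustrated with a small sketch. It is a single-scale simplification, not the MSGMS formulation of [4]: gradients come from central finite differences via `np.gradient`, grayscale is a plain channel average, and the stabilizing constant `c` is an arbitrary assumption.

```python
import numpy as np

def gradient_magnitude(gray):
    """Gradient magnitude of an H x W image from central finite differences."""
    gy, gx = np.gradient(gray.astype(np.float64))
    return np.sqrt(gx ** 2 + gy ** 2)

def gms_anomaly_map(original, reconstructed, c=1e-4):
    """1 - gradient magnitude similarity, computed on channel-averaged grayscale."""
    g_o = gradient_magnitude(original.mean(axis=-1))
    g_r = gradient_magnitude(reconstructed.mean(axis=-1))
    gms = (2.0 * g_o * g_r + c) / (g_o ** 2 + g_r ** 2 + c)
    return 1.0 - gms

# Reconstruction: flat gray. Original: the same image with a flat reddish patch,
# i.e. a low-frequency color anomaly that the reconstruction did not reproduce.
recon = np.full((32, 32, 3), 0.5)
original = recon.copy()
original[8:24, 8:24] = [0.9, 0.4, 0.4]

gms_map = gms_anomaly_map(original, recon)
l2_map = np.sum((original - recon) ** 2, axis=-1)
interior = (slice(12, 20), slice(12, 20))   # well inside the patch
```

In this toy case the gradient-based map responds only along the patch border and is zero in its interior, whereas a direct color difference is large across the whole patch; this is the gap that a dedicated color comparing function is meant to close.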
3. Methods
Our reconstruction framework is based on a UNet-type encoder-decoder network with the corrupted grayscale edge as input. Specifically, we first corrupt the original image with certain noise; then we convert the corrupted image into a grayscale edge; after that, we train a network to reconstruct the original image from its corrupted edge; finally, we discuss how to design the anomaly evaluation function.
3.1. Get the corrupted edges
Our basic idea is to formulate an attribute removal-and-restoration task that is suitable for various industrial anomaly detection scenarios. Specifically, we construct a 'grayscale edge to RGB image' task where we remove the low-frequency and color attributes of the original image and train a network to restore them. This design is based on two considerations. First, low-frequency and color contents are general attributes of various images. There also exist other tasks, such as the restoration of geometrically transformed images [18,25-27]. However, compared with our design, these methods are less general, e.g., the above geometric transformation framework cannot be applied to spatially invariant textures, while our design can be applied to both texture and object images. Second, preserving edge information enables the network to better reconstruct the details in normal patterns, which can effectively reduce the false positive rate in complex normal areas. On the other hand, preserving the edges may also lead the model to produce identity mappings of the original high-frequency components. To avoid this, we first corrupt the original image with certain noise.
We adopt the strategy proposed in [31] to generate simulated anomalies whose textures come from the external texture dataset DTD [32] and whose shapes are given by randomly generated Perlin noise. However, we observe that if we only use these out-of-distribution textures as pseudo anomalies, the model cannot distinguish the foreground and background areas well. This makes it difficult to detect structural defects caused by missing components. Therefore,