DEGAN: TIME SERIES ANOMALY DETECTION USING
GENERATIVE ADVERSARIAL NETWORK DISCRIMINATORS AND
DENSITY ESTIMATION
Yueyan Gu
Ph.D. Student
Department of Civil and Environmental Engineering
Virginia Tech
Blacksburg, VA, USA
yueyangu@vt.edu
Farrokh Jazizadeh
Associate Professor
Department of Civil and Environmental Engineering
Virginia Tech
Blacksburg, VA, USA
jazizade@vt.edu
Corresponding Author
ABSTRACT
Developing efficient anomaly detection techniques is important for maintaining service quality and
providing early alarms in industrial systems. Many related efforts have been based on time series
data, a ubiquitous form for measuring, recording, and analyzing the dynamics of different processes.
Unsupervised methods have been more popular in anomaly detection due to the challenges associated
with labeled datasets. Generative neural network methods are one class of unsupervised approaches
that has attracted increasing attention in recent years. In this paper, we propose an unsupervised
Generative Adversarial Network (GAN)-based anomaly detection framework, DEGAN. It relies solely on
normal time series data as input to train a well-configured discriminator (D) into a standalone
anomaly predictor. In this framework, time series data are processed by the sliding window method.
Expected normal patterns in the data are leveraged to develop a generator (G) capable of generating
normal data patterns. Normal data are also utilized in the hyperparameter tuning and D model
selection steps. Validated D models are then extracted and applied to evaluate unseen (testing) time
series and identify patterns that have anomalous characteristics. Kernel density estimation (KDE) is
applied to data points that are likely to be anomalous to generate probability density functions on
the testing time series. The segments of the testing time series with the highest relative
probabilities are detected as anomalies. To evaluate the performance of the framework, we used a
case study dataset of univariate acceleration time series for five miles of a Class I railroad
track. We implemented the proposed approach to detect the real anomalous observations identified by
operators. The results show that leveraging the framework with a convolutional neural network D
architecture results in average best recall and precision of 80% and 86%, respectively, which
demonstrates that a well-trained standalone D model has the potential to be a reliable anomaly
detector. Moreover, the influence of GAN hyperparameters, GAN architectures, sliding window sizes,
clustering of time series, and model validation with labeled/unlabeled data was also investigated
to provide insight into their impact.
Keywords
Unsupervised learning · Anomaly detection · Defect detection · Generative Adversarial Network ·
Density estimation · Railroad
arXiv:2210.02449v1 [cs.LG] 5 Oct 2022
1 Introduction
Anomaly detection, namely the identification of data patterns that deviate from the normal state of
a system, is a classic problem in machine learning, where the quantities of abnormal and normal
patterns are usually highly imbalanced. Popular application scenarios include, but are not limited
to, financial systems [1], medical diagnosis [2], building/infrastructure systems [3], and network
security [4]. The motion and dynamics of industrial devices, weather and climate, and personal and
socioeconomic activities are typically represented by time series, i.e., successions of data points
over time. Therefore, time series anomaly detection is significant for managing the overall service
quality of a variety of systems and the Internet of Things [5], such as building/infrastructure
systems, including utility consumption (water, electricity, gas), structural health monitoring
(deflection, displacement), and indoor environment (temperature, air quality). Although the
significance of anomaly detection for time series data is well recognized, it remains a challenge
because of the complicated temporal dependence and stochastic nature of such data [6].
Existing time series anomaly detection methods can generally be divided into the following
categories: statistical approaches (e.g., Autoregressive Model, AutoRegressive Integrated Moving
Average (ARIMA) Model, Simple Exponential Smoothing), classical machine learning approaches (e.g.,
k-means clustering, isolation forest, one-class support vector machines, extreme gradient boosting),
neural network approaches (e.g., Multilayer Perceptron, Convolutional Neural Networks, Long
Short-Term Memory networks), and generative methods (e.g., Autoencoders, Generative Adversarial
Networks) [7]. In addition, depending on how labeled data are used, anomaly detection methods can
also be categorized into supervised and unsupervised approaches. Supervised approaches need labels
for the input time series to differentiate anomalous and normal observations, while unsupervised
anomaly detection methods depend solely on unlabeled data. Because labeling is inefficient, time
series anomaly detection is more commonly treated as a machine learning problem in an unsupervised
paradigm [8]. With increasing computing power, more advanced machine learning approaches have
emerged. Generative Adversarial Networks (GANs), since their introduction in 2014 [9], have gained
much popularity in image generation, data augmentation, and image-to-image translation. In recent
years, they have also been applied to anomaly detection, mainly by relying on loss scores as metrics
for anomalous pattern recognition. However, only limited research has been conducted on the
potential of standalone discriminator (D) models in GANs for anomaly detection. To this end, we
propose an unsupervised method of GAN-based density estimation for time series, DEGAN, which learns
the characteristics of normal data patterns by training and validating GAN models on normal data
observations. The D model is then extracted to identify patterns that have anomalous features.
Kernel density estimation (KDE) is applied to generate probability density functions on the testing
time series.
The highlights of the DEGAN framework are as follows: (1) it relies on a well-trained D model as a
standalone anomaly detection model; (2) it does not need labeled data for training or for selecting
the optimal D model; and (3) it can reach a relatively high recall while keeping recall and
precision well balanced. We have evaluated DEGAN using a real-world case study, i.e., detecting
anomalous observations on a Class I railroad track inspection dataset.
The rest of this paper is organized as follows. Section 2 introduces related research on time series
anomaly detection methods and Generative Adversarial Networks. Section 3 presents the DEGAN
framework and elaborates on different framework design considerations. Section 4 introduces the case
study and evaluates the performance of the framework, including a discussion of the adopted
performance metrics and the factors that influence the overall performance. Finally, Section 5
summarizes the main contributions and results of this paper.
2 Related work
Anomaly detection has been a popular research direction because of its value in monitoring the
conditions of different systems and providing timely alarms. The choice of a specific anomaly
detection method usually depends on the type of data. Given our focus on time series anomaly
detection and GAN-based frameworks, the scope of this review has been narrowed down to
time-series-based and GAN-based anomaly detection techniques.
Time series anomaly detection is usually carried out in an unsupervised paradigm and can be
challenging because of noise and temporal dependencies in the data [6]. As early as 1977, a
statistical approach was proposed by Tukey [10] to detect anomalies in time series. Meanwhile, as
noticeable progress has been achieved in developing machine learning approaches in the past few
decades, many of them have been applied to anomaly detection problems. For example, k-means
clustering [11] is an algorithm that can be executed on the subsequences of a time series dataset
and converges to k centroids. The distance from a new testing sequence to its nearest centroid can
then be evaluated as the error. An anomaly is reported when the corresponding error is higher than a
preset threshold. Along the same line of distance-based techniques, in 2003, Ma et al. [12] utilized
One-Class Support Vector Machines (OC-SVM) to detect novelties (anomalies) in time series as
outliers of the normal distribution, where the vectors were converted into a projected space. In
2012, Liu et al. [13] proposed Isolation Forest (iForest), which isolates anomalies using binary
trees, without conforming to the normal distribution.
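The distance-to-nearest-centroid idea described above can be sketched as follows. This is a minimal
illustration using only the Python standard library, not the implementation from [11]; the function
names, the seeded random initialization, and the threshold value are assumptions made for the
example.

```python
import math
import random


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's algorithm; returns k centroids as lists of floats."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: math.dist(p, centroids[j]))
            clusters[nearest].append(p)
        for j, members in enumerate(clusters):
            if members:  # keep the previous centroid if a cluster is empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def nearest_centroid_error(window, centroids):
    """Euclidean distance from a subsequence to its closest centroid."""
    return min(math.dist(window, c) for c in centroids)


def is_anomalous(window, centroids, threshold):
    """Report an anomaly when the error exceeds the preset threshold."""
    return nearest_centroid_error(window, centroids) > threshold


# Toy subsequences of length 3 that form two normal patterns.
normal_windows = [
    [0.0, 0.1, 0.0], [0.1, 0.0, 0.1], [0.0, 0.0, 0.1],
    [1.0, 1.1, 1.0], [1.1, 1.0, 0.9], [0.9, 1.0, 1.0],
]
centroids = kmeans(normal_windows, k=2)
```

In practice, the threshold would have to be chosen from the distribution of errors on data known to
be normal; the value used in any example is arbitrary.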
With increasing computing power, deep learning approaches have attracted more attention over the
past decade. These methods generally detect anomalies by comparing a new object with the normal
distributions predicted from given historical data. Long Short-Term Memory (LSTM) networks are
known as a useful tool for learning the longer-term patterns contained in sequences. In 2015,
Malhotra et al. [14] demonstrated the use of stacked LSTM networks for multiple time series anomaly
detection scenarios, such as ECG, power demand, and multi-sensor engine data, where a network is
trained on normal data and employed to predict future steps. In 2018, Munir et al. [15] proposed
DeepAnt, a Convolutional Neural Network (CNN) architecture, to predict expected patterns over a
short future horizon. Using this CNN model, non-conforming patterns in the data are detected as
anomalies.
Although anomaly detection methods usually cater to specific problems, GAN, as an efficient tool
applied to image anomaly detection [16], has provided great inspiration for anomaly detection in
time series. Sun et al. [17] characterized it as similar to the human decision-making process: G is
responsible for learning from previous data, and D functions as a relatively independent anomaly
detector that draws on G's knowledge of previous experience. In existing studies, GAN has
facilitated anomaly detection of non-graphic data in the following two aspects:
Supervised approaches with oversampling of abnormal data: GAN is widely used to generate convincing
synthetic data in various scenarios, which can be utilized to alleviate the challenge of imbalanced
data by creating synthetic anomalies [18]. Intuitively, time series can be converted to images, thus
treating non-graphic data oversampling as an image generation problem. Salem et al. [19] converted
integer-based intrusion data into images and utilized a Cycle-GAN to oversample the anomalies. Their
study showed that, with GAN as a supplementary oversampling tool, the anomaly detection performance
improved. This could be specifically beneficial for an inherently imbalanced dataset.
Unsupervised approaches via defining an anomaly score with G and/or D loss: G's training loss is a
measurement of the residuals between the normal/real input data and the new samples generated
(reconstructed) by G, while D's training loss represents D's ability to distinguish real/normal from
fake/abnormal. By combining both losses, G and D can both be leveraged in constructing an anomaly
score to detect a sequence with anomalous patterns. Li et al. [20] proposed this idea in 2019 and
applied it to unsupervised multivariate time series anomaly detection with LSTM-RNN as the base
model of G and D. A similar idea has also been implemented in [21], [8], [22], and [23]. The
multivariate time series can also be transformed into 2D images by calculating distance matrices,
which turns the problem into image anomaly detection with similarly defined GAN-based anomaly scores
[24]. Moreover, some research efforts have been based only on G loss [25], [26], [22]. Overall,
previous methods for GAN-based time series anomaly detection focus either on G or on the combination
of both G and D, while only a few efforts [17], [22] have been made to leverage the potential of an
independent D, which relies solely on D's classification result.
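To make the combined-score idea above concrete, the following is a rough sketch of how a
reconstruction residual from G and a discrimination loss from D could be blended into a single
anomaly score. It is not the exact formulation of [20] or of any other cited work: the mean absolute
residual, the cross-entropy term, and the `weight` parameter are all illustrative assumptions.

```python
import math


def reconstruction_residual(real_window, generated_window):
    """Mean absolute difference between a real window and G's reconstruction."""
    pairs = zip(real_window, generated_window)
    return sum(abs(r - g) for r, g in pairs) / len(real_window)


def discrimination_loss(d_real_probability, eps=1e-12):
    """Cross-entropy of D's output against the 'real/normal' label."""
    return -math.log(max(d_real_probability, eps))


def anomaly_score(real_window, generated_window, d_real_probability, weight=0.5):
    """Weighted blend of both terms; `weight` is a hypothetical tuning knob."""
    residual = reconstruction_residual(real_window, generated_window)
    discrimination = discrimination_loss(d_real_probability)
    return weight * residual + (1.0 - weight) * discrimination


# A window that G reconstructs well and D believes to be real ...
score_normal = anomaly_score([0.1, 0.2, 0.1], [0.1, 0.2, 0.1], 0.95)
# ... versus one that G cannot reconstruct and D rejects.
score_anomalous = anomaly_score([0.1, 2.5, 0.1], [0.1, 0.2, 0.1], 0.10)
```

A standalone-D detector, by contrast, would use only the discriminator's output and ignore the
residual term entirely.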
In this paper, we focus on developing a reliable framework for time series anomaly detection that
utilizes a standalone trained D model of a GAN, with the aid of a sliding window for feature
extraction, data characterization (normal data generation and time series pattern clustering), and
density estimation to quantify the probability of anomalous data patterns. Motivated by the GANomaly
framework [27], which addresses anomaly detection problems in the computer vision domain with only
normal images as input, our G solely learns the distributions of normal patterns and enables D to
acquire the knowledge to distinguish fake (abnormal) from real (normal) during the training process.
Then, the better-trained D is validated and selected for use in conjunction with kernel density
estimation. The proposed framework has been evaluated on a real-world dataset collected through
standard procedures of railroad inspection, with labels available from a meticulous real-time
assessment by operators.
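As a rough illustration of the density estimation step mentioned above, the sketch below fits a
Gaussian kernel density estimate to the positions of windows that a detector has flagged and
locates the densest region. The choice of window positions as KDE input, the fixed bandwidth, and
the helper names are assumptions made for this example; they are not taken from the DEGAN
implementation.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a function that evaluates a Gaussian KDE of `samples` at a point."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


def densest_position(flagged_positions, series_length, bandwidth):
    """Position along the series where the estimated density peaks."""
    density = gaussian_kde(flagged_positions, bandwidth)
    return max(range(series_length), key=density)


# Hypothetical positions of windows that a discriminator flagged as anomalous:
# two isolated false alarms and one dense cluster.
flagged = [12, 40, 41, 42, 43, 44, 90]
peak = densest_position(flagged, series_length=100, bandwidth=3.0)
```

Isolated flags contribute little density, so a cluster of nearby flags dominates the estimate;
this is one way in which a density estimate can suppress sporadic false alarms.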
3 Methodology
The DEGAN framework centers around repeated time series acquired to monitor the temporal variation
in the operation of a given system. These time series reflect the performance of the system, and the
goal is to identify when/where an anomalous event is observed. Fig. 1 shows the overall process of
using repeated time series data for training, validation, and testing in DEGAN, where TS_A is the
training data, TS_B is the validation data, and TS_C is the testing data. All time series are
processed for subsequence extraction using a sliding window with a length of wl. The algorithm
relies on two segments of benchmark time series with no observed anomalies (clean), TS_A and TS_B,
which reflect the normal behavior of the system, to train and configure the anomaly detection model.
A grid search of the GAN hyperparameters (G's and D's learning rates) is first carried out on TS_B.
Then, the model is trained on TS_A with those learning rates and validated on TS_B during training
(to decide the best epoch at which to stop training). The best standalone D model is then used as
the anomaly detector on new time series (e.g., TS_C).
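The two selection steps described above (a grid search over the learning rates of G and D, followed
by choosing the stopping epoch from validation results) can be sketched as below. The scoring
callback `score_on_validation` is hypothetical: this section does not specify the validation
criterion, so the example simply assumes that some higher-is-better score is computed on TS_B.

```python
from itertools import product


def grid_search_learning_rates(g_rates, d_rates, score_on_validation):
    """Return the (lr_g, lr_d) pair with the highest validation score."""
    return max(
        product(g_rates, d_rates),
        key=lambda pair: score_on_validation(*pair),
    )


def best_epoch(validation_scores):
    """Index of the epoch with the highest validation score during training."""
    return max(range(len(validation_scores)), key=validation_scores.__getitem__)


# Stand-in for "train a GAN with these rates and score D on TS_B"; a real
# callback would train the model instead of evaluating a closed-form toy.
def toy_score(lr_g, lr_d):
    return -(abs(lr_g - 1e-4) + abs(lr_d - 1e-3))


candidates = [1e-5, 1e-4, 1e-3]
chosen_rates = grid_search_learning_rates(candidates, candidates, toy_score)
stop_epoch = best_epoch([0.20, 0.70, 0.90, 0.85])
```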
The DEGAN framework is depicted in Fig. 2 and includes three main components: GAN training, D model
selection, and probabilistic anomaly detection. These components are described in the following
subsections.
[Figure 1 (diagram): a sliding window of length wl converts the original data of shape (L, 1) into
an extracted data frame of shape (N, window length). The model is trained on the normal benchmark
series TS_A with the best hyperparameters obtained from the normal validation series TS_B, validated
on TS_B during training on TS_A, and the best model is tested on the anomalous series TS_C.]
Figure 1: Time series involved in DEGAN
3.1 GAN training
3.1.1 Data pre-processing
As noted, all the time series are processed into subsequences (continuous segments) using a sliding
window method. The trained D model then uses a sliding window search to identify anomalous patterns.
By applying the sliding window method, the time series is processed into a feature matrix as shown
in Eq. (1) and Fig. 3; i.e., a time series with the shape of (L, 1) is transformed into a 2D tensor
with the shape of (N, wl), where N is the total number of feature vectors (with a stride of one
sample, as in Eq. (1), N = L - wl + 1). The sliding window length wl is an important hyperparameter
in this framework and is further discussed in Section 4.3.2. Moreover, each subsequence is processed
using zero-mean normalization (see Eq. (2)), where \bar{W}_i denotes the mean of subsequence W_i.
W := (W_1, W_2, \ldots, W_N)^T = ((x_1, \ldots, x_{wl}), (x_2, \ldots, x_{wl+1}), \ldots, (x_N, \ldots, x_{N+wl-1}))^T    (1)
W := ((W_1 - \bar{W}_1), (W_2 - \bar{W}_2), \ldots, (W_N - \bar{W}_N))^T    (2)
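The pre-processing in Eqs. (1) and (2) can be sketched in a few lines of plain Python. This is an
illustrative version that assumes a stride of one sample and interprets zero-mean normalization as
subtracting each window's own mean; the function names are not from the original work.

```python
def sliding_windows(series, wl):
    """Eq. (1): all length-wl subsequences of `series`, taken with stride 1."""
    if not 1 <= wl <= len(series):
        raise ValueError("wl must be between 1 and len(series)")
    return [series[i:i + wl] for i in range(len(series) - wl + 1)]


def zero_mean(window):
    """Eq. (2): subtract the window's own mean from each of its points."""
    mean = sum(window) / len(window)
    return [x - mean for x in window]


def feature_matrix(series, wl):
    """(L,) series -> N x wl matrix of zero-mean windows, N = L - wl + 1."""
    return [zero_mean(w) for w in sliding_windows(series, wl)]


series = [1.0, 2.0, 3.0, 4.0, 5.0]
features = feature_matrix(series, wl=3)
```

For the toy series above, L = 5 and wl = 3 give N = 3 windows. Because the series is a linear ramp,
every window has the same shape once its mean is removed, which shows that zero-mean normalization
discards the local offset and keeps only the pattern.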
3.1.2 The GAN architecture
A GAN is made up of two neural networks, a generator G and a discriminator D. G is responsible for
generating synthetic (fake) data points (i.e., time series subsequences) using a random signal as
input, while D attempts to distinguish them from the real ones (i.e., the training set). The
training process is a two-player minimax game, as reflected in Eq. (3).
\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{data}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))]    (3)
In Eq. (3), x follows the distribution of the normal input data, while z follows a random
distribution.
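As a small numerical illustration of Eq. (3), the value function can be estimated by replacing the
two expectations with sample means over a batch of discriminator outputs. The function below is a
sketch under that assumption; the clipping constant `eps` is added only to avoid log(0) and is not
part of Eq. (3).

```python
import math


def gan_value(d_on_real, d_on_fake, eps=1e-12):
    """Sample estimate of V(D, G) from discriminator outputs in (0, 1).

    d_on_real: D(x) for real (normal) windows x
    d_on_fake: D(G(z)) for generated windows
    """
    real_term = sum(math.log(max(p, eps)) for p in d_on_real) / len(d_on_real)
    fake_term = sum(math.log(max(1.0 - p, eps)) for p in d_on_fake) / len(d_on_fake)
    return real_term + fake_term


# A discriminator that cannot tell real from fake outputs 0.5 everywhere.
undecided = gan_value([0.5, 0.5], [0.5, 0.5])
# A discriminator that separates the two groups well pushes V toward 0.
confident = gan_value([0.99, 0.98], [0.01, 0.02])
```

D tries to increase this quantity while G tries to decrease it; when D outputs 0.5 for every
input, the estimate equals -log 4.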
The architectures of G and D employed in our framework are illustrated in Fig. 4 and summarized in
Table 1. We adopted a two-layer dense (fully connected) neural network as the base model of G. The
input layer is a 1D tensor of random values drawn from a fixed standard Gaussian distribution (zero
mean, unit variance) with a dimension of 128 (although it could be a different size). The two fully
connected layers are followed by a Tanh activation layer. For D, we employed a 1D convolutional
model, CNN-D, which consists of one convolutional layer (Conv1D) and two fully connected layers.