
DEGAN: TIME SERIES ANOMALY DETECTION USING GENERATIVE ADVERSARIAL NETWORK DISCRIMINATORS AND DENSITY ESTIMATION

Yueyan Gu

Ph.D. Student

Department of Civil and Environmental Engineering

Virginia Tech

Blacksburg, VA, USA

yueyangu@vt.edu

Farrokh Jazizadeh

Associate Professor

Department of Civil and Environmental Engineering

Virginia Tech

Blacksburg, VA, USA

jazizade@vt.edu

Corresponding Author

ABSTRACT

Developing efficient anomaly detection techniques is important for maintaining service quality and providing early alarms in industrial systems. Many related efforts are based on time series data, a ubiquitous form for measuring, recording, and analyzing the dynamics of different processes. Unsupervised methods have been more popular in anomaly detection because of the challenges associated with labeled datasets. Generative neural network methods are one class of unsupervised approaches that has attracted increasing attention in recent years. In this paper, we propose DEGAN, an unsupervised Generative Adversarial Network (GAN)-based anomaly detection framework. It relies solely on normal time series data as input to train a well-configured discriminator (D) into a standalone anomaly predictor. In this framework, time series data is processed by the sliding window method. Expected normal patterns in the data are leveraged to develop a generator (G) capable of generating normal data patterns. Normal data is also used in the hyperparameter tuning and D model selection steps. Validated D models are then extracted and applied to evaluate unseen (testing) time series and identify patterns that have anomalous characteristics. Kernel density estimation (KDE) is applied to data points that are likely to be anomalous to generate probability density functions on the testing time series. The segments of the testing time series with the highest relative probabilities are detected as anomalies. To evaluate the performance of the framework, we used a case study dataset of univariate acceleration time series for five miles of a Class I railroad track. We implemented the proposed approach to detect the real anomalous observations identified by operators. The results show that leveraging the framework with a convolutional neural network D architecture results in average best recall and precision of 80% and 86%, respectively, which demonstrates that a well-trained standalone D model has the potential to be a reliable anomaly detector. Moreover, the influence of GAN hyperparameters, GAN architectures, sliding window sizes, clustering of time series, and model validation with labeled/unlabeled data was also investigated to provide insight into their impact.

Keywords

Unsupervised learning · Anomaly detection · Defect detection · Generative Adversarial Network · Density estimation · Railroad

1 Introduction

Anomaly detection, namely the identification of data patterns that deviate from the normal state of a system, is a classic problem in machine learning, in which the quantities of abnormal and normal patterns are usually highly imbalanced. Popular application scenarios include, but are not limited to, financial systems [ ], medical diagnosis [ ], building/infrastructure systems [ ], and network security [ ]. The motion and dynamics of industrial devices, weather and climate, and personal and socioeconomic activities are typically represented by time series, a succession of data points over time. Therefore, time series anomaly detection is significant for managing the overall service quality of a variety of systems and the Internet of Things [ ], such as building/infrastructure systems including utility consumption (water, electricity, gas), structural health monitoring (deflection, displacement), and indoor environment (temperature, air quality). Although the significance of anomaly detection for time series data is well known, it remains a challenge because of the complicated temporal dependence and stochastic nature of such data [6].

arXiv:2210.02449v1 [cs.LG] 5 Oct 2022

Existing time series anomaly detection methods can generally be divided into the following categories: statistical approaches (e.g., Autoregressive Model, AutoRegressive Integrated Moving Average (ARIMA) Model, Simple Exponential Smoothing), classical machine learning approaches (e.g., k-means clustering, isolation forest, one-class support vector machines, extreme gradient boosting), neural network approaches (e.g., Multilayer Perceptron, Convolutional Neural Networks, Long Short-Term Memory networks), and generative methods (e.g., Autoencoders, Generative Adversarial Networks) [ ]. In addition, depending on how labeled data is used, anomaly detection methods can also be categorized into supervised and unsupervised approaches. Supervised approaches need labels for the input time series to differentiate anomalous and normal observations, while unsupervised anomaly detection methods depend solely on unlabeled data. Owing to the inefficiency of labeling, time series anomaly detection is more commonly treated as a machine learning problem in an unsupervised paradigm [ ]. With increasing computing power, more advanced machine learning approaches have emerged. Generative Adversarial Networks (GANs), since their introduction in 2014 [ ], have gained much popularity in image generation, data augmentation, and image-to-image translation. In recent years, they have also been applied to anomaly detection, mainly by relying on loss scores as metrics for anomalous pattern recognition. However, only limited research has been conducted on the potential of standalone discriminator (D) models in GANs for anomaly detection. To this end, we propose DEGAN, an unsupervised method of GAN-based density estimation for time series, which learns the characteristics of normal data patterns by training and validating GAN models on normal data observations. The D model is then extracted to identify patterns that have anomalous features. Kernel density estimation (KDE) is applied to generate probability density functions on the testing time series.

The highlights of the DEGAN framework are as follows: (1) it relies on a well-trained D model as a standalone anomaly detection model; (2) it does not need labeled data for training or for selecting the optimal D model; and (3) it can reach a relatively high recall while keeping recall and precision well balanced. We evaluated DEGAN using a real-world case study, i.e., detecting anomalous observations in a Class I railroad track inspection dataset.

The rest of this paper is organized as follows. Section 2 introduces related research on time series anomaly detection methods and Generative Adversarial Networks. Section 3 presents the DEGAN framework and elaborates on different framework design considerations. Section 4 introduces the case study and evaluates the performance of the framework, including a discussion of the adopted performance metrics and the factors that influence overall performance. Finally, Section 5 summarizes the main contributions and results of this paper.

2 Related work

Anomaly detection has been a popular research direction because of its value in monitoring the condition of different systems and providing timely alarms. The choice of a specific anomaly detection method usually depends on the type of data. Given our focus on time series anomaly detection and GAN-based frameworks, this review is limited to time-series-based and GAN-based anomaly detection techniques.

Time series anomaly detection is usually carried out in an unsupervised paradigm and can be challenging because of noise and temporal dependencies in the data [ ]. As early as 1977, Tukey [ ] proposed a statistical approach to detect anomalies in time series. Meanwhile, noticeable progress has been made in machine learning over the past few decades, and many of the resulting approaches have been applied to anomaly detection problems. For example, k-means clustering [ ] can be run on the subsequences of a time series dataset, converging to k centroids. The distance from a new testing sequence to its nearest centroid serves as the error, and an anomaly is reported when that error exceeds a preset threshold. Along the same line of distance-based techniques, in 2003, Ma et al. [ ] used One-Class Support Vector Machines (OC-SVM) to detect novelties (anomalies) in time series as outliers of the normal distribution, where the vectors were converted into a projected space. In 2012, Liu et al. [ ] proposed Isolation Forest (iForest), which isolates anomalies using binary trees rather than by profiling the normal distribution.
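The distance-based scheme described above for k-means can be sketched in a few lines. This is only an illustration of the general idea, not code from any of the cited works: the centroids, the test windows, and the threshold value are all made-up assumptions, and the clustering step itself is taken as already done.

```python
import math

def nearest_centroid_distance(window, centroids):
    """Euclidean distance from one subsequence to its closest centroid."""
    return min(math.dist(window, c) for c in centroids)

def flag_anomalies(windows, centroids, threshold):
    """Return indices of windows whose error exceeds the preset threshold."""
    return [i for i, w in enumerate(windows)
            if nearest_centroid_distance(w, centroids) > threshold]

# Hypothetical centroids, as if k-means (k = 2) had already converged on
# normal subsequences of length 3.
centroids = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]]
test_windows = [[0.1, 0.0, -0.1], [0.9, 1.1, 1.0], [5.0, 5.0, 5.0]]
flagged = flag_anomalies(test_windows, centroids, threshold=1.0)
```

Here only the third window lies far from both centroids, so it is the only one reported. In practice the threshold would have to be chosen from the distribution of errors on normal data.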

With increasing computing power, deep learning approaches have attracted more attention in the past decade. These methods generally detect anomalies by comparing a new observation with the normal distribution predicted from the given historical data. Long Short-Term Memory (LSTM) networks are known as a useful tool for learning the longer-term patterns contained in sequences. In 2015, Malhotra et al. [ ] demonstrated the use of stacked LSTM networks in multiple time series anomaly detection scenarios, such as ECG, power demand, and multi-sensor engine data, where a network is trained on normal data and used to predict future steps. In 2018, Munir et al. [ ] proposed DeepAnT, a Convolutional Neural Network (CNN) architecture, to predict expected patterns over a short future horizon. Using this CNN model, non-conforming patterns in the data are detected as anomalies.

Although anomaly detection methods usually cater to specific problems, GAN, as an efficient tool applied to image anomaly detection [ ], has provided great inspiration for anomaly detection in time series. Sun et al. [ ] have characterized it as similar to the decision-making process of human beings: G is responsible for learning from previous data, and D functions as a relatively independent anomaly detector that draws on G's knowledge of previous experience. In existing research, GAN has facilitated anomaly detection on non-graphic data in the following two ways:

• Supervised approaches with oversampling of abnormal data: GAN is widely used to generate convincing synthetic data in various scenarios, which can alleviate the challenge of imbalanced data by creating synthetic anomalies [ ]. Intuitively, time series can be converted to images, so that oversampling of non-graphic data is treated as an image generation problem. Salem et al. [ ] converted integer-based intrusion data into images and used a Cycle-GAN to oversample the anomalies; their study showed that anomaly detection performance improved with GAN as a supplementary oversampling tool. This could be especially beneficial for an inherently imbalanced dataset.

• Unsupervised approaches via defining an anomaly score with G and/or D loss: G's training loss measures the residuals between the normal/real input data and the newly generated (reconstructed) samples, while D's training loss represents D's ability to distinguish real/normal from fake/abnormal. By combining both losses, G and D can both be leveraged in constructing an anomaly score to detect a sequence with anomalous patterns. Li et al. [ ] proposed this idea in 2019 and applied it to unsupervised multivariate time series anomaly detection with LSTM-RNN as the base model of G and D. A similar idea has also been implemented in [ ], [ ], [ ], and [ ]. Multivariate time series can also be transformed into 2D images by calculating distance matrices, which turns the problem into image anomaly detection with similarly defined GAN-based anomaly scores [ ]. Moreover, some research efforts have been based only on G loss [ ] [ ]. Overall, previous methods for GAN-based time series anomaly detection focus either on G or on the combination of G and D, while only a few efforts [ ] [ ] have been made to leverage the potential of an independent D, which relies solely on D's classification result.
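To make the second family of methods concrete, the following sketch shows one hypothetical way of blending a G-based residual with a D-based term into a single anomaly score. The weighted-sum form, the weight `lam`, the mean absolute residual, and the use of `1 - D(x)` as the discrimination term are all assumptions chosen for illustration; the cited works define their own scores.

```python
def anomaly_score(window, reconstruction, d_real_prob, lam=0.5):
    """Blend a reconstruction residual with a discrimination term.

    window, reconstruction: equal-length sequences of floats.
    d_real_prob: D's estimated probability that the window is real (0..1).
    lam: hypothetical weight between the two terms.
    """
    # Mean absolute residual between the input and G's reconstruction.
    residual = sum(abs(x - r) for x, r in zip(window, reconstruction)) / len(window)
    # The less "real" D finds the window, the larger this term becomes.
    discrimination = 1.0 - d_real_prob
    return lam * residual + (1.0 - lam) * discrimination

normal_score = anomaly_score([0.0, 0.1, 0.0], [0.0, 0.1, 0.0], d_real_prob=0.9)
odd_score = anomaly_score([0.0, 2.0, 0.0], [0.0, 0.1, 0.0], d_real_prob=0.2)
```

A window that is both poorly reconstructed and judged unlikely to be real receives the higher score. A standalone-D approach, by contrast, would use only the discrimination term.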

In this paper, we focus on developing a reliable framework for time series anomaly detection that uses a standalone trained D model of a GAN, aided by a sliding window for feature extraction, data characterization (normal data generation and time series pattern clustering), and density estimation to quantify the probability of anomalous data patterns. Motivated by the GANomaly framework [ ], which addresses anomaly detection problems in the computer vision domain with only normal images as input, our G learns solely the distributions of normal patterns and enables D to acquire the knowledge needed to distinguish fake (abnormal) from real (normal) during the training process. The better-trained D is then validated and selected for use in conjunction with kernel density estimation. The proposed framework has been evaluated on a real-world dataset collected through standard railroad inspection procedures, with labels based on a meticulous real-time assessment by operators.

3 Methodology

The DEGAN framework centers on repeated time series acquired to monitor the temporal variation in the operation of a given system. These time series reflect the performance of the system, and the goal is to identify when/where an anomalous event is observed. Fig. 1 shows the overall process of using repeated time series data for training, validation, and testing in DEGAN, where $TS_A$ is the training data, $TS_B$ is the validation data, and $TS_C$ is the testing data. All time series are processed for subsequence extraction using a sliding window with a length of $w_l$. The algorithm relies on two segments of benchmark time series with no observed anomalies (clean), $TS_A$ and $TS_B$, which reflect the normal behavior of the system, to train and configure the anomaly detection model. A grid search of the GAN hyperparameters (G's and D's learning rates) is first carried out on $TS_B$. Then, the model is trained on $TS_A$ with those learning rates and validated on $TS_B$ during training (to decide the best epoch at which to stop training). The best standalone D model is then used as the anomaly detector on new time series (e.g., $TS_C$).
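The two selection steps in this workflow, the learning-rate grid search and the choice of a stopping epoch, can be outlined as follows. This is a structural sketch only: the candidate learning rates, the scoring function `toy_score`, and the per-epoch validation scores are placeholders, since the actual validation criterion is not specified in this part of the paper.

```python
from itertools import product

def grid_search(g_rates, d_rates, score_fn):
    """Return the (g_lr, d_lr) pair with the highest validation score."""
    return max(product(g_rates, d_rates), key=lambda pair: score_fn(*pair))

def best_epoch(validation_scores):
    """Return the 1-based epoch whose validation score is highest."""
    best_index = max(range(len(validation_scores)),
                     key=validation_scores.__getitem__)
    return best_index + 1

# Stand-in scorer: a real run would train a GAN with these learning rates
# and score the resulting D on the clean validation series.
def toy_score(g_lr, d_lr):
    return -abs(g_lr - 1e-3) - abs(d_lr - 1e-4)

chosen = grid_search([1e-2, 1e-3, 1e-4], [1e-3, 1e-4], toy_score)
stop_at = best_epoch([0.61, 0.74, 0.82, 0.79])
```

With the placeholder scorer, the pair `(1e-3, 1e-4)` is selected and training would stop at the third epoch. Because both steps need only a score computed on clean data, neither requires anomaly labels.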

The DEGAN framework is depicted in Fig. 2 and includes three main components: GAN training, D model selection, and probabilistic anomaly detection. These components are described in the following subsections.
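For the third component, the abstract states that KDE is applied to data points that are likely to be anomalous and that the segments with the highest relative probabilities are reported. A minimal sketch of that idea is given here, assuming a one-dimensional Gaussian kernel, a hand-picked bandwidth, and invented positions for the flagged windows; none of these choices is taken from the paper.

```python
import math

def gaussian_kde(points, bandwidth):
    """Return a function estimating the density of 1-D points."""
    norm = 1.0 / (len(points) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - p) / bandwidth) ** 2)
                          for p in points)

    return density

# Hypothetical positions (sample indices) of windows that D judged
# as not normal; most of them cluster around index 100.
flagged_positions = [100, 102, 103, 105, 400, 101, 104]
density = gaussian_kde(flagged_positions, bandwidth=5.0)

# Evaluate the density on a coarse grid and pick the most suspicious location.
grid = range(0, 501, 10)
peak = max(grid, key=density)
```

The isolated flag at index 400 contributes little density, while the cluster near index 100 produces the peak. This is how density estimation can suppress sporadic false alarms from the discriminator.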

Figure 1: Time series involved in DEGAN. A sliding window converts the original data of shape (L, 1) into an extracted data frame of shape (N, window length). The benchmark (normal $TS_A$), validation (normal $TS_B$), and testing (anomalous $TS_C$) series are used in three steps: (1) model training on $TS_A$ with the best hyperparameters obtained from $TS_B$; (2) model validation on $TS_B$ along training on $TS_A$; (3) testing of the best model on $TS_C$.

3.1 GAN training

3.1.1 Data pre-processing

As noted, all the time series are processed into subsequences (continuous segments) using a sliding window method. The trained D model then uses a sliding window search to identify anomalous patterns. By applying the sliding window method, the time series is processed into a feature matrix as shown in Eq. (1) and Fig. 3, i.e., a time series with the shape of $(L, 1)$ is transformed into a 2D tensor with the shape of $(N, w_l)$, where $N$ is the total number of feature vectors. The sliding window length $w_l$ is an important hyperparameter in this framework, as discussed further in Section 4.3.2. Moreover, each subsequence is processed using zero-mean normalization (see Eq. (2)).

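$$W := (W_1, W_2, \ldots, W_N)^T = \big((x_1, \ldots, x_{w_l}), (x_2, \ldots, x_{w_l+1}), \ldots, (x_N, \ldots, x_{N+w_l-1})\big)^T \tag{1}$$

$$W := \big((W_1 - \overline{W}_1), (W_2 - \overline{W}_2), \ldots, (W_N - \overline{W}_N)\big)^T \tag{2}$$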

3.1.2 The GAN architecture

A GAN is made up of two neural networks, a generator G and a discriminator D. G is responsible for generating synthetic (fake) data points (i.e., time series subsequences) using a random signal as input, while D attempts to distinguish them from the real ones (i.e., the training set). The training process is a two-player minimax game, as reflected in Eq. (3).

$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{data}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))] \tag{3}$$

In Eq. (3), $x$ follows the distribution of the normal input data, while $z$ follows a random distribution.
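As a numerical illustration of Eq. (3), the two expectations can be approximated by averaging over a batch of discriminator outputs. The sketch below assumes that D outputs probabilities strictly between 0 and 1; the function name and the example values are hypothetical.

```python
import math

def gan_value(d_real, d_fake):
    """Batch estimate of V(D, G) from D's outputs.

    d_real: D(x) for real samples, each in (0, 1).
    d_fake: D(G(z)) for generated samples, each in (0, 1).
    """
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term

# A discriminator that separates real from generated samples well ...
confident_d = gan_value(d_real=[0.9, 0.95], d_fake=[0.1, 0.05])
# ... versus one that cannot tell them apart and outputs 0.5 everywhere.
fooled_d = gan_value(d_real=[0.5, 0.5], d_fake=[0.5, 0.5])
```

D is trained to increase this value, while G is trained to decrease it. The fooled case evaluates to 2 log(0.5), the value reached when generated and real samples are indistinguishable.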

The architectures of G and D employed in our framework are illustrated in Fig. 4 and summarized in Table 1. We adopted a two-layer dense neural network as the base model of G. The input layer is a 1D tensor of random values drawn from a fixed standard Gaussian distribution (mean 0, standard deviation 1) with a dimension of 128 (although it could be a different size). The two fully connected layers are followed by a Tanh activation layer. For D, we employed a 1D convolutional model, CNN-D, which consists of one convolutional layer (Conv1D) and two fully connected layers.
