A Hierarchical Approach to Conditional Random

Fields for System Anomaly Detection

Srishti Mishra

PES University

Tvarita Jain

PES University

Dr. Dinkar Sitaram

Professor, PES University

Abstract—Anomaly detection or outlier detection to recognize

unusual or rare events in large-scale systems in a time-sensitive

manner is critical in many industries, e.g., bank fraud, glitches

in critical systems, medical alerts, malfunctioning equipment

etc. Large-scale systems often grow in size and complexity over

time, and anomaly detection algorithms need to adapt to the

changing structures. A hierarchical approach can take

advantage of the implicit relationships in complex systems and

capture anomalies based on context. Furthermore, the features

in complex systems may vary drastically in data distribution,

capturing different aspects from multiple data sources, and

when put together can provide a more complete view of the

entire system. Two main datasets are considered: the first

consisting of varied system metrics from machines running on

a cloud service, and the second of application metrics from a

complex distributed software system with inherent hierarchies

and interconnections amongst numerous system nodes.

Comparing algorithms running in a hierarchical manner,

across the Changepoint-based PELT algorithm, cognitive

learning-based Hierarchical Temporal Memory algorithms,

Support Vector Machines and Conditional Random Fields

provides a basis for proposing a Hierarchical Global-Local

Conditional Random Field approach to accurately capture

anomalies in complex systems, and across various features.

Hierarchical algorithms can learn both the intricacies of

lower-level or specific features, and utilize these in the global

abstracted representation to detect anomalous patterns

robustly across multi-source feature data and distributed

systems. A graphical network analysis on complex systems can

further fine-tune datasets to mine relationships based on

available features, which can benefit hierarchical models.

Furthermore, hierarchical solutions can adapt well to changes

at a localized level, learning on new data and changing

environments when parts of a system are overhauled, and

translate these learnings to a global view of the system over

time.

Keywords—anomaly detection, hierarchical learning, complex

systems, conditional random fields, enterprise systems,

hierarchical conditional random fields

I. INTRODUCTION

Anomalous behavior is inherent to large-scale, enterprise

software systems which power a variety of industries from

security and IT to energy and healthcare. Anomalies are

instances when the behavior of the system is significantly

different from the usual and may indicate a problem or

unusual activities in the system. In order to predict

anomalies, specific patterns of behavior leading up to the

anomaly must be identified and used for future prediction.

Anomaly detection on real-time streaming data from

systems enables corrective action to be taken in critical

scenarios, thereby saving time, money and person-hours.

With the advent of the cloud, these software systems span

multiple machines and networks within large-scale data

centers, logging large volumes of real-time performance

data. The data is often agglomerated from different

components of the system at regular intervals taking the

form of a time series dataset. The immense amount of data

poses a challenge for humans, even experts, to identify

anomalies early on.

Turning to machine learning approaches, sequence learning

models play an important role in identifying patterns in the

dataset which lead to an anomaly. In this paper, hierarchical

machine learning approaches are explored to address the

large scale of data originating from multiple sources. An

especially interesting hierarchical model, Hierarchical

Temporal Memory, lies in the field of cognitive learning

algorithms and is compared to hierarchical approaches using

traditional machine learning and sequence learning models.

Applying hierarchical learning to the problem of large-scale

multi-source datasets from enterprise systems, a novel

hierarchical approach using a Local-Global Conditional

Random Field (CRF) model is proposed as a solution for

anomaly detection. Conditional Random Fields are robust

for sequence learning and the Local-Global method allows

the model to locally learn the idiosyncrasies of each data

source as well as globally generalize across the sources and

identify anomalies in the system.

The rest of the paper is organized as follows. Section II

describes related work in the field of anomaly detection. In

Section III, current approaches and models are discussed

and compared. Section IV focuses on the proposed

approach, detailing the Global-Local CRF model and the

motivation behind it. Section V discusses the experimental

approach with the nature of the dataset and the evaluation

metrics employed. The results of the models are evaluated

and compared with the proposed approach, augmented with

the network analysis, in Section VI. Section VII concludes

with the outcomes of the proposed approach and significant

findings from the comparative study.

II. RELATED WORK

Previous approaches to anomaly detection include both

supervised methods, such as support vector machines,

regression models and decision trees [1,2,3], and

unsupervised methods (e.g., clustering); however, these are yet to be

adapted to multi-source, real-time time series datasets.

Dimensionality-based methods such as variants of PCA

[4,5] are primarily used for high-dimensional, multivariate

data streams that can be projected onto a low dimensional

space. However, these are restrictive and have strict data

constraints, which hinder their adoption in real-world

anomaly detection scenarios. Statistical methods such as

multivariate statistics [6], Bayesian analysis [7], and

frequency and simple significance tests [8] have also been

used for anomaly detection. These methods, however,

cannot adapt well across multi-source datasets, and their results

degrade as the dataset becomes larger.

Time series analysis methods such as ARIMA (autoregressive

integrated moving average) combine

autoregressive and moving-average terms to model seasonality and

predict values [9, 31].
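For reference, the textbook form of an ARIMA(p, d, q) model can be written as below. The notation is standard and is an assumption here, not taken from [9, 31]: B is the backshift operator (B z_t = z_{t-1}), the phi_i are autoregressive coefficients, the theta_j are moving-average coefficients, and epsilon_t is a white-noise error term.

```latex
y_t = (1 - B)^d z_t, \qquad
y_t = c + \sum_{i=1}^{p} \phi_i \, y_{t-i} + \varepsilon_t + \sum_{j=1}^{q} \theta_j \, \varepsilon_{t-j}
```

One common way to use such a model for anomaly detection is to flag observations whose value deviates strongly from the model's forecast.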

Sequence learning classifiers, such as the LSTM/RNN models

used by Fabian Huch et al. [34] for anomaly detection on

imbalanced datasets, can identify patterns leading up to an

anomaly in the case of predictive anomaly detection.

Hierarchical Hidden Markov Models (HHMMs) [18] are

another sequence prediction model, based on nested HMMs

that learn a hierarchy of features: deeply nested Markov

models learn low-level features and send back predictions

and sequences to higher layers, which learn higher-level

features. However, as the order of the HMM increases, the model

becomes computationally expensive and is feasible mostly for

short-term dependencies.

A few other approaches use statistical methods to extract

correlations from the data and events generated from

large-scale cloud systems, such as the novel

regression-based correlation analysis technique by M.

Farshchi et al. [32]. This regression technique uses highly

correlated clusters of logs with system metrics to predict

expected system values, and an observation that significantly

deviates from the prediction is classified as an anomaly. D.

Sun et al. [33] describe extracting specific features from

the system data over a longer detection window and then

running a classifier.
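The regression-based idea described above (predict an expected metric value, then flag large deviations) can be sketched generically as follows. This is a minimal illustration on synthetic data, not the implementation from [32]: the single `log_activity` feature, the function names and the three-sigma threshold `k` are all assumptions made for the example.

```python
import numpy as np

def fit_expected_metric(log_activity, metric):
    """Least-squares fit of metric ~ a + b * log_activity on normal data.

    Returns the coefficients (a, b) and the residual standard deviation.
    """
    x = np.asarray(log_activity, dtype=float)
    y = np.asarray(metric, dtype=float)
    design = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    sigma = float(np.std(y - design @ coef))
    return coef, sigma

def flag_deviations(log_activity, metric, coef, sigma, k=3.0):
    """Mark observations deviating from the prediction by more than k * sigma."""
    x = np.asarray(log_activity, dtype=float)
    y = np.asarray(metric, dtype=float)
    expected = coef[0] + coef[1] * x
    return np.abs(y - expected) > k * sigma

# Synthetic "normal" history (hypothetical): metric grows linearly with log activity.
train_x = np.arange(50.0)
train_y = 2.0 * train_x + 1.0 + 0.1 * np.sin(train_x)
coef, sigma = fit_expected_metric(train_x, train_y)

# New observations with one injected deviation at position 5.
new_x = np.arange(50.0, 60.0)
new_y = 2.0 * new_x + 1.0
new_y[5] += 10.0
flags = flag_deviations(new_x, new_y, coef, sigma)
print(flags)
```

Fitting on a window assumed to be anomaly-free and scoring later observations keeps the injected deviation from distorting the fitted relationship.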

In the following sections, key approaches including SVM

models, HTM models and changepoint algorithms are

discussed with their results, as part of the comparative study

with Hierarchical Conditional Random Fields.

III. CURRENT APPROACHES

A. PELT

Another method to detect anomalies is to use changepoint

detection algorithms [9] which can identify the occurrence

of an anomaly or the start of an anomalous sequence. Because

these algorithms rely on statistical measures of the data, identifying

changepoints becomes computationally expensive as the

time series gets longer. An efficient variant known as PELT

changepoint detection [10] has improved the performance of

the algorithm; however, minimizing false positives remains a

challenge. Pruned Exact Linear Time (PELT) is a changepoint

detection algorithm belonging to the time series

family. This approach identifies data points at which the

statistical properties of the series change significantly and labels them as

changepoints.

Mathematically speaking, consider data points

$z_1, z_2, \ldots, z_n$. If a changepoint exists at $z_t$, then there is

some statistical change between $\{z_1, \ldots, z_t\}$ and

$\{z_{t+1}, \ldots, z_n\}$. The number and position of the points at

which the mean changes are inferred. Changepoint detection

analyses whether the observed segments differ, and as such it is

natural to compare model fits with changepoints to those

without. One approach is to use a Likelihood Ratio Test. The

likelihood of the model including a change will always be

at least as high as that of the model with no change, since

additional parameters improve the fit, so the improvement is

compared against a threshold. If a changepoint is

identified, its position is estimated as

$\hat{t} = \arg\max_{t} \{ \ell(z_{1:t}) + \ell(z_{t+1:n}) - \ell(z_{1:n}) \}$

where $\ell(\cdot)$ denotes the maximized log-likelihood of a segment.
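The position estimate above can be illustrated with a small sketch. It assumes a Gaussian change-in-mean model with unit variance, so the segment log-likelihood reduces (up to constants that cancel in the difference) to minus half the sum of squared deviations from the segment mean. The function names and the toy series are hypothetical.

```python
import numpy as np

def segment_loglik(z):
    """Gaussian log-likelihood of a segment at its fitted mean.

    Assumes unit variance and drops additive constants:
    -0.5 * sum((z - mean)^2).
    """
    z = np.asarray(z, dtype=float)
    return -0.5 * np.sum((z - z.mean()) ** 2)

def best_single_changepoint(z):
    """Return (t_hat, gain) maximising l(z[:t]) + l(z[t:]) - l(z).

    t counts the points in the first segment, so 1 <= t <= n - 1.
    """
    z = np.asarray(z, dtype=float)
    full = segment_loglik(z)
    gains = [
        segment_loglik(z[:t]) + segment_loglik(z[t:]) - full
        for t in range(1, len(z))
    ]
    t_hat = int(np.argmax(gains)) + 1
    return t_hat, float(gains[t_hat - 1])

# Toy series whose mean shifts from about 0 to about 5 after the fourth point.
z = np.array([0.0, 0.1, -0.1, 0.0, 5.0, 5.1, 4.9, 5.0])
t_hat, gain = best_single_changepoint(z)
print(t_hat, gain)
```

Because the gain is never negative, a changepoint would only be declared when it exceeds a chosen threshold; the threshold itself is left to the user in this sketch.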

Often there is an obvious changepoint at (or by) a

time-point s. This means that for any time T after s, the

most recent changepoint cannot be at a time t before s.

The search step can therefore be pruned to

avoid searching over any t before s. If many candidate times t are

pruned (excluded from the minimization), computational

time is drastically reduced and the algorithm becomes

very efficient. However, the only downside of this changepoint

detection algorithm is that it assumes that any change

that is not recent should be pruned. It can be proved that,

under certain regularity conditions, the expected

computational complexity will be O(n). The most important

condition is that the number of changepoints increases

linearly with n.
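The pruning idea can be sketched in a few lines. The following is a minimal illustration in the spirit of PELT [10], assuming a change-in-mean cost (sum of squared deviations from the segment mean) and a fixed penalty per changepoint. The function name `pelt_mean`, the penalty value and the toy series are assumptions for the example, not the configuration used in this paper.

```python
import numpy as np

def pelt_mean(z, penalty):
    """Penalised changepoint search with PELT-style pruning (sketch).

    Cost of a segment = sum of squared deviations from its mean.
    Returns the sorted list of changepoint positions (segment ends).
    """
    z = np.asarray(z, dtype=float)
    n = len(z)
    s1 = np.concatenate(([0.0], np.cumsum(z)))       # prefix sums
    s2 = np.concatenate(([0.0], np.cumsum(z ** 2)))  # prefix sums of squares

    def cost(a, b):
        # Cost of segment z[a:b], computed in O(1) from the prefix sums.
        total = s1[b] - s1[a]
        return (s2[b] - s2[a]) - total * total / (b - a)

    best = np.empty(n + 1)             # best[t]: optimal penalised cost of z[:t]
    best[0] = -penalty
    last = np.zeros(n + 1, dtype=int)  # last[t]: most recent changepoint before t
    candidates = [0]
    for t in range(1, n + 1):
        values = [best[s] + cost(s, t) + penalty for s in candidates]
        i = int(np.argmin(values))
        best[t] = values[i]
        last[t] = candidates[i]
        # Pruning: drop any s that can no longer be the most recent changepoint.
        candidates = [s for s, v in zip(candidates, values) if v - penalty <= best[t]]
        candidates.append(t)

    # Backtrack through the stored most-recent changepoints.
    changepoints = []
    t = n
    while last[t] > 0:
        changepoints.append(int(last[t]))
        t = last[t]
    return sorted(changepoints)

# Toy series with mean shifts after positions 20 and 40.
demo = np.concatenate([np.zeros(20), np.full(20, 5.0), np.full(20, 1.0)])
changepoints = pelt_mean(demo, penalty=5.0)
print(changepoints)
```

A larger penalty yields fewer changepoints, which is one lever for the false-positive problem mentioned earlier, at the risk of missing smaller shifts.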

B. Hierarchical Temporal Memory

Hierarchical temporal memory (HTM) models are a

relatively new development in the field of sequence learning

and aim to resemble cortical algorithms found in the human

neocortex. They are unsupervised models which can learn

from multiple inputs and be trained on streaming data,

similar to how the human brain processes information. The

HTM algorithm adopts several concepts of learning, such as:

a) Hierarchy of Regions

Neurons are arranged in columnar structures across a

hierarchy of six layers, and the dendritic connections between

the neurons in each layer ensure that information flows up

from the lower layers, composed primarily of sensory inputs,

while predictions flow downwards from the higher layers to

the lower layers.

Fig. 1. Comparison of a) Simplified ANN Perceptron, b) Biological Pyramidal Neuron and c) HTM Pyramidal Neuron

b) Pyramidal Neuron Structure

The basic processing unit of the HTM algorithm models a

pyramidal neuron [19, 20] which consists of dendritic

segments with numerous synapses arranged along dendrites,

as seen in Fig. 1 b). Since the perceptron neuron used by

ANNs is a simplified model with no dendrites and a few

highly precise synapses (described in Fig. 1 a) and is not

suitable for temporal sequence learning, other sequence

learning models such as RNNs/LSTMs [insert reference] also

use a special memory cell, which is an improvement over the

basic perceptron neuron of ANN models.

c) Sparsity and Input Encoding

Sparsity of neural activity in the human neocortex results in

2% or fewer of the neurons being active at any point in time. HTM

theory uses sparse representations [23] during encoding
