A Hierarchical Approach to Conditional Random
Fields for System Anomaly Detection
Srishti Mishra
PES University
Tvarita Jain
PES University
Dr. Dinkar Sitaram
Professor,PES University
Abstract—Anomaly detection, or outlier detection, recognizes
unusual or rare events in large-scale systems. Doing so in a
time-sensitive manner is critical in many industries, e.g. for
bank fraud, glitches in critical systems, medical alerts and
malfunctioning equipment. Large-scale systems often grow in size
and complexity over
time, and anomaly detection algorithms need to adapt to the
changing structures. A hierarchical approach can take
advantage of the implicit relationships in complex systems and
capture anomalies based on context. Furthermore, the features
in complex systems may vary drastically in data distribution,
capturing different aspects from multiple data sources, and
when put together can provide a more complete view of the
entire system. Two main datasets are considered, the first
consisting of varied system metrics from machines running on
a cloud service, and the second of application metrics from a
complex distributed software system with inherent hierarchies
and interconnections amongst numerous system nodes.
A comparison of algorithms run in a hierarchical manner,
covering the changepoint-based PELT algorithm, cognitive
learning-based Hierarchical Temporal Memory algorithms,
Support Vector Machines and Conditional Random Fields,
provides a basis for proposing a Hierarchical Global-Local
Conditional Random Field approach to accurately capture
anomalies in complex systems and across various features.
Hierarchical algorithms can learn both the intricacies of
lower-level or specific features, and utilize these in the global
abstracted representation to detect anomalous patterns
robustly across multi-source feature data and distributed
systems. A graphical network analysis on complex systems can
further fine-tune datasets to mine relationships based on
available features, which can benefit hierarchical models.
Furthermore, hierarchical solutions can adapt well to changes
at a localized level, learning on new data and changing
environments when parts of a system are overhauled, and
translate these learnings to a global view of the system over
time.
Keywords—anomaly detection, hierarchical learning, complex
systems, conditional random fields, enterprise systems,
hierarchical conditional random fields
I. INTRODUCTION
Anomalous behavior is inherent to large-scale, enterprise
software systems which power a variety of industries from
security and IT to energy and healthcare. Anomalies are
instances when the behavior of the system is significantly
different from the usual and may indicate a problem or
unusual activities in the system. In order to predict
anomalies, specific patterns of behavior leading up to the
anomaly must be identified and used for future prediction.
Anomaly detection on real-time streaming data from
systems enables corrective action to be taken in critical
scenarios, thereby saving time, money and person-hours.
With the advent of the cloud, these software systems span
multiple machines and networks within large-scale data
centers, logging large volumes of real-time performance
data. The data is often agglomerated from different
components of the system at regular intervals taking the
form of a time series dataset. The immense amount of data
poses a challenge for humans, even experts, to identify
anomalies early on.
Turning to machine learning approaches, sequence learning
models play an important role in identifying patterns in the
dataset which lead to an anomaly. In this paper, hierarchical
machine learning approaches are explored to address the
large scale of data originating from multiple sources. An
especially interesting hierarchical model, Hierarchical
Temporal Memory, lies in the field of cognitive learning
algorithms and is compared to hierarchical approaches using
traditional machine learning and sequence learning models.
Applying hierarchical learning to the problem of large-scale
multi-source datasets from enterprise systems, a novel
hierarchical approach using a Local-Global Conditional
Random Field (CRF) model is proposed as a solution for
anomaly detection. Conditional Random Fields are robust
for sequence learning and the Local-Global method allows
the model to locally learn the idiosyncrasies of each data
source as well as globally generalize across the sources and
identify anomalies in the system.
The rest of the paper is organized as follows. Section II
describes related work in the field of anomaly detection. In
Section III, current approaches and models are discussed
and compared. Section IV focuses on the proposed
approach, detailing the Global-Local CRF model and the
motivation behind it. Section V discusses the experimental
approach with the nature of the dataset and the evaluation
metrics employed. The results of the models are evaluated
and compared with the proposed approach, augmented with
the network analysis, in Section VI. Section VII concludes
with the outcomes of the proposed approach and significant
findings from the comparative study.
II. RELATED WORK
Previous approaches to anomaly detection include both
supervised methods, such as support vector machines,
regression models and decision trees [1,2,3], and
unsupervised methods (e.g. clustering). However, these are yet
to be adapted to multi-source, real-time time series datasets.
Dimensionality-based methods such as variants of PCA
[4,5] are primarily used for high-dimensional, multivariate
data streams that can be projected onto a low-dimensional
space. However, these are restrictive and have strict data
constraints, which hinders their adoption in real-world
anomaly detection scenarios. Statistical methods such as
multivariate statistics [6], Bayesian analysis [7], and
frequency and simple significance tests [8] have also been
used for anomaly detection. These methods, however,
cannot adapt well across multi-source datasets, and their
results get worse as the dataset becomes larger.
Time series analysis methods such as ARIMA (autoregressive
integrated moving average) use a combination of
autoregressive and moving-average terms to model seasonality
and predict values [9, 31].
Sequence learning classifiers, such as the LSTM/RNN models
used by Fabian Huch et al. [34] for anomaly detection on
imbalanced datasets, can identify patterns leading up to an
anomaly in the case of predictive anomaly detection.
Hierarchical Hidden Markov Models (HHMM) [18] are
another sequence prediction model based on nested HMMs
to learn a hierarchy of features, where deeply nested Markov
models learn low-level features and send back predictions
and sequences to higher layers, which learn higher-level
features. However, as the order of the HMM increases, it
becomes computationally expensive and is feasible mostly
for short-term dependencies.
A few other approaches use statistical methods to extract
correlations from the data and events generated from
large-scale cloud systems, such as the novel
regression-based correlation analysis technique by M.
Farshchi et al. [32]. This regression technique uses highly
correlated clusters of logs with system metrics to predict
expected system values, and an observation that significantly
deviates from the prediction is classified as an anomaly. D.
Sun et al. [33] describe extracting specific features from
the system data over a longer detection window and then
running a classifier.
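The predict-then-compare idea behind the regression-based technique above can be illustrated with a minimal sketch. This is not the method of [32]: the single-predictor least-squares fit, the function names, the 3-sigma threshold `k` and the toy numbers are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def flag_deviations(train_x, train_y, new_x, new_y, k=3.0):
    """Flag new observations whose residual exceeds k standard
    deviations of the residuals seen on the training data."""
    slope, intercept = fit_line(train_x, train_y)
    residuals = [b - (slope * a + intercept) for a, b in zip(train_x, train_y)]
    spread = pstdev(residuals)
    return [abs(b - (slope * a + intercept)) > k * spread
            for a, b in zip(new_x, new_y)]


# Hypothetical data: log-event counts (x) against a system metric (y).
train_x = [10, 20, 30, 40, 50, 60]
train_y = [21, 39, 62, 79, 101, 119]
flags = flag_deviations(train_x, train_y, new_x=[25, 45], new_y=[50, 140])
print(flags)  # the second observation deviates strongly from the fit
```

In practice the predictor would be a cluster of correlated log events rather than one count, and the threshold would be tuned to the tolerated false-positive rate.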
In the following sections, key approaches including SVM
models, HTM models and changepoint algorithms are
discussed with their results, as part of the comparative study
with Hierarchical Conditional Random Fields.
III. CURRENT APPROACHES
A. PELT
Changepoint detection algorithms [9] offer another way to
detect anomalies: they can identify the occurrence
of an anomaly or the start of an anomalous sequence. Because
these algorithms rely on statistical measures of the data,
identifying changepoints becomes computationally expensive as
the time series gets longer. An efficient variant known as PELT
changepoint detection [10] improves the performance of
the algorithm; however, minimizing false positives remains a
challenge. Pruned Exact Linear Time (PELT) is a changepoint
detection algorithm belonging to the time series
family. This approach graphically identifies data points that
show a significant statistical change and labels them as
changepoints.
Mathematically, consider data points $z_1, z_2, \ldots, z_n$.
If a changepoint exists at $z_t$, then there is
some statistical change between $\{z_1, \ldots, z_t\}$ and
$\{z_{t+1}, \ldots, z_n\}$. The number and position of the points at
which the mean changes are inferred. Changepoint detection
analyses whether the observed results differ, so it is
natural to compare model fits with changepoints to those
without. One approach is to use a Likelihood Ratio Test. The
likelihood of the model including a change will always be at
least as high as that of the model with no change, since
additional parameters improve the fit. If a changepoint is
identified, its position is estimated as
$$\hat{t} = \arg\max_{t} \left\{ \ell(z_{1:t}) + \ell(z_{t+1:n}) - \ell(z_{1:n}) \right\}$$
where $\ell(\cdot)$ denotes the log-likelihood of the given segment.
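The position estimate above can be sketched in a few lines of Python. The sketch assumes a Gaussian model with unit variance and a change in mean only, so each segment's log-likelihood reduces (up to constants that cancel) to minus half its sum of squared deviations from the segment mean. The function names and the toy series are illustrative, not taken from the paper.

```python
from itertools import accumulate


def segment_loglik(prefix, prefix_sq, start, end):
    """Gaussian log-likelihood (unit variance, constants dropped) of
    z[start:end] with the mean set to the segment's sample mean."""
    n = end - start
    s = prefix[end] - prefix[start]
    sq = prefix_sq[end] - prefix_sq[start]
    return -0.5 * (sq - s * s / n)


def best_single_changepoint(z):
    """Return (t, gain): t is the size of the first segment that
    maximizes l(z[:t]) + l(z[t:]) - l(z), and gain is that maximum."""
    n = len(z)
    prefix = [0.0] + list(accumulate(z))
    prefix_sq = [0.0] + list(accumulate(x * x for x in z))
    no_change = segment_loglik(prefix, prefix_sq, 0, n)
    best_t, best_gain = None, float("-inf")
    for t in range(1, n):
        gain = (segment_loglik(prefix, prefix_sq, 0, t)
                + segment_loglik(prefix, prefix_sq, t, n)
                - no_change)
        if gain > best_gain:
            best_t, best_gain = t, gain
    return best_t, best_gain


series = [0.1, -0.2, 0.0, 0.3, -0.1, 5.2, 4.9, 5.1, 4.8, 5.0]
t_hat, gain = best_single_changepoint(series)
print(t_hat, gain)  # t_hat is 5: the mean shifts after the fifth point
```

Because the gain is never negative, a changepoint would only be declared when the gain exceeds a chosen threshold. That threshold is a modelling decision not specified here.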
Often there is an obvious changepoint at (or by) some
time point s. For any later time T > s, the most recent
changepoint before T then cannot lie at a time t earlier than s.
The search step can therefore be pruned, so that no t before s
is considered again. If many candidate times t are
pruned, i.e. excluded from the minimization, computational
time is drastically reduced and the algorithm becomes
very efficient. The main downside of this changepoint
detection algorithm is its assumption that any candidate change
that is not recent can be pruned. It can be proved that,
under certain regularity conditions, the expected
computational complexity is O(n). The most important
condition is that the number of changepoints increases
linearly with n.
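The pruning step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes a change-in-mean cost (sum of squared deviations from the segment mean) and a fixed per-changepoint `penalty`; both choices, and the function name, are assumptions.

```python
from itertools import accumulate


def pelt_mean_shift(z, penalty):
    """Penalized changepoint detection with PELT-style pruning.
    Returns the sorted start indices of every segment after the first."""
    n = len(z)
    prefix = [0.0] + list(accumulate(z))
    prefix_sq = [0.0] + list(accumulate(x * x for x in z))

    def cost(start, end):
        # Sum of squared deviations of z[start:end] from its own mean.
        s = prefix[end] - prefix[start]
        return (prefix_sq[end] - prefix_sq[start]) - s * s / (end - start)

    best = [0.0] * (n + 1)        # best[t]: optimal penalized cost of z[:t]
    best[0] = -penalty
    last_change = [0] * (n + 1)   # last_change[t]: start of the final segment
    candidates = [0]              # candidate last changepoints that survive pruning
    for end in range(1, n + 1):
        scored = [(best[s] + cost(s, end) + penalty, s) for s in candidates]
        best[end], last_change[end] = min(scored)
        # Pruning: a candidate s with best[s] + cost(s, end) > best[end]
        # can never be the most recent changepoint for any later end.
        candidates = [s for value, s in scored if value - penalty <= best[end]]
        candidates.append(end)

    # Walk back through the stored segment starts to recover changepoints.
    changepoints = []
    t = n
    while last_change[t] > 0:
        t = last_change[t]
        changepoints.append(t)
    return sorted(changepoints)


series = [0.1, -0.2, 0.0, 0.3, -0.1,
          5.2, 4.9, 5.1, 4.8, 5.0,
          0.2, -0.1, 0.1, 0.0, -0.3]
print(pelt_mean_shift(series, penalty=2.0))  # [5, 10]
```

When changepoints occur regularly, the candidate list stays short and each step does a roughly constant amount of work, which is the intuition behind the expected O(n) cost. A larger `penalty` yields fewer changepoints and is one way to trade detection sensitivity against false positives.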
B. Hierarchical Temporal Memory
Hierarchical temporal memory (HTM) models are a
relatively new development in the field of sequence learning
and aim to resemble cortical algorithms found in the human
neocortex. They are unsupervised models which can learn
from multiple inputs and be trained on streaming data,
similar to how the human brain processes information. The
HTM algorithm adopts several concepts of learning, such as:
a) Hierarchy of Regions
Neurons are arranged in columnar structures across a
hierarchy of 6 layers, and the dendritic connections between
the neurons in each layer ensure that information flows up
from the lower layers, composed primarily of sensory inputs,
and predictions flow downwards from the higher layers to
the lower layers.
Fig. 1. Comparison of ANN Perceptron, Biological Pyramidal Neuron
and HTM Pyramidal Neuron: a) Simplified Perceptron, b) Biological
Pyramidal Neuron, c) HTM Pyramidal Neuron
b) Pyramidal Neuron Structure
The basic processing unit of the HTM algorithm models a
pyramidal neuron [19, 20] which consists of dendritic
segments with numerous synapses arranged along dendrites,
as seen in Fig. 1 b). The perceptron neuron used by
ANNs, shown in Fig. 1 a), is a simplified model with no
dendrites and a few highly precise synapses, and is not
suitable for temporal sequence learning. Other sequence
learning models such as RNNs/LSTMs [insert reference] likewise
use a special memory cell, which is an improvement over the
basic perceptron neuron of ANN models.
c) Sparsity and Input Encoding
Sparsity of neural activity in the human neocortex results in
2% or fewer of the neurons being active at any point in time. HTM
theory uses sparse representations [23] during encoding