Generating Hidden Markov Models from Process Models Through Nonnegative Tensor Factorization

2025-05-06 0 0 913.33KB 19 页 10玖币
侵权投诉
111
Generating Hidden Markov Models from Process Models
Through Nonnegative Tensor Factorization
ERIK W. SKAU, Information Sciences, Los Alamos National Laboratory, USA
ANDREW HOLLIS, Department of Statistics, North Carolina State University, USA
STEPHAN EIDENBENZ, Information Sciences, Los Alamos National Laboratory, USA
KIM Ø. RASMUSSEN, Theoretical Division, Los Alamos National Laboratory, USA
BOIAN S. ALEXANDROV, Theoretical Division, Los Alamos National Laboratory, USA
Monitoring of industrial processes is a critical capability in industry and in government to ensure reliability
of production cycles, quick emergency response, and national security. Process monitoring allows users to
gauge the progress of an organization in an industrial process or predict the degradation or aging of machine
parts in processes taking place at a remote location. Similar to many data science applications, we usually
only have access to limited raw data, such as satellite imagery, short video clips, event logs, and signatures
captured by a small set of sensors. To combat data scarcity, we leverage the knowledge of Subject Matter
Experts (SMEs) who are familiar with the actions of interest. SMEs provide expert knowledge of the essential
activities required for task completion and the resources necessary to carry out each of these activities. Various
process mining techniques have been developed for this type of analysis; typically such approaches combine
theoretical process models built based on domain expert insights with ad-hoc integration of available pieces of
raw data. Here, we introduce a novel mathematically sound method that integrates theoretical process models
(as proposed by SMEs) with interrelated minimal Hidden Markov Models (HMM), built via nonnegative tensor
factorization. Our method consolidates: (a) theoretical process models, (b) HMMs, (c) coupled nonnegative
matrix-tensor factorizations, and (d) custom model selection. To demonstrate our methodology and its abilities,
we apply it on simple synthetic and real world process models.
Additional Key Words and Phrases: Process modeling, Hidden Markov Models, and Nonnegative Tensor
Factorization with Model Selection
1 INTRODUCTION
Process modeling, which is also called process mining, has been developed to analyze complex
business enterprises that involve many people, activities, and resources to guide information systems
engineering. Process models typically obtain their structure from workow logs that describe past
events relating to the enterprise process and specications of how and which resources have been
used [3,70].
When we monitor a specic process in real time, its activities and their temporal sequence are
often not directly observable, and in this sense they remain hidden or latent. For instance, if we
are monitoring an industrial process taking place at a remote/inaccessible location (e.g., building
a industrial complex, such as an oil/liqueed gas terminal), we often have access to only a set of
observables or indicators that underlie the activity, not the activity itself. Additionally, observable
data is often scarce, as with remote sensing and event logs. Domain expert specication of the
sequence of activities and their mean durations is useful to augment scarce data which can be used
in statistical analysis.
Process mining requires a statistical framework capable of accommodating both domain expert
specications and scarce observational data. Given some series of observations from the process, this
statistical framework should allow us to predict what process activity was underway at the time of
Authors’ addresses: Erik W. Skau, Information Sciences, Los Alamos National Laboratory, USA, ewskau@lanl.gov; Andrew
Hollis, Department of Statistics, North Carolina State University, USA, anhollis@ncsu.edu; Stephan Eidenbenz, Information
Sciences, Los Alamos National Laboratory, USA, eidenben@lanl.gov; Kim Ø. Rasmussen, Theoretical Division, Los Alamos
National Laboratory, USA, kor@lanl.gov; Boian S. Alexandrov, Theoretical Division, Los Alamos National Laboratory, USA,
boian@lanl.gov.
arXiv:2210.01060v2 [cs.LG] 26 Apr 2024
111:2 Skau et al.
Theoretical
Process Model
HMM (not minimal)
Discrete Event
Simulation
Joint Probability
Matrix and Tensor
Construction
Coupled Nonnega-
tive Matrix-Tensor
Factorization with
Model Selection
Minimal HMM
Fig. 1. Flow chart illustrating the pipeline to construct a minimal HMM from a theoretical process model.
the observations and how long it will be before the process is complete. Additionally, the statistical
framework should be able to accurately evaluate the likelihood of competing domain expert process
models based on limited observations. To answer these questions the statistical framework should
describe the dynamics of the underlying activities (which are not directly observable) and how
these hidden activities are related to the available observational data.
One such statistical framework is the Hidden Markov Model (HMM) introduced by Baum et al.
[
12
]. HMMs are a broadly used method for modeling data sequences and have been successfully
applied in various elds, such as: signal processing, machine learning, speech recognition, hand-
written character recognition, and in many other elds, see e.g., [
2
,
11
,
20
,
47
,
50
,
51
,
54
,
64
,
73
]. One
of the well-known problems with HMMs is model selection, that is, how to estimate the optimal
number of its hidden states. The HMM topology and number of hidden states have to be known
prior to its utilization, and various heuristic rules have been proposed to estimate the HMM latent
structure, see e.g., [
15
]. Thus, the challenge inherent in using HMMs is that we must identify the
minimal number of hidden states induced by the process model. We can then estimate the HMM
parameters that describe the latent dynamics and the observational probabilities associated with
each of the hidden states.
In this paper, we present a new method for building a minimal HMM based on a theoretical
process model proposed by domain experts. Our method integrates: (a) theoretical process models,
(b) HMMs, (c) coupled nonnegative matrix-tensor factorizations, and (d) custom model selection.
The relative relations between these components can be seen in Figure 1.
In summary, our contributions are:
We demonstrate how to directly use the structure of a theoretical process model to build
an HMM, where combinations of the process model activities are the HMM hidden states.
One problem with this approach is the number of hidden states may not be minimal. Such
misspecication can lead to a poor HMM parameterization, erroneous HMM predictions,
and poor identiability.
We show how to build an HMM model via coupled nonnegative matrix-tensor factorization,
based on pairwise co-occurrence events [
21
,
26
] generated from discrete event simulations
[62] of a theoretical process model.
Using the fact that tensor factorizations (and more specically tensor ranks [
35
]) can be
used to derive the optimal (i.e., the minimal) number of HMM states [
10
,
69
], we utilize
here our new method for Nonnegative Matrix Factorization (NMF) and Nonnegative Tensor
Factorization (NTF) model selection [
5
,
6
], and use it to determine the number of the hidden
states based on the stability of the tensor decomposition.
Finally, we demonstrate the results of our new method on synthetic and real world examples.
Generating HMMs from Process Models Through Nonnegative Tensor Factorization 111:3
2 RELATED WORKS
In general, not much previous work exists that explores the connection between process modeling
and HMMs. An HMM was proposed to sequence clustering in process mining [
22
]. In [
18
], the
authors constructed an HMM for resource allocation, and used an HMM to identify the observations
based on the resources utilized by each activity. HMMs have also been used for the representation
of process mining of parallel business processes [
59
]. Similarly, HMMs were used for extraction
of activity information recorded in the bank event log, for calculating the fraud probabilities [
52
]
and for detection of anomalies in business processes [
78
]. The authors of [
29
] applied HMMs
for business process management mining for failure detection from event logs. In the area of
healthcare and medicine, HMMs were used for complex healthcare process analysis and detection
of hidden activities [
9
]. Recently it was demonstrated how to build a semi-Markov model based on
event sequences, [
31
], and to use it for automatic activity discovery and analysis. Finally, although
there are several works relating minimal HMMs and tensor factorization (see for example [
46
]
and references therein), to the best of our knowledge, our work is the rst one that explores the
connection between theoretical process modeling and tensor factorization. There are numerous
works highlighting the importance of accurate tensor factorizations and the diculty of nonnegative
tensor factorizations [
34
,
38
,
76
,
77
]. In this work we simultaneously decompose matrices and
tensors with a joint least squared objective function. This type of minimization objective is often
called coupled matrix-tensor factorization [
1
,
61
], though every instance requires slightly dierent
formulations as the factorization structures vary between applications.
One necessity of HMMs is the determination of the latent dimension. When using matrix or
tensor factorization to derive HMMs, this dimension is called the rank and is typically denoted
with "k". The rank is typically elusive in practical applications, demanding an estimation derived
exclusively from the data. Despite the absence of a polynomial-time algorithm to ascertain a tensor’s
rank - a task proven to be NP-hard [
30
] - various heuristic methodologies have been developed for
nonnegative rank estimation. One prominent technique is the core consistency diagnostic, known as
CORCONDIA [
16
]. This method evaluates deviations from a super-diagonal core-tensor, estimating
the number of components, k, where this deviation reaches a minimum. Despite its intuitive appeal
and widespread application, CORCONDIA lacks a solid theoretical foundation. Particularly for
nonnegative tensor factorization, it encounters challenges when contrasting a nonnegative Tucker
Decomposition with the corresponding nonnegative Canonical Polyadic Decomposition (CPD). For
the complications of these two nonnegative decompositions see [
4
]. An alternative strategy is the
Bayesian approach, incorporating Automatic Relevance Determination (ARD). Initially devised for
neural networks [
39
] and subsequently adapted for Bayesian PCA [
14
], and then to NMF [
44
,
67
],
and tensors [
43
], ARD is an alternative approach to identify the number of latent dimensions. Other
strategies employ diverse statistical criteria, including minimum description length [
37
], Akaike
Information Criterion (AIC) [
56
], and Bayesian Information Criterion (BIC) [
63
] with partial success.
Determining the optimal nonnegative model - that is, the nonnegative rank - remains a formidable
challenge, with varying approaches often leading to disparate solutions [
24
]. The methodology
employed in our study hinges on the well-posed nature of the low-rank approximation of the
nonnegative CPD, where the solutions are almost always unique [
49
]. We determine the number of
latent components, k, by analyzing the stability of the solutions from factorizations across a set
of plausible k values [
17
]. Our method demonstrated superior ecacy when tested on numerous
synthetic datasets with pre-established latent features [
45
]. Additionally, our approach leverages
the theorem that a unique NMF solution, for a matrix: X = WH, when X is subjected to minor noise
or perturbations, results in minimal disruptions to the factors W and H [36].
111:4 Skau et al.
Our method has seen successful application in various domains, including the decomposition of
the largest human cancer genome data [
8
], synthetic cancer data [
28
], microphase separation of
block copolymers, via Nonnegative CPD [
7
], analysis of international trade ows [
68
] and exabyte
synthetic data [
13
], via RESCAL, Nonnegative Tucker decompositions for images [
48
] as well as
for large set of trajectories of the reaction diusion equation [
74
], among others. Our methodology
has also been utilized for model selection in Boolean [
23
], Quaternonian decompositions [
57
], and
not-multilinear decompositions[65,71].
3
PROCESS MODELING, HIDDEN MARKOV MODELS, AND NONNEGATIVE TENSOR
FACTORIZATION WITH MODEL SELECTION
3.1 Theoretical Process Modeling
Constructing a theoretical process model requires specifying the set of activities making up the
process, the order in which these activities take place, the expected duration of each activity, and the
resources necessary to execute each activity. In this endeavor, we rely on subject matter expertise to
craft the theoretical process model that incorporates insights and observations provided by experts
knowledgeable about the process. We utilize the discrete event simulation tool, Simian simulation
engine [58], to actively simulate a process model.
Simian relies on three main components to dene a process model: the precedence constraints,
the activity duration distributions, and the required resources for each activity. Simian takes a
Directed Acyclic Graph (DAG)
𝐺=(𝑉 , 𝐸)
to dene the precedence constraints, where the vertex
set
𝑉={𝑎1, . . . , 𝑎𝑛}
represent activities of the process, and the edge set
𝐸𝑉2
represent precedent
constraints. For example, if edge
(𝑎𝑖, 𝑎𝑗) 𝐸
, activity
𝑎𝑖
must be have been completed before process
𝑎𝑗
can be started. Each vertex in the DAG has a distribution of feasible activity durations. Each
simulation picks values from the distribution for the duration of each activity
𝑎𝑖
. We constrain our
models to have exponentially distributed durations, so the duration of activity
𝑎𝑖1
𝛽𝑖𝑒𝑥/𝛽𝑖
. Simian
also requires the specication of a set of resources
𝑅𝑖𝑅
required for the execution of each activity,
where
𝑅={𝑟1, . . . , 𝑟𝑚}
is a global set of resources. In our study, we assign an innite quantity of
each resource to
𝑅
, ensuring that limited resources will never impede an activity from starting.
Additionally, we assign a probability that each resource will be observed for each combination of
activities simulating the stochastic nature of realistic scenarios. We utilize the resources necessary
for each activity as observable quantities.
A Simian simulation ensures compliance with the activity precedence’s specied by the DAG,
assigns durations to each individual activity, and veries the availability of required resources for
each activity to begin. In each simulation, Simian provides data regarding the start and end times
of each activity. This information can be organized to create timelines of both the unobservable
process model activities and the observable quantities of resource usage. For complex process
models, every simulation run by Simian will have varying durations for each activity, resulting
in diverse combinations of ongoing activities throughout the process. That is to say, the details
of which activities run in parallel, and for how long, will change from simulation to simulation
due to the sampled duration choices. Thus, we resort to statistical analysis on large ensembles of
simulations to calculate probabilities of transitioning between each resource usage state. For our
statistical analysis we join a collection of Simian simulationss so that the end of each simulation
leads into the beginning of the next simulation, eectively making the process model repeat ad
innitum.
摘要:

111GeneratingHiddenMarkovModelsfromProcessModelsThroughNonnegativeTensorFactorizationERIKW.SKAU,InformationSciences,LosAlamosNationalLaboratory,USAANDREWHOLLIS,DepartmentofStatistics,NorthCarolinaStateUniversity,USASTEPHANEIDENBENZ,InformationSciences,LosAlamosNationalLaboratory,USAKIMØ.RASMUSSEN,Th...

展开>> 收起<<
Generating Hidden Markov Models from Process Models Through Nonnegative Tensor Factorization.pdf

共19页,预览4页

还剩页未读, 继续阅读

声明:本站为文档C2C交易模式,即用户上传的文档直接被用户下载,本站只是中间服务平台,本站所有文档下载所得的收益归上传人(含作者)所有。玖贝云文库仅提供信息存储空间,仅对用户上传内容的表现方式做保护处理,对上载内容本身不做任何修改或编辑。若文档所含内容侵犯了您的版权或隐私,请立即通知玖贝云文库,我们立即给予删除!
分类:图书资源 价格:10玖币 属性:19 页 大小:913.33KB 格式:PDF 时间:2025-05-06

开通VIP享超值会员特权

  • 多端同步记录
  • 高速下载文档
  • 免费文档工具
  • 分享文档赚钱
  • 每日登录抽奖
  • 优质衍生服务
/ 19
客服
关注