Generating Hidden Markov Models from Process Models Through Nonnegative Tensor Factorization

2025-05-06 0 0 913.33KB 19 页 10玖币

侵权投诉

111

Generating Hidden Markov Models from Process Models

Through Nonnegative Tensor Factorization

ERIK W. SKAU, Information Sciences, Los Alamos National Laboratory, USA

ANDREW HOLLIS, Department of Statistics, North Carolina State University, USA

STEPHAN EIDENBENZ, Information Sciences, Los Alamos National Laboratory, USA

KIM Ø. RASMUSSEN, Theoretical Division, Los Alamos National Laboratory, USA

BOIAN S. ALEXANDROV, Theoretical Division, Los Alamos National Laboratory, USA

Monitoring of industrial processes is a critical capability in industry and in government to ensure reliability

of production cycles, quick emergency response, and national security. Process monitoring allows users to

gauge the progress of an organization in an industrial process or predict the degradation or aging of machine

parts in processes taking place at a remote location. Similar to many data science applications, we usually

only have access to limited raw data, such as satellite imagery, short video clips, event logs, and signatures

captured by a small set of sensors. To combat data scarcity, we leverage the knowledge of Subject Matter

Experts (SMEs) who are familiar with the actions of interest. SMEs provide expert knowledge of the essential

activities required for task completion and the resources necessary to carry out each of these activities. Various

process mining techniques have been developed for this type of analysis; typically such approaches combine

theoretical process models built based on domain expert insights with ad-hoc integration of available pieces of

raw data. Here, we introduce a novel mathematically sound method that integrates theoretical process models

(as proposed by SMEs) with interrelated minimal Hidden Markov Models (HMM), built via nonnegative tensor

factorization. Our method consolidates: (a) theoretical process models, (b) HMMs, (c) coupled nonnegative

matrix-tensor factorizations, and (d) custom model selection. To demonstrate our methodology and its abilities,

we apply it on simple synthetic and real world process models.

Additional Key Words and Phrases: Process modeling, Hidden Markov Models, and Nonnegative Tensor

Factorization with Model Selection

1 INTRODUCTION

Process modeling, which is also called process mining, has been developed to analyze complex

business enterprises that involve many people, activities, and resources to guide information systems

engineering. Process models typically obtain their structure from workow logs that describe past

events relating to the enterprise process and specications of how and which resources have been

used [3,70].

When we monitor a specic process in real time, its activities and their temporal sequence are

often not directly observable, and in this sense they remain hidden or latent. For instance, if we

are monitoring an industrial process taking place at a remote/inaccessible location (e.g., building

a industrial complex, such as an oil/liqueed gas terminal), we often have access to only a set of

observables or indicators that underlie the activity, not the activity itself. Additionally, observable

data is often scarce, as with remote sensing and event logs. Domain expert specication of the

sequence of activities and their mean durations is useful to augment scarce data which can be used

in statistical analysis.

Process mining requires a statistical framework capable of accommodating both domain expert

specications and scarce observational data. Given some series of observations from the process, this

statistical framework should allow us to predict what process activity was underway at the time of

Authors’ addresses: Erik W. Skau, Information Sciences, Los Alamos National Laboratory, USA, ewskau@lanl.gov; Andrew

Hollis, Department of Statistics, North Carolina State University, USA, anhollis@ncsu.edu; Stephan Eidenbenz, Information

Sciences, Los Alamos National Laboratory, USA, eidenben@lanl.gov; Kim Ø. Rasmussen, Theoretical Division, Los Alamos

National Laboratory, USA, kor@lanl.gov; Boian S. Alexandrov, Theoretical Division, Los Alamos National Laboratory, USA,

boian@lanl.gov.

arXiv:2210.01060v2 [cs.LG] 26 Apr 2024

111:2 Skau et al.

Theoretical

Process Model

HMM (not minimal)

Discrete Event

Simulation

Joint Probability

Matrix and Tensor

Construction

Coupled Nonnega-

tive Matrix-Tensor

Factorization with

Model Selection

Minimal HMM

Fig. 1. Flow chart illustrating the pipeline to construct a minimal HMM from a theoretical process model.

the observations and how long it will be before the process is complete. Additionally, the statistical

framework should be able to accurately evaluate the likelihood of competing domain expert process

models based on limited observations. To answer these questions the statistical framework should

describe the dynamics of the underlying activities (which are not directly observable) and how

these hidden activities are related to the available observational data.

One such statistical framework is the Hidden Markov Model (HMM) introduced by Baum et al.

[

]. HMMs are a broadly used method for modeling data sequences and have been successfully

applied in various elds, such as: signal processing, machine learning, speech recognition, hand-

written character recognition, and in many other elds, see e.g., [

]. One

of the well-known problems with HMMs is model selection, that is, how to estimate the optimal

number of its hidden states. The HMM topology and number of hidden states have to be known

prior to its utilization, and various heuristic rules have been proposed to estimate the HMM latent

structure, see e.g., [

]. Thus, the challenge inherent in using HMMs is that we must identify the

minimal number of hidden states induced by the process model. We can then estimate the HMM

parameters that describe the latent dynamics and the observational probabilities associated with

each of the hidden states.

In this paper, we present a new method for building a minimal HMM based on a theoretical

process model proposed by domain experts. Our method integrates: (a) theoretical process models,

(b) HMMs, (c) coupled nonnegative matrix-tensor factorizations, and (d) custom model selection.

The relative relations between these components can be seen in Figure 1.

In summary, our contributions are:

•

We demonstrate how to directly use the structure of a theoretical process model to build

an HMM, where combinations of the process model activities are the HMM hidden states.

One problem with this approach is the number of hidden states may not be minimal. Such

misspecication can lead to a poor HMM parameterization, erroneous HMM predictions,

and poor identiability.

•

We show how to build an HMM model via coupled nonnegative matrix-tensor factorization,

based on pairwise co-occurrence events [

] generated from discrete event simulations

[62] of a theoretical process model.

•

Using the fact that tensor factorizations (and more specically tensor ranks [

]) can be

used to derive the optimal (i.e., the minimal) number of HMM states [

], we utilize

here our new method for Nonnegative Matrix Factorization (NMF) and Nonnegative Tensor

Factorization (NTF) model selection [

], and use it to determine the number of the hidden

states based on the stability of the tensor decomposition.

•

Finally, we demonstrate the results of our new method on synthetic and real world examples.

Generating HMMs from Process Models Through Nonnegative Tensor Factorization 111:3

2 RELATED WORKS

In general, not much previous work exists that explores the connection between process modeling

and HMMs. An HMM was proposed to sequence clustering in process mining [

]. In [

], the

authors constructed an HMM for resource allocation, and used an HMM to identify the observations

based on the resources utilized by each activity. HMMs have also been used for the representation

of process mining of parallel business processes [

]. Similarly, HMMs were used for extraction

of activity information recorded in the bank event log, for calculating the fraud probabilities [

]

and for detection of anomalies in business processes [

]. The authors of [

] applied HMMs

for business process management mining for failure detection from event logs. In the area of

healthcare and medicine, HMMs were used for complex healthcare process analysis and detection

of hidden activities [

]. Recently it was demonstrated how to build a semi-Markov model based on

event sequences, [

], and to use it for automatic activity discovery and analysis. Finally, although

there are several works relating minimal HMMs and tensor factorization (see for example [

]

and references therein), to the best of our knowledge, our work is the rst one that explores the

connection between theoretical process modeling and tensor factorization. There are numerous

works highlighting the importance of accurate tensor factorizations and the diculty of nonnegative

tensor factorizations [

]. In this work we simultaneously decompose matrices and

tensors with a joint least squared objective function. This type of minimization objective is often

called coupled matrix-tensor factorization [

], though every instance requires slightly dierent

formulations as the factorization structures vary between applications.

One necessity of HMMs is the determination of the latent dimension. When using matrix or

tensor factorization to derive HMMs, this dimension is called the rank and is typically denoted

with "k". The rank is typically elusive in practical applications, demanding an estimation derived

exclusively from the data. Despite the absence of a polynomial-time algorithm to ascertain a tensor’s

rank - a task proven to be NP-hard [

] - various heuristic methodologies have been developed for

nonnegative rank estimation. One prominent technique is the core consistency diagnostic, known as

CORCONDIA [

]. This method evaluates deviations from a super-diagonal core-tensor, estimating

the number of components, k, where this deviation reaches a minimum. Despite its intuitive appeal

and widespread application, CORCONDIA lacks a solid theoretical foundation. Particularly for

nonnegative tensor factorization, it encounters challenges when contrasting a nonnegative Tucker

Decomposition with the corresponding nonnegative Canonical Polyadic Decomposition (CPD). For

the complications of these two nonnegative decompositions see [

]. An alternative strategy is the

Bayesian approach, incorporating Automatic Relevance Determination (ARD). Initially devised for

neural networks [

] and subsequently adapted for Bayesian PCA [

], and then to NMF [

and tensors [

], ARD is an alternative approach to identify the number of latent dimensions. Other

strategies employ diverse statistical criteria, including minimum description length [

], Akaike

Information Criterion (AIC) [

], and Bayesian Information Criterion (BIC) [

] with partial success.

Determining the optimal nonnegative model - that is, the nonnegative rank - remains a formidable

challenge, with varying approaches often leading to disparate solutions [

]. The methodology

employed in our study hinges on the well-posed nature of the low-rank approximation of the

nonnegative CPD, where the solutions are almost always unique [

]. We determine the number of

latent components, k, by analyzing the stability of the solutions from factorizations across a set

of plausible k values [

]. Our method demonstrated superior ecacy when tested on numerous

synthetic datasets with pre-established latent features [

]. Additionally, our approach leverages

the theorem that a unique NMF solution, for a matrix: X = WH, when X is subjected to minor noise

or perturbations, results in minimal disruptions to the factors W and H [36].

111:4 Skau et al.

Our method has seen successful application in various domains, including the decomposition of

the largest human cancer genome data [

], synthetic cancer data [

], microphase separation of

block copolymers, via Nonnegative CPD [

], analysis of international trade ows [

] and exabyte

synthetic data [

], via RESCAL, Nonnegative Tucker decompositions for images [

] as well as

for large set of trajectories of the reaction diusion equation [

], among others. Our methodology

has also been utilized for model selection in Boolean [

], Quaternonian decompositions [

], and

not-multilinear decompositions[65,71].

PROCESS MODELING, HIDDEN MARKOV MODELS, AND NONNEGATIVE TENSOR

FACTORIZATION WITH MODEL SELECTION

3.1 Theoretical Process Modeling

Constructing a theoretical process model requires specifying the set of activities making up the

process, the order in which these activities take place, the expected duration of each activity, and the

resources necessary to execute each activity. In this endeavor, we rely on subject matter expertise to

craft the theoretical process model that incorporates insights and observations provided by experts

knowledgeable about the process. We utilize the discrete event simulation tool, Simian simulation

engine [58], to actively simulate a process model.

Simian relies on three main components to dene a process model: the precedence constraints,

the activity duration distributions, and the required resources for each activity. Simian takes a

Directed Acyclic Graph (DAG)

𝐺=(𝑉 , 𝐸)

to dene the precedence constraints, where the vertex

set

𝑉={𝑎1, . . . , 𝑎𝑛}

represent activities of the process, and the edge set

𝐸⊆𝑉2

represent precedent

constraints. For example, if edge

(𝑎𝑖, 𝑎𝑗) ∈ 𝐸

, activity

𝑎𝑖

must be have been completed before process

𝑎𝑗

can be started. Each vertex in the DAG has a distribution of feasible activity durations. Each

simulation picks values from the distribution for the duration of each activity

𝑎𝑖

. We constrain our

models to have exponentially distributed durations, so the duration of activity

𝑎𝑖∼1

𝛽𝑖𝑒−𝑥/𝛽𝑖

. Simian

also requires the specication of a set of resources

𝑅𝑖⊆𝑅

required for the execution of each activity,

where

𝑅={𝑟1, . . . , 𝑟𝑚}

is a global set of resources. In our study, we assign an innite quantity of

each resource to

𝑅

, ensuring that limited resources will never impede an activity from starting.

Additionally, we assign a probability that each resource will be observed for each combination of

activities simulating the stochastic nature of realistic scenarios. We utilize the resources necessary

for each activity as observable quantities.

A Simian simulation ensures compliance with the activity precedence’s specied by the DAG,

assigns durations to each individual activity, and veries the availability of required resources for

each activity to begin. In each simulation, Simian provides data regarding the start and end times

of each activity. This information can be organized to create timelines of both the unobservable

process model activities and the observable quantities of resource usage. For complex process

models, every simulation run by Simian will have varying durations for each activity, resulting

in diverse combinations of ongoing activities throughout the process. That is to say, the details

of which activities run in parallel, and for how long, will change from simulation to simulation

due to the sampled duration choices. Thus, we resort to statistical analysis on large ensembles of

simulations to calculate probabilities of transitioning between each resource usage state. For our

statistical analysis we join a collection of Simian simulationss so that the end of each simulation

leads into the beginning of the next simulation, eectively making the process model repeat ad

innitum.

文档加载中……请稍候！
如果长时间未打开，您也可以点击刷新试试。

下载文档到电脑，查找使用更方便

10 玖币 0人已下载

立即下载

摘要：

111GeneratingHiddenMarkovModelsfromProcessModelsThroughNonnegativeTensorFactorizationERIKW.SKAU,InformationSciences,LosAlamosNationalLaboratory,USAANDREWHOLLIS,DepartmentofStatistics,NorthCarolinaStateUniversity,USASTEPHANEIDENBENZ,InformationSciences,LosAlamosNationalLaboratory,USAKIMØ.RASMUSSEN,Th...

展开>> 收起<<

Generating Hidden Markov Models from Process Models Through Nonnegative Tensor Factorization.pdf

共19页,预览4页

还剩页未读，继续阅读

声明：本站为文档C2C交易模式，即用户上传的文档直接被用户下载，本站只是中间服务平台，本站所有文档下载所得的收益归上传人(含作者)所有。玖贝云文库仅提供信息存储空间，仅对用户上传内容的表现方式做保护处理，对上载内容本身不做任何修改或编辑。若文档所含内容侵犯了您的版权或隐私，请立即通知玖贝云文库，我们立即给予删除！

Generating Hidden Markov Models from Process Models Through Nonnegative Tensor Factorization

相关推荐

开通VIP享超值会员特权

作者详情

相关内容

热门标签

举报选择: