
Generating Hidden Markov Models from Process Models
Through Nonnegative Tensor Factorization
ERIK W. SKAU, Information Sciences, Los Alamos National Laboratory, USA
ANDREW HOLLIS, Department of Statistics, North Carolina State University, USA
STEPHAN EIDENBENZ, Information Sciences, Los Alamos National Laboratory, USA
KIM Ø. RASMUSSEN, Theoretical Division, Los Alamos National Laboratory, USA
BOIAN S. ALEXANDROV, Theoretical Division, Los Alamos National Laboratory, USA
Monitoring of industrial processes is a critical capability in industry and in government to ensure reliability
of production cycles, quick emergency response, and national security. Process monitoring allows users to
gauge the progress of an organization in an industrial process or predict the degradation or aging of machine
parts in processes taking place at a remote location. As in many data science applications, we usually
only have access to limited raw data, such as satellite imagery, short video clips, event logs, and signatures
captured by a small set of sensors. To combat data scarcity, we leverage the knowledge of Subject Matter
Experts (SMEs) who are familiar with the actions of interest. SMEs provide expert knowledge of the essential
activities required for task completion and the resources necessary to carry out each of these activities. Various
process mining techniques have been developed for this type of analysis; typically such approaches combine
theoretical process models built from domain expert insights with ad hoc integration of available pieces of
raw data. Here, we introduce a novel, mathematically sound method that integrates theoretical process models
(as proposed by SMEs) with interrelated minimal Hidden Markov Models (HMMs), built via nonnegative tensor
factorization. Our method consolidates: (a) theoretical process models, (b) HMMs, (c) coupled nonnegative
matrix-tensor factorizations, and (d) custom model selection. To demonstrate our methodology and its abilities,
we apply it to simple synthetic and real-world process models.
Additional Key Words and Phrases: Process modeling, Hidden Markov Models, and Nonnegative Tensor
Factorization with Model Selection
1 INTRODUCTION
Process modeling, which is also called process mining, has been developed to analyze complex
business enterprises that involve many people, activities, and resources to guide information systems
engineering. Process models typically obtain their structure from workflow logs that describe past
events relating to the enterprise process and specifications of how and which resources have been
used [3,70].
When we monitor a specific process in real time, its activities and their temporal sequence are
often not directly observable, and in this sense they remain hidden or latent. For instance, if we
are monitoring an industrial process taking place at a remote/inaccessible location (e.g., building
an industrial complex, such as an oil/liquefied gas terminal), we often have access to only a set of
observables or indicators that underlie the activity, not the activity itself. Additionally, observable
data is often scarce, as with remote sensing and event logs. Domain expert specifications of the
sequence of activities and their mean durations are therefore useful for augmenting the scarce data
used in statistical analysis.
Process mining requires a statistical framework capable of accommodating both domain expert
specifications and scarce observational data. Given some series of observations from the process, this
statistical framework should allow us to predict what process activity was underway at the time of
Authors’ addresses: Erik W. Skau, Information Sciences, Los Alamos National Laboratory, USA, ewskau@lanl.gov; Andrew
Hollis, Department of Statistics, North Carolina State University, USA, anhollis@ncsu.edu; Stephan Eidenbenz, Information
Sciences, Los Alamos National Laboratory, USA, eidenben@lanl.gov; Kim Ø. Rasmussen, Theoretical Division, Los Alamos
National Laboratory, USA, kor@lanl.gov; Boian S. Alexandrov, Theoretical Division, Los Alamos National Laboratory, USA,
boian@lanl.gov.
arXiv:2210.01060v2 [cs.LG] 26 Apr 2024