Learning Causal Graphs in Manufacturing Domains
using Structural Equation Models
Maximilian Kertel
Technology Development Battery Cell
BMW Group
Munich, Germany
maximilian.kertel@bmw.de
Stefan Harmeling
Department of Computer Science
TU Dortmund University
Dortmund, Germany
stefan.harmeling@tu-dortmund.de
Markus Pauly
Department of Statistics
TU Dortmund University
Dortmund, Germany
Research Center Trustworthy
Data Science and Security
UA Ruhr, Germany
pauly@statistik.tu-dortmund.de
Abstract—Many production processes are characterized by numerous and complex cause-and-effect relationships. Since these are only partially known, they pose a challenge to effective process control. In this work we show how Structural Equation Models can be used to derive cause-and-effect relationships from a combination of prior knowledge and process data in the manufacturing domain. In contrast to existing applications, we do not assume linear relationships, which leads to more informative results.
Index Terms—Causal Discovery, Bayesian Networks, Industry
4.0
I. INTRODUCTION
To be published in the Proceedings of IEEE AI4I 2022.
Complex manufacturing processes, such as those for battery cells, exhibit high scrap rates and thus high production costs and large environmental footprints. One of the driving factors is the missing knowledge about the interdependencies between the process parameters, the intermediate product properties and the quality characteristics [1]. Together, we call these the cause-and-effect relationships (CERs). CERs can be visualized as a network with the process and product characteristics as nodes and the CERs as directed edges [1], [2]. It is the goal of our paper to unify expert knowledge and process data to derive such a network, which allows the visual identification of
• root causes of erroneous products,
• relevant parameters for process control during successive production steps, and
• important characteristics to predict the quality of the final product.
In complex manufacturing domains, CERs form a linked mesh of hundreds of involved factors [1]. Typically, CERs are derived by running Designs of Experiments (DOEs). However, DOEs can be time-consuming, and the production line has to be stopped in the meantime, leading to prohibitively high costs. Moreover, if there are many potential CERs, the number of required experiments can become infeasible.
At the same time, the Internet of Things (IoT) allows data processing and storage along the whole production line, leading
to a vast amount of accessible information. It is thus desirable
to derive the CERs from the existing observational (or non-
experimental) data. For this purpose, Bayesian Networks can
be used to unify expert knowledge and data. From these, CERs
can be derived under the assumption of causal sufficiency [3].
This approach is called causal discovery or structure learning.
The most common example in the manufacturing domain [4]–[6] is the PC algorithm [3]. This algorithm relies on the assumption of faithfulness and on efficient statistical tests for conditional independence. In principle, the PC algorithm can be applied with any test for conditional independence. However, existing nonparametric tests do not scale well [7], [8]. Most applications of the PC algorithm therefore either discretize the measurements or approximate the joint distribution of the variables by a multivariate normal distribution, since fast tests for conditional independence exist for discrete and for normally distributed data. However, the former leads to a loss of information, while the latter requires a linear dependency between the variables to be exact. For manufacturing data, this is most likely a misspecification [9]. Simulation studies show that the performance of the PC algorithm can be poor in the presence of non-linearity [10]. This questions the application of the PC algorithm to large or high-dimensional manufacturing data.
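For illustration, the Gaussian conditional-independence test commonly used with the PC algorithm can be sketched as a partial-correlation test with Fisher's z-transform. This is a minimal, hypothetical sketch, not the paper's code; note that it detects only linear dependencies, which is exactly the limitation discussed above:

```python
import numpy as np
from scipy import stats

def fisher_z_test(data, i, j, S):
    """p-value for X_i independent of X_j given X_S, assuming
    multivariate normality (exact only for linear dependencies)."""
    sub = data[:, [i, j] + list(S)]
    prec = np.linalg.inv(np.corrcoef(sub, rowvar=False))  # precision matrix
    # Partial correlation of the first two variables given the rest.
    r = -prec[0, 1] / np.sqrt(prec[0, 0] * prec[1, 1])
    z = 0.5 * np.log((1 + r) / (1 - r))                   # Fisher z-transform
    stat = np.sqrt(data.shape[0] - len(S) - 3) * abs(z)
    return 2 * (1 - stats.norm.cdf(stat))                 # two-sided p-value

# Toy chain X -> Z -> Y: X and Y are dependent, but independent given Z.
rng = np.random.default_rng(0)
x = rng.normal(size=2000)
z = x + rng.normal(size=2000)
y = z + rng.normal(size=2000)
data = np.column_stack([x, y, z])
print(fisher_z_test(data, 0, 1, []))   # small p-value: dependent
print(fisher_z_test(data, 0, 1, [2]))  # larger p-value: independent given Z
```

If the edge functions were nonlinear (e.g. $y = z^2 + \varepsilon$), the partial correlation could be close to zero despite a strong dependency, illustrating why this test can mislead the PC algorithm.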
In recent years, Structural Equation Models (SEMs), which can incorporate arbitrary functional relationships, have increasingly been proposed to derive Causal Bayesian Networks. They replace the assumption of faithfulness with an assumption on the functional form of the conditional distributions (see Equation (1)). While the PC algorithm returns a set of graphs, methods based on SEMs often derive a single graph. To the best of our knowledge, we are the first to apply SEMs to derive such graphical models in the manufacturing domain.
The paper is structured as follows. In Section II we present
potential prior knowledge and available data in manufacturing
domains. We continue in Section III by reviewing Bayesian
Networks and SEMs and explain Causal Additive Models
(CAM). In Section IV we present an extension of CAM, called
TCAM, which efficiently incorporates prior knowledge. We
apply our method in Section V to process data of the assembly
of battery modules at BMW. We conclude in Section VI.
arXiv:2210.14573v1 [stat.ML] 26 Oct 2022
II. DATA AND CHALLENGES IN COMPLEX
MANUFACTURING DOMAINS
In this section we describe the data sources and propose a
preprocessing of the data. Then, we explain the broad prior
knowledge in manufacturing domains. Finally, we mention
common challenges with production data.
A. Data Sources along the Production Line
The assembly of products consists of production lines, which in turn contain several stations that are passed in a fixed order and where the process steps are carried out. During those process steps the piece is transformed or combined with other parts in order to achieve a predefined outcome. All involved parts are assigned unique identifiers. Data of different types is collected along the production process:
• Process data: the stations take measurements of the involved parts (e.g. the thickness of the piece) and the parameters of the machine (e.g. the weight of the applied glue). End-of-Line (EoL) tests take additional quality measurements of the intermediate or final products.
• Station information: at some production steps the pieces are spread out to identical stations, such that parts can be processed in parallel, and every piece is assigned to one of these stations.
• Bill of Material (BoM): the BoM contains the information which pieces were merged together and in which position they have been worked in.
• Supplier data: suppliers transmit data on the provided goods.
The preprocessing of the data, which is depicted in Figure 1,
consists of the following steps:
1) Collect the data for every intermediate product.
2) Iteratively merge the data of all subcomponents of a final
product.
Measurements of identical subcomponents, which are placed
in the same position, can be found in the same column.
Eventually, the final tabular data set contains all measurements
that can be associated with a final product.
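The two preprocessing steps can be sketched with pandas; the schema below (`part_id`, `final_id`, `position`, and the measurement columns) is entirely hypothetical and only illustrates the merge logic, not the actual data model:

```python
import pandas as pd

# Hypothetical station measurements, keyed by part identifier.
station_a = pd.DataFrame({"part_id": ["p1", "p2"], "thickness": [1.9, 2.1]})
station_b = pd.DataFrame({"part_id": ["p1", "p2"], "glue_weight": [5.0, 5.2]})

# Hypothetical Bill of Material: which subcomponent sits in which
# position of which final product.
bom = pd.DataFrame({
    "final_id": ["f1", "f1", "f2", "f2"],
    "part_id":  ["p1", "p2", "p3", "p4"],
    "position": [1, 2, 1, 2],
})

# Step 1: collect all measurements for every intermediate product.
parts = station_a.merge(station_b, on="part_id", how="outer")

# Step 2: merge subcomponent data into one row per final product;
# identical subcomponents in the same position share a column.
merged = bom.merge(parts, on="part_id", how="left")
wide = merged.pivot(index="final_id", columns="position",
                    values=["thickness", "glue_weight"])
print(wide)
```

In a real production line, step 2 would be applied iteratively along the BoM hierarchy until every measurement is attached to a final product.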
B. Prior Knowledge
As the stations are passed in a fixed order, we know that CERs across different stations can only act forward in time. Additionally, in many manufacturing organizations, tools such as the Failure Mode and Effects Analysis (FMEA) [11] are implemented to extract expert knowledge on CERs in the production process and to provide this information in a structured form.
C. Challenges of Data Analysis in Manufacturing
Often, similar information is recorded multiple times along the production line, leading to multicollinearity [4]. Also, sensors might deliver non-informative data by recording implausible values. Industrial data is also reported to drift over time. However, even in shorter time intervals, the data of a series production contains thousands of observations. This distinguishes the manufacturing domain from other applications of causal discovery such as medicine, genetics, or the social sciences.
III. STRUCTURE LEARNING OF GRAPHICAL MODELS
A. Some Preliminaries on Graphical Models
Let $G = (V, E)$ be a directed acyclic graph (DAG) [12, Chapter 6] with nodes $V = (V_1, \ldots, V_p)$ and edges $E$. The node $V_i$ is called a parent of $V_j$ if the edge $V_i \to V_j$ is in $E$. We denote the set of all parents of $V_j$ as $\mathrm{pa}(V_j)$. A tuple of nodes $(V_{j_1}, \ldots, V_{j_\ell})$, such that $V_{j_k}$ is a parent of $V_{j_{k+1}}$ for all $k = 1, \ldots, \ell - 1$, is called a directed path. Nodes that can be reached from $V_j$ through a directed path are called the descendants of $V_j$.
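These definitions can be made concrete with a small, hypothetical DAG; the helper functions below are our own sketch, not part of the cited references:

```python
# Minimal DAG: edges stored as (parent, child) pairs, assumed acyclic.
edges = {("V1", "V2"), ("V1", "V3"), ("V2", "V4"), ("V3", "V4")}

def parents(node):
    """All nodes with a directed edge into `node`."""
    return {a for (a, b) in edges if b == node}

def descendants(node):
    """All nodes reachable from `node` via a directed path (DFS)."""
    found, stack = set(), [node]
    while stack:
        current = stack.pop()
        for (a, b) in edges:
            if a == current and b not in found:
                found.add(b)
                stack.append(b)
    return found

print(parents("V4"))       # {'V2', 'V3'}
print(descendants("V1"))   # {'V2', 'V3', 'V4'}
```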
In the following, we denote random vectors with bold letters such as $\mathbf{Z}$ and random variables as $Z$. Let $\mathbf{X} = (X_1, \ldots, X_p)$ be a random vector representing the data generating process. For a graph $G$ with nodes $X_1, \ldots, X_p$, we call $(\mathbf{X}, G)$ a Bayesian network if the local Markov property holds, i.e.

$X_i \perp X_j \mid \mathrm{pa}(X_i)$

for any $X_j$ that is not a descendant of $X_i$ in $G$. Here, $X \perp Y \mid \mathbf{Z}$ denotes the conditional independence of $X$ and $Y$ given $\mathbf{Z}$. In that case, we can deduce additional conditional independencies for $\mathbf{X}$ from the graph $G$ using the concept of d-separation [12]. For a Bayesian network $(\mathbf{X}, G)$ it then holds that $X_i \perp X_j \mid \mathbf{S}$ if $X_i$ and $X_j$ are d-separated by $\mathbf{S}$ in $G$. On the other hand, if there is a graph $G$ such that $X_i \perp X_j \mid \mathbf{S}$ implies that $X_i$ and $X_j$ are d-separated given $\mathbf{S}$ in $G$, then $\mathbf{X}$ is called faithful with respect to $G$. As multiple graphs can encode the same d-separations, this graph $G$ is in general not unique.
To aid intuition, assume that $\mathbf{X}$ has a joint density $f$. Then $X_i \perp X_j \mid \mathbf{S}$ can be characterized by

$f(x_i \mid X_j = x_j, \mathbf{S} = \mathbf{s}) = f(x_i \mid \mathbf{S} = \mathbf{s}),$

where $f(x_i \mid \mathbf{Z} = \mathbf{z})$ denotes the conditional density function of $X_i$ given $\mathbf{Z} = \mathbf{z}$. Thus, if we already know $\mathbf{S}$, then $X_j$ does not provide additional information on $X_i$. Assume we are interested in which variable in $\{X_j, \mathbf{X}_{\mathbf{S}}\}$ causes the variable $X_i$ to be out of the specification limits. Then we know that the root causes can be found within $\mathbf{S}$.
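The density characterization above can be verified exactly on a small discrete example: in a chain $X \to S \to Y$, additionally conditioning on $X$ does not change the conditional distribution of $Y$ given $S$. All probability tables below are made up for illustration:

```python
import itertools

# Joint distribution of a binary chain X -> S -> Y:
# P(x, s, y) = P(x) * P(s | x) * P(y | s). Values are arbitrary.
p_x = {0: 0.5, 1: 0.5}
p_s_given_x = {(0, 0): 0.8, (1, 0): 0.2, (0, 1): 0.3, (1, 1): 0.7}  # key (s, x)
p_y_given_s = {(0, 0): 0.9, (1, 0): 0.1, (0, 1): 0.4, (1, 1): 0.6}  # key (y, s)

def joint(x, s, y):
    return p_x[x] * p_s_given_x[(s, x)] * p_y_given_s[(y, s)]

def cond_y(y, s, x=None):
    """P(Y=y | S=s) if x is None, else P(Y=y | S=s, X=x)."""
    xs = [x] if x is not None else [0, 1]
    num = sum(joint(xv, s, y) for xv in xs)
    den = sum(joint(xv, s, yv) for xv in xs for yv in (0, 1))
    return num / den

# Y is independent of X given S: conditioning on X changes nothing.
for s, x in itertools.product((0, 1), (0, 1)):
    assert abs(cond_y(1, s, x) - cond_y(1, s)) < 1e-12
print("f(y | x, s) = f(y | s) holds for the chain X -> S -> Y")
```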
B. Graph Learning with Structural Equation Models
While the PC algorithm is the classic approach for deriving a Causal Bayesian Network, recent research has focused on identifying it using acyclic SEMs [10], [13]–[15]. They assume that there exists a permutation $\pi_0$ of $(1, \ldots, p)$ and functions $\{f_\ell,\ \ell = 1, \ldots, p\}$, such that

$X_\ell = f_\ell(X_{\ell_1}, \ldots, X_{\ell_v}, \varepsilon_\ell), \quad \ell = 1, \ldots, p, \qquad (1)$

where $\pi_0(\ell_k) < \pi_0(\ell)$ for all $k = 1, \ldots, v$ and $\varepsilon_1, \ldots, \varepsilon_p$ are i.i.d. noise terms. As the estimation of $f_\ell$ in Equation (1) is difficult in high dimensions, one typically restricts the function class and the distribution of the noise terms. In this work, we assume that the functions follow the additive form

$f_\ell(X_{\ell_1}, \ldots, X_{\ell_v}, \varepsilon_\ell) = c_\ell + \sum_{k:\ \pi_0(k) < \pi_0(\ell)} f_{k,\ell}(X_k) + \varepsilon_\ell, \qquad (2)$

where $\varepsilon_\ell \sim \mathcal{N}(0, \sigma_\ell)$ and $c_\ell \in \mathbb{R}$. To ensure the uniqueness of the $f_{k,\ell}$ and without loss of generality, we set $\mathbb{E}(X_\ell) = 0$
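As an illustration, data from an additive SEM of the form of Equation (2) can be simulated as follows; the causal order, the edge functions and the noise scales are hypothetical choices for this sketch, not those of the application in Section V:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1000

# Hypothetical causal order X1 -> X2 -> X3 with nonlinear, additive
# edge functions and Gaussian noise, as in Equation (2).
eps = rng.normal(scale=[1.0, 0.5, 0.5], size=(n, 3))
x1 = eps[:, 0]
x2 = np.sin(x1) + eps[:, 1]              # f_{1,2}(x) = sin(x)
x3 = x1 ** 2 + np.tanh(x2) + eps[:, 2]   # f_{1,3}(x) = x^2, f_{2,3}(x) = tanh(x)

data = np.column_stack([x1, x2, x3])
print(data.shape)
```

A method such as CAM would then estimate the causal order and the functions $f_{k,\ell}$ from `data` alone; the nonlinearity of $f_{1,3}$ is exactly the kind of relationship a linear-Gaussian approach would miss.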