SELF-VALIDATED PHYSICS-EMBEDDING NETWORK: A
GENERAL FRAMEWORK FOR INVERSE MODELLING
Ruiyuan Kang
Department of Mechanical Engineering
Khalifa University
Abu Dhabi, UAE
ruiyuan.kang@ku.ac.ae
Dimitrios C. Kyritsis
Department of Mechanical Engineering, RICH Center
Khalifa University
Abu Dhabi, UAE
dimitrios.kyritsis@ku.ac.ae
Panos Liatsis
Department of Electrical Engineering and Computer Science
Khalifa University
Abu Dhabi, UAE
panos.liatsis@ku.ac.ae
ABSTRACT
Physics-based inverse modeling techniques are typically restricted to particular research fields,
whereas popular machine-learning-based ones are too data-dependent to guarantee the physical
compatibility of the solution. In this paper, Self-Validated Physics-Embedding Network (SVPEN),
a general neural network framework for inverse modeling is proposed. As its name suggests, the
embedded physical forward model ensures that any solution that successfully passes its validation is
physically reasonable. SVPEN operates in two modes: (a) the inverse function mode offers rapid state estimation, as in conventional supervised learning, and (b) the optimization mode offers a way to iteratively correct estimates that fail the validation process. Furthermore, the optimization mode provides SVPEN with reconfigurability, i.e., components such as neural networks, physical models, and error calculations can be replaced at will to solve a series of distinct inverse problems without pretraining. More than ten case studies in two highly nonlinear and entirely distinct applications, molecular absorption spectroscopy and turbofan cycle analysis, demonstrate the generality, physical reliability, and reconfigurability of SVPEN. More importantly, SVPEN offers a solid foundation for using existing physical models within the context of AI, so as to strike a balance between data-driven and physics-driven models.
Keywords Inverse Modelling · Physics Embedding · Neural Network
1 Introduction
A forward problem can be defined as finding the results (observations) $y$ from given causes (states) $x$. The solving process, which formulates a physical model $F$ that maps states to observations, is called forward modelling, i.e., $y = F(x)$, as demonstrated by the blue line in Fig. 1. Conversely, the counterpart problem of ascertaining the states from measured observations is named the inverse problem and, similarly, its solving process is named inverse modelling. Forward and inverse problems are opposite sides of scientific and engineering problems that can be seen everywhere. For example, two forward problems can be defined as follows: (a) given the concentration and temperature of a gas cloud of a certain molecule, calculate its absorption spectrum, a task often termed Molecular Absorption Spectroscopy (MAS) simulation; (b) given the cycle parameters and component parameters of an aeroengine, calculate its performance parameters, such as thrust and thrust-specific fuel consumption, a task often termed aeroengine performance simulation. Their corresponding inverse problems can be defined as: (a) from the
measured molecular absorption spectrum, ascertain the concentration and temperature of the gas cloud of a specific molecule; (b) given the required aeroengine performance, ascertain the cycle and component parameters that satisfy the performance requirements, a task usually termed cycle analysis in aeroengine design.
Figure 1: A high-level representation of forward and inverse modelling methods
In general, the methods for solving the inverse problem can be categorized into two types: (a) the inverse function method, shown by the first red line in Fig. 1, in which the inverse process itself is modelled so that the state corresponding to a given observation can be calculated directly, i.e., $x = G(y) = F^{-1}(y)$; (b) the optimization method, shown by the second red line in Fig. 1, in which the state corresponding to the given observation is ascertained by iterating an estimated state $\hat{x}$ until a sufficiently similar observation is produced by the forward model, i.e., iterate $\hat{x}$ until $F(\hat{x}) \approx y$.
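To make the two strategies concrete, the following minimal sketch contrasts them on a toy forward model; the cubic $F$, learning rate, and tolerance below are our own illustrative choices, not from the paper:

```python
import numpy as np

def F(x):
    """Toy forward model mapping a state x to an observation y."""
    return x ** 3 + x  # strictly increasing, so the inverse is unique

# (a) Inverse function method: a direct model G ~ F^{-1} maps y to x in one shot.
def G(y):
    # For this toy F we can invert exactly by root finding; in practice a
    # trained ML model plays this role.
    roots = np.roots([1.0, 0.0, 1.0, -y])
    return roots[np.abs(roots.imag) < 1e-9].real[0]

# (b) Optimization method: iterate x_hat until F(x_hat) is close enough to y.
def invert_by_optimization(y, x_hat=0.0, lr=5e-3, tol=1e-8, max_iter=10_000):
    for _ in range(max_iter):
        residual = F(x_hat) - y
        if abs(residual) < tol:
            break
        # gradient of (F(x) - y)^2 with respect to x
        x_hat -= lr * 2.0 * residual * (3.0 * x_hat ** 2 + 1.0)
    return x_hat

y_obs = F(1.7)
print(G(y_obs), invert_by_optimization(y_obs))  # both recover x ~ 1.7
```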
Substantial physics-knowledge-based solutions, of both the inverse-function and optimization varieties, have been developed for applications in specific science and engineering fields [1-6]; however, they require substantial background knowledge and expertise. Moreover, the application-specific details constrain their generalization to other fields.
Conversely, machine-learning-based solutions have recently been gaining popularity, owing to their powerful computational capability, high flexibility, and reduced requirements for background knowledge. Within the inverse function method, a Machine Learning (ML) model $G(y)$ is trained to approximate the inverse function $F^{-1}(y)$ of the physical forward model [7-13], using the information conveyed by a large amount of experimental or simulated data. Accordingly, for a given observation $y$, the ML model is expected to give an efficient estimate of the corresponding state.
However, such state estimates may be biased, since inverse problems are usually rank-deficient and ill-posed [14], meaning that many states may exist for a specific observation. Fortunately, the forward problems are always well-posed [15]. Therefore, within the optimization method, ML models are usually trained as surrogate models $\hat{F}(x)$ of the physical forward models $F(x)$ [16-18], which are then used to provide an accurate estimate of the observation for a given state. Optimization methods are then utilized to iterate the state estimate $\hat{x}$ until the estimated observation $\hat{y}$ from $\hat{F}(x)$ is sufficiently similar to the actual observation $y$. The introduction of the surrogate model brings two core benefits. First, it provides a rapid mapping from the space of states to the space of observations, which in turn supports applications that would otherwise require time-consuming computations, such as Computational Fluid Dynamics (CFD) [19]. Second, the inherent differentiability of the network assures the applicability of gradient-based optimization algorithms [20].
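As a minimal sketch of this surrogate-based pattern (the network sizes and optimizer settings are our own illustrative choices; the surrogate would first be fitted to state-observation pairs generated by the physical model):

```python
import torch
import torch.nn as nn

# Surrogate F_hat of the physical forward model: state (dim 3) -> observation (dim 8).
F_hat = nn.Sequential(nn.Linear(3, 32), nn.Tanh(), nn.Linear(32, 8))
# ... pretrain F_hat on (x, F(x)) pairs from the physical model, then freeze it:
for p in F_hat.parameters():
    p.requires_grad_(False)

def invert_with_surrogate(y, steps=500, lr=1e-2):
    """Iterate the state estimate x_hat until F_hat(x_hat) matches y,
    exploiting the surrogate's differentiability for gradient descent."""
    x_hat = torch.zeros(3, requires_grad=True)
    opt = torch.optim.Adam([x_hat], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = torch.mean((F_hat(x_hat) - y) ** 2)
        loss.backward()
        opt.step()
    return x_hat.detach()
```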
The main issue with the aforementioned methods is that they are purely data-driven. Since there is no guarantee that the dataset reflects the true distributions of the state and observation spaces, the resulting ML models cannot be assured to learn a sufficiently accurate mapping between these two spaces [21]. Accordingly, ML models may not provide accurate estimates for noisy data or samples outside the dataset, i.e., they exhibit weak robustness [22, 23] and generalization [24]. The consequence is that the reliability of the ML estimates cannot be guaranteed, thus limiting the application of purely data-driven methods to the solution of inverse problems.
An alleviation of this problem is to inject physical knowledge into the modelling framework. Examples of such approaches include the use of PDE-based regularization [25-28] to reflect the physical conservation laws the model should obey, or the use of results from complex physical models to constrain the training direction of the model [29, 30], so that a deeper correlation between the ML and physical models can be built. Nevertheless, no one can yet assure that data-driven models act exactly as physical models, especially in corner cases or unseen cases. A thorough solution, which can also be regarded as injecting physical knowledge, is to use the forward model of the physical process directly instead of a data-driven surrogate, while using a network to guide the direction of the state search [31], as is often done in meta-learning [32, 33]. This kind of method still falls within the scope of the optimization method: if the embedded physical model is believed to be trustworthy, the final state estimate can be considered reliable. However, the network needs to be pretrained on the designated physical model, which means that a mismatch between the network and the physical model during testing can lead to incorrect search directions, entrapment in local optima, and, eventually, failure to converge.
In summary, inverse function methods allow for efficient estimates, while optimization methods, coupled with physical forward models, provide effective estimates. However, all of these ML-based methods require pre-training, which translates to requirements for preparation time and resources prior to model deployment and, more importantly, raises concerns regarding the generalization performance of the model on the testing dataset.
In this work, we propose a general-purpose, neural-network-based framework for the solution of inverse problems, termed Self-Validated Physics-Embedding Network (SVPEN). The principle of SVPEN is to couple the inverse function and optimization methods by embedding physical forward models into neural networks. By utilizing the physical forward model to validate the quality of the estimated state, the system ensures that final state estimates are physically reasonable, while the two problem-solving modes, inverse function and optimization, lend SVPEN their respective advantages of efficient and effective state estimation. Moreover, SVPEN can be deployed without a pre-collected dataset and pre-training, and its structure can be adapted to update the underlying physical/ML models. In order to demonstrate the advantages of SVPEN, this contribution considers the two diverse inverse problems introduced above, namely, retrieval of temperature and gas concentration from MAS, and aeroengine cycle analysis from performance requirements (code repo: https://github.com/RalphKang/SVPEN_1.0). The results demonstrate the high efficiency, effectiveness, flexibility, and adaptivity of the proposed framework.
2 SVPEN
The skeleton of the Self-Validated Physics-Embedding Network (SVPEN) is demonstrated in Fig. 2. In general, SVPEN is constituted of the inverse function and optimization modes. Once an observation is fed to SVPEN, the inverse function mode is first used to give a quick estimate of the state; meanwhile, an estimation error that reflects the quality of the state estimate is also provided. The estimated state is accepted when the estimation error is smaller than an error threshold $\varepsilon$; otherwise, the system automatically switches to the optimization mode. In the optimization mode, the estimation error is gradually reduced below the error threshold $\varepsilon$ by performing gradient descent on the estimated state. The control variable, i.e., the estimation error, is determined by the physical forward model embedded in both modes, thereby assuring that an accepted state estimate is physically reasonable. In this section, we introduce the design of SVPEN in detail: in section 2.1, we present the design of the inverse function mode; in section 2.2, attention shifts to the optimization mode; and finally, in section 2.3, we explain the cooperation of these two modes within the structure of SVPEN and the associated benefits.
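A minimal sketch of this two-mode control flow is given below; the function and argument names are our own, the components are placeholders, and the forward model is assumed differentiable here (section 2.2 treats the non-differentiable case):

```python
import torch

def svpen_estimate(y, estimator, forward_model, error_fn,
                   eps=1e-3, lr=1e-2, max_iter=1000):
    """Accept the one-shot inverse-function estimate if it passes physical
    validation; otherwise switch to the optimization mode and refine the
    state by gradient descent until e < eps or max_iter is reached."""
    # Inverse function mode: quick estimate plus physical validation.
    x_hat = estimator(y).detach().requires_grad_(True)
    e = error_fn(y, forward_model(x_hat), x_hat)
    if e.item() < eps:
        return x_hat.detach()
    # Optimization mode: gradient descent on the estimated state itself.
    opt = torch.optim.Adam([x_hat], lr=lr)
    for _ in range(max_iter):
        opt.zero_grad()
        e = error_fn(y, forward_model(x_hat), x_hat)
        if e.item() < eps:
            break
        e.backward()
        opt.step()
    return x_hat.detach()
```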
2.1 Inverse function mode
The structure of the inverse function mode is schematically shown in Fig. 3. It is mainly constituted of three computational components: the state estimator $G_1$, the physical forward model $F$, and the error calculation component $E$. The state estimator is an ML model that takes the observation $y$ as input and transforms it into an estimated state $\hat{x}$. The acquired $\hat{x}$ is then fed to $F$ to generate an estimated observation $\hat{y}$. The difference between the estimated and given observations, and possibly prior requirements on the state, can be used to calculate an error $e$ that assesses the quality of $\hat{x}$. The whole process can be expressed as Eqs. 1-3.
$$\hat{x} = G_1(y) \quad (1)$$
$$\hat{y} = F(\hat{x}) \quad (2)$$
$$e = E(y, \hat{y}, \hat{x}) \quad (3)$$
According to their functionalities, the physical forward model and the error calculation component can be wrapped together into an integrated module, termed the physical evaluation module, which assesses the quality of the state estimate from a physical perspective.
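A minimal sketch of this pipeline (Eqs. 1-3) is shown below; the toy forward model and the plain mean-squared discrepancy are our own stand-ins for the application-specific components:

```python
import torch
import torch.nn as nn

# State estimator G1 (Eq. 1): a differentiable network mapping y -> x_hat.
G1 = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 2))

def F(x_hat):
    """Physical forward model (Eq. 2). Toy decaying-exponential placeholder;
    in the paper this is e.g. an MAS simulator or an aeroengine model."""
    grid = torch.linspace(0.0, 1.0, 64)
    return x_hat[..., :1] * torch.exp(-x_hat[..., 1:] * grid)

def E(y, y_hat, x_hat):
    """Error calculation component (Eq. 3); priors on x_hat can be added
    as regularization terms (section 2.1.2)."""
    return torch.mean((y - y_hat) ** 2)

y = torch.randn(64)      # a given observation
x_hat = G1(y)            # Eq. 1
y_hat = F(x_hat)         # Eq. 2
e = E(y, y_hat, x_hat)   # Eq. 3
```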
2.1.1 State estimator
One can recognize that the functionality of the state estimator is no different from that of the ML models in the inverse function method [7-9, 20, 25, 34], i.e., it is expected to transform observations into accurate estimates of states. In order to realize this purpose, the state estimator must be trained to map observations to states before its deployment. As in the inverse function method, the training is done in a supervised way, utilizing observation-state pairs collected from physical experiments or simulation studies. This supervised training process is herein termed pretraining, in order to differentiate it from the online optimization process of the optimization mode (section 2.2).

Figure 2: The skeleton of SVPEN

Figure 3: The schematic of the inverse function mode

It is noteworthy that the state estimator must be a differentiable ML model. Indeed, this requirement is the foundation of the optimization mode, which will be illustrated in section 2.2. Accordingly, a neural-network-type model is used by default in SVPEN, owing to its natural differentiability, flexible structure, and extensive research foundation. For demonstration purposes, two architectures, VGG-13, a classical CNN architecture, and a multi-layer perceptron (MLP), are respectively used in the two application cases presented in sections 3.1 and 3.2.
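As an illustration of such a network-type estimator, a minimal MLP of the kind used in the second application could be pretrained on observation-state pairs as follows (the layer sizes, dimensions, and training settings are our own placeholders):

```python
import torch
import torch.nn as nn

# Minimal MLP state estimator G1: observation (dim 16) -> state (dim 4).
G1 = nn.Sequential(
    nn.Linear(16, 64), nn.ReLU(),
    nn.Linear(64, 64), nn.ReLU(),
    nn.Linear(64, 4),
)

def pretrain(model, obs, states, epochs=200, lr=1e-3):
    """Supervised pretraining on observation-state pairs collected from
    experiments or simulations."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(model(obs), states)
        loss.backward()
        opt.step()
    return model

# Usage with synthetic pairs standing in for experimental/simulated data:
obs, states = torch.randn(256, 16), torch.randn(256, 4)
pretrain(G1, obs, states)
```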
2.1.2 Physical evaluation module
As previously mentioned, the physical evaluation module consists of two operations. The module takes the given observation and the estimated state, and feeds back a physics-determined error $e$ that assesses the quality of the state estimate. The crucial aspect of $e$ is that it reflects the discrepancy between the given observation and the estimated observation corresponding to the estimated state. In general, only physical models of the forward process are recommended for mapping the estimated state to its corresponding observation, since the mapping provided by such models is regarded as physically trustworthy. The discrepancy $e_y$ between the given and estimated observations can be calculated with a variety of distance functions, such as the Manhattan distance, the Euclidean distance, or a monotonic nonlinear transformation of a distance function, an example of which is used in section 3.1.1.

However, there are instances where using $e_y$ alone is insufficient, since the inverse problem is often rank-deficient and ill-posed [14], i.e., multiple state estimates may lead to similar discrepancies that are smaller than the error threshold $\varepsilon$. To tackle this issue, prior-knowledge regularization of the observations and states (shown as the dashed line in Fig. 3) can be utilized as part of $e$ to further constrain the state estimate. For example, in practice, the feasible domain of the state is usually known: for MAS acquired from a flame and from the ordinary atmosphere, respectively, we know that the retrieved temperatures will fall in the ranges of a few thousand K and a few hundred K, respectively. Consequently, a regularization term can be set as shown in Eq. 4:
$$e_{\mathrm{reg}} = \max(\hat{x} - x_{\mathrm{max}}, 0) + \max(x_{\mathrm{min}} - \hat{x}, 0) \quad (4)$$
where $e_{\mathrm{reg}}$ is the regularization term, and $x_{\mathrm{max}}$ and $x_{\mathrm{min}}$ are the upper and lower boundaries of the feasible domain of the state, respectively. A state estimate $\hat{x}$ that exceeds the feasible domain therefore incurs extra error. Besides, traditional L1/L2 norms to constrain the magnitude of the state, or PDE-based regularizations to reflect the physical laws [35], etc., can also be applied, depending on the inverse problem to be solved. For instance, the total error $e$ can be defined as in Eq. 5:
$$e = c_0 e_y + \sum_{i=1}^{n} c_i e_{\mathrm{reg},i} \quad (5)$$
where the $c_i$ are the weights of the discrepancy/regularization terms. The choice of weights affects the evaluation of the estimated states; for example, for a problem in which no regularization priors are needed, all weights except $c_0$ are set to zero. The calculation of $e$ and of all individual discrepancy/regularization error terms is performed by the error calculation component $E$.
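A minimal sketch of such an error calculation component, combining a Euclidean-type discrepancy with the feasible-domain regularization of Eq. 4 under the weighting of Eq. 5 (the bounds and weights below are purely illustrative):

```python
import torch

def error_component(y, y_hat, x_hat, x_min, x_max, c0=1.0, c1=1.0):
    """Total error of Eq. 5: weighted discrepancy e_y plus the
    feasible-domain regularization e_reg of Eq. 4."""
    e_y = torch.mean((y - y_hat) ** 2)                    # discrepancy term
    e_reg = (torch.clamp(x_hat - x_max, min=0.0).sum()    # penalize x_hat > x_max
             + torch.clamp(x_min - x_hat, min=0.0).sum()) # penalize x_hat < x_min
    return c0 * e_y + c1 * e_reg

# Illustrative flame-spectroscopy prior: temperature in [1000 K, 3000 K],
# mole fraction in [0, 1].
x_min = torch.tensor([1000.0, 0.0])
x_max = torch.tensor([3000.0, 1.0])
```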
2.2 Optimization mode
However, as discussed above, the state estimator $G_1$ cannot be assured to always provide reasonable estimates, as it is a data-driven model. In cases where the calculated error $e$ is greater than the default error threshold $\varepsilon$, the estimated state is presumed to be far from the true state and is thus deemed unacceptable. Under such conditions, SVPEN switches to the optimization mode. The purpose of the optimization mode is to utilize the backpropagation of the error to optimize the state estimate, until the error $e$ satisfies the error threshold $\varepsilon$ or the number of iterations reaches the selected maximum $t$.
However, not all physical forward models are fully differentiable, which means that the backpropagation route from the error $e$ to the state estimator $G_1$ is interrupted, as indicated by the cross symbol in Fig. 4. Using the adjoint state method [36] or adding perturbations to every entry of the state could support the calculation of the vector of derivatives; however, these approaches require complex transformations or frequent evaluations of the physical model in every iteration, leading to increased computational requirements.
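To make the cost of the perturbation approach concrete: a central-difference gradient with respect to an $n$-dimensional state requires $2n$ evaluations of the physical forward model per iteration, as the following sketch shows (the names are our own):

```python
import numpy as np

def perturbation_grad(loss, x, h=1e-5):
    """Approximate d(loss)/dx by perturbing every entry of the state x:
    each call costs 2 * x.size evaluations of loss, and hence of the
    physical forward model inside it."""
    g = np.zeros_like(x)
    for i in range(x.size):
        dx = np.zeros_like(x)
        dx[i] = h
        g[i] = (loss(x + dx) - loss(x - dx)) / (2.0 * h)
    return g
```

For expensive simulators such as CFD, this per-iteration cost is precisely what motivates routing the gradient around the physical model instead.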
Figure 4: The dilemma of backpropagation being unavailable for a non-differentiable physical model
In this research, we exploit the differentiability of the network structure of $G_1$ by bridging the error $e$ and the state estimator $G_1$ through an additional network-type learner $G_2$ added to the original architecture, as shown in Fig. 5.