Kalman-Bucy-Informed Neural Network for System Identification

Tobias Nagel and Marco F. Huber
Abstract— Identifying parameters in a system of nonlinear,
ordinary differential equations is vital for designing a robust
controller. However, if the system is stochastic in its nature or if
only noisy measurements are available, standard optimization
algorithms for system identification usually fail. We present a
new approach that combines the recent advances in physics-
informed neural networks and the well-known achievements of
Kalman filters in order to find parameters in a continuous-time
system with noisy measurements. In doing so, our approach
allows estimating the parameters together with the mean value
and covariance matrix of the system’s state vector. We show
that the method works for complex systems by identifying the
parameters of a double pendulum.
I. INTRODUCTION
Controlling a dynamical system in a safe manner re-
quires a model that describes the system properties precisely.
Ordinary differential equations (ODEs) are often used to
satisfy this requirement. Besides setting up the corresponding
equation operators, it is also inevitable to identify the real-
valued coefficients that define the characteristics of the
system. Estimating these parameters by using measurements
is termed the “inverse problem” and can be a difficult task,
depending on the system’s complexity. This work presents a
new method that is capable of identifying unknown param-
eters in a nonlinear ODE system, based on noisy measure-
ments by using an extended Kalman-Bucy filter (EKBF) in
a machine learning framework.
For linear systems, the subspace-based state space identi-
fication methods are well established. They aim at finding a
linear state space model by using a regularized least-squares
algorithm [7]. If the system comprises nonlinear behavior,
the most straightforward solution approaches for parame-
ter identification are standard minimization techniques like
gradient-based [9] or gradient-free [1] methods. For a system
with noisy measurements, the problem becomes even more
difficult and requires incorporating stochastic moments in
the optimization. Raue et al. summarize their experiences
of fitting measurements of biological systems to their cor-
responding ODE system by maximizing a log-likelihood
function that comprises a normally distributed measurement
noise [13]. However, these methods require a numerical so-
lution of the ODE, repeatedly for each optimization iteration.
Besides being very time consuming, this approach often fails
Tobias Nagel and Marco F. Huber are with the Fraunhofer Institute for
Manufacturing Engineering and Automation IPA, Center for Cyber Cog-
nitive Intelligence (CCI), 70569 Stuttgart, Germany {tobias.nagel,
marco.huber}@ipa.fraunhofer.de
Marco F. Huber is with the Institute of Industrial Manufacturing
and Management IFF, University of Stuttgart, 70569 Stuttgart, Germany
marco.huber@ieee.org
because of the system’s nonlinearity, the noise influence or
an unstable behavior in the numerical solution [6].
A possibility to circumvent the problem with a machine
learning approach is described in [16], where a neural
network improves a Kalman filter system in order to obtain
a better state estimate. However, this does not give us the
a better state estimate. Though, this does not give us the
actual system parameter values but compensates for model
errors. In 2017, Raissi et. al. presented how physics-informed
neural networks (PINNs) can be trained by using modern
automatic differentiation frameworks [12]. The approach
utilizes deep neural networks to discover and solve nonlinear
differential equation systems. This is achieved by training a
neural network to represent an approximate solution to the
differential equation. The method also enables a parameter
search, by including the unknown parameters as additional
network weights. The concept has been applied in numerous
research fields, e.g., mechanics [8], thermodynamics [10] or
in chemical reaction equations [4]. PINNs also enable the
possibility to include stochastic behavior in the modeling
process. Recently, this has been addressed by O’Leary et al.,
who incorporate the mean value of the state and its covariance
matrix in the framework, extending it to a stochastic physics-
informed neural network (SPINN) [11]. The authors do so by
propagating the first two central moments of a state variable
through the known differential equation systems. Afterwards
a neural network is trained in order to match these estimated
central moments to measured ones. However, the authors
do not address the problem of identifying parameters in the
system. Another option is to use a Bayesian neural network
(BNN) in a PINN environment which allows an embedding
of uncertainty and, hence, the usage of stochastic differential
equations. Yang et al. use a BNN to include noisy data
in a partial differential equation problem in order to both
solve and identify the system [18]. However, BNNs are often
not capable of achieving the same approximation accuracy as
standard neural networks and are significantly more difficult
to train.
In this paper, we present a new physics-informed machine
learning approach that we call Kalman-Bucy-informed neural
network (KBINN). A Kalman-Bucy filter incorporates two
ODEs that describe the temporal evolution of the mean
value and the covariance matrix of the system’s state. In
our method, we include two neural networks that are im-
plemented in a PINN framework in order to approximate
a solution to the Kalman-Bucy equations. This allows an
implicit identification of unknown system parameters by
incorporating them into the network training. The rest of the
paper is organized as follows: In Section II, we give a short
mathematical formulation of the problem. Section III intro-
duces the extended Kalman-Bucy filter (EKBF) and gives
a short summary of neural networks. Section IV describes
the KBINN method, followed by validation experiments in
Section V. We discuss the strengths and limitations of our
method in Section VI and close the paper with a conclusion
in Section VII.

arXiv:2210.03424v1 [eess.SY] 7 Oct 2022
II. PROBLEM FORMULATION
The state space representation of a continuous-time, non-
linear, dynamic and time-variant system of order $n \in \mathbb{N}$ is
defined by means of
$$\dot{x}(t) = f(x(t), u(t), w(t), t, \theta),$$
$$y(t) = g(x(t), u(t), v(t), t), \quad (1)$$
where $f(\cdot)$ is the nonlinear ODE and $g(\cdot)$ is the measurement
function. Both are assumed to be known, except for a set of
unknown parameters. Furthermore, $x(t) \in \mathbb{R}^{n}$ with $t \geq 0$
denotes the state vector, and $u(t) \in \mathbb{R}^{p}$ and $y(t) \in \mathbb{R}^{q}$ denote
the input and output signals with dimensions $p, q \in \mathbb{N}$,
respectively. The vectors $w(t) \in \mathbb{R}^{n}$ and $v(t) \in \mathbb{R}^{q}$ denote
white process noise and white measurement noise, respec-
tively, which are both assumed to be zero-mean Gaussian
with covariance matrices $Q(t) \in \mathbb{R}^{n \times n}$ and $R(t) \in \mathbb{R}^{q \times q}$,
respectively. This induces the state $x(t)$ to be a random
variable as well. $\theta \in \mathbb{R}^{d}$ denotes a vector of $d \in \mathbb{N}$
unknown parameters of the ODE system. If we acquire noisy
measurements $\tilde{y}(t_i)$ of the system at $N$ discrete time steps
$t_i$ with $i = 1, \ldots, N$, the aim of our method is to find a
parameter vector $\theta^{*}$ that minimizes the objective function
$$\theta^{*} = \arg\min_{\theta} \sum_{i=1}^{N} \bigl(\tilde{y}(t_i) - y(t_i)\bigr)^{2}. \quad (2)$$
Standard minimization algorithms require solving the ODE
in Eq. (1) numerically at each iteration in order to minimize
Eq. (2). As mentioned in Section I, these algorithms often
fail due to the small step size imposed by the system’s
nonlinearity or the noise.
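For concreteness, this standard approach can be sketched as follows: one full numerical ODE solve per evaluation of the objective in Eq. (2). The damped oscillator, its true parameter value of 0.3, and the noise level are hypothetical stand-ins for $f$ and $\theta$, not a system from this paper:

```python
# Sketch of the standard approach implied by Eq. (2): repeatedly solve the
# ODE numerically inside a least-squares loop. System and values are
# hypothetical illustration data.
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import minimize

rng = np.random.default_rng(0)

def f(t, x, theta):
    # x = [position, velocity]; theta = unknown damping coefficient
    return [x[1], -x[0] - theta * x[1]]

# Generate noisy measurements y~(t_i) from the "true" system (theta = 0.3)
t_i = np.linspace(0.0, 10.0, 50)
true = solve_ivp(f, (0.0, 10.0), [1.0, 0.0], t_eval=t_i, args=(0.3,))
y_meas = true.y[0] + 0.05 * rng.standard_normal(t_i.size)

def objective(theta):
    # One full numerical ODE solve per objective evaluation, cf. Eq. (2)
    sol = solve_ivp(f, (0.0, 10.0), [1.0, 0.0], t_eval=t_i, args=(theta[0],))
    return np.sum((y_meas - sol.y[0]) ** 2)

result = minimize(objective, x0=[1.0], method="Nelder-Mead")
print(result.x)  # a value near the true 0.3 for this mild noise level
```

On this benign toy problem the loop succeeds, but each objective evaluation costs a full integration, and for stiff or noisy systems the inner solve is exactly where the procedure breaks down.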
III. PRELIMINARIES
In this section, we briefly introduce the EKBF, which
is based on the well-known Kalman filter, applied to the
nonlinear and continuous-time case. Afterwards, we give a
short introduction to artificial neural networks.
A. Extended Kalman-Bucy Filter
It is obvious from Eq. (1) that the state $x$ is not directly
measurable. To compensate for this, Rudolf
E. Kalman introduced in 1960 the Kalman filter first
for discrete-time, linear systems and a year later for the
continuous-time case, together with Richard S. Bucy [5].
This concept allows estimating the mean value and the co-
variance matrix of the system’s state. If the noise is assumed
to be Gaussian, the first two central moments are sufficient
to describe the state’s probability distribution exactly. If a
nonlinear system is considered, it is necessary to perform
linearizations at each time step which leads to the EKBF. It
is composed of two initial value problems that comprise the
state’s estimated mean value
$$\dot{\hat{x}}(t) = f(\hat{x}(t), u(t), 0, t) + K(t)\cdot\bigl(y(t) - g(\hat{x}(t), u(t), 0, t)\bigr) \quad (3)$$
with a known initial value $\hat{x}(0) = \hat{x}_0$ and a Kalman gain
$$K(t) = \hat{P}(t)\cdot\hat{C}^{T}(t)\cdot\hat{R}^{-1}(t) \quad (4)$$
as well as its covariance matrix
$$\dot{\hat{P}}(t) = \hat{A}(t)\hat{P}(t) + \hat{P}(t)\hat{A}^{T}(t) - \hat{P}(t)\hat{C}^{T}(t)\hat{R}^{-1}(t)\hat{C}(t)\hat{P}(t) + \hat{Q}(t) \quad (5)$$
with a known initial value $\hat{P}(0) = \hat{P}_0$. In Eq. (4) and (5),
the involved matrices are obtained by linearization of Eq. (1)
according to
$$\hat{A}(t) = \frac{\partial f(x, u, w, t)}{\partial x(t)}, \quad \hat{C}(t) = \frac{\partial g(x, u, v, t)}{\partial x(t)},$$
$$\hat{G}(t) = \frac{\partial f(x, u, w, t)}{\partial w(t)}, \quad \hat{V}(t) = \frac{\partial g(x, u, v, t)}{\partial v(t)}. \quad (6)$$
The $\hat{\cdot}$-symbol denotes that the linearization is performed
repeatedly for each new mean value $\hat{x}(t)$. This also allows
obtaining the noise covariance matrices
$$\hat{Q}(t) = \hat{G}(t)\cdot Q(t)\cdot\hat{G}^{T}(t), \quad \hat{R}(t) = \hat{V}(t)\cdot R(t)\cdot\hat{V}^{T}(t) \quad (7)$$
in Eq. (4) and (5), respectively. Note that we omitted θ
from Eq. (1) for introducing the EKBF, since the filtering
problem does not aim at identifying parameters in a system,
but only enables us to calculate the state’s mean value and its
covariance matrix in a stochastic environment. The necessity
of performing a linearization for every new state usually
leads to a considerable computing effort which lowers the
attractiveness of the EKBF for many applications. However,
the recent advances in automatic differentiation and its use
in neural networks make this method well suited to our purpose.
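As a minimal illustration of Eqs. (3)-(5), the following sketch Euler-integrates the EKBF for a hypothetical scalar system $\dot{x} = -x^3 + w$, $y = x + v$ (all values invented for the example). For this system the Jacobians of Eq. (6) reduce to $\hat{A} = -3\hat{x}^2$ and $\hat{C} = \hat{G} = \hat{V} = 1$:

```python
# Euler-discretized sketch of the EKBF equations (3)-(5) for a hypothetical
# scalar system dx = -x^3 dt + dw, y = x + v. All constants are
# illustration values, not taken from the paper.
import numpy as np

rng = np.random.default_rng(1)
dt, T = 1e-3, 5.0
Q, R = 0.01, 0.04            # process / measurement noise covariances
x, x_hat, P = 1.5, 0.0, 1.0  # true state, filter mean, filter covariance

for _ in range(int(T / dt)):
    # Simulate the true system and a noisy measurement
    x += -x**3 * dt + np.sqrt(Q * dt) * rng.standard_normal()
    y = x + np.sqrt(R) * rng.standard_normal()

    # Relinearize around the current estimate, Eq. (6): A_hat = -3 x_hat^2
    A_hat = -3.0 * x_hat**2

    # Kalman gain, Eq. (4), with C = 1, then Euler steps of Eqs. (3) and (5)
    K = P / R
    x_hat += (-x_hat**3 + K * (y - x_hat)) * dt
    P += (2.0 * A_hat * P - P**2 / R + Q) * dt

print(x_hat, P)  # estimate approaches the true state; P shrinks toward 0
```

Note how the linearization point is updated at every step, which is the repeated effort mentioned above; in the KBINN this per-step linearization is handed to automatic differentiation.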
B. Neural Networks
A neural network is a type of machine learning algorithm
that maps an input signal of dimension $\iota$ to an output signal
of dimension $\kappa$ by approximating a desired function. In its
simplest form, it consists of three layers: the input layer
comprises $\iota$ neurons and does not perform any transformation
but only distributes the input signal to the successive layer.
The hidden
layer comprises a variable count of neurons. Each performs
a weighted, nonlinear transformation by means of
$$o = \sigma\left(\sum_{k=1}^{l} i_k w_k + w_0\right) = \sigma\bigl(i^{T}\cdot w\bigr). \quad (8)$$
Here, $i = [1, i_1, \ldots, i_l]^{T} \in \mathbb{R}^{l+1}$ denotes the output of the
previous layer with $l$ neurons and $w = [w_0, w_1, \ldots, w_l]^{T} \in \mathbb{R}^{l+1}$
denotes a weighting vector. If the neural network is
used to approximate a nonlinear behavior, the activation
function $\sigma(\cdot): \mathbb{R} \to \mathbb{R}$ needs to be of nonlinear nature as
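The transformation in Eq. (8) translates directly into code: the bias $w_0$ is absorbed into the weight vector by prepending a constant 1 to the previous layer's output. The weights, inputs, and choice of $\tanh$ below are arbitrary illustration values:

```python
# Direct transcription of the neuron transformation in Eq. (8).
# Weights and inputs are arbitrary illustration data.
import numpy as np

def neuron(i_prev, w, sigma=np.tanh):
    # i = [1, i_1, ..., i_l]^T,  w = [w_0, w_1, ..., w_l]^T
    i = np.concatenate(([1.0], i_prev))
    return sigma(i @ w)

i_prev = np.array([0.5, -1.0, 2.0])  # outputs of a layer with l = 3 neurons
w = np.array([0.1, 0.4, -0.2, 0.3])  # [w_0, w_1, w_2, w_3]
o = neuron(i_prev, w)                # tanh(0.1 + 0.2 + 0.2 + 0.6) = tanh(1.1)
```

A hidden layer is simply this map applied with one weight vector per neuron, i.e., a matrix-vector product followed by the elementwise nonlinearity.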