Kalman-Bucy-Informed Neural Network for System Identification

Tobias Nagel and Marco F. Huber
Abstract— Identifying parameters in a system of nonlinear,
ordinary differential equations is vital for designing a robust
controller. However, if the system is stochastic in its nature or if
only noisy measurements are available, standard optimization
algorithms for system identification usually fail. We present a
new approach that combines the recent advances in physics-
informed neural networks and the well-known achievements of
Kalman filters in order to find parameters in a continuous-time
system with noisy measurements. In doing so, our approach
allows estimating the parameters together with the mean value
and covariance matrix of the system’s state vector. We show
that the method works for complex systems by identifying the
parameters of a double pendulum.
I. INTRODUCTION
Controlling a dynamical system in a safe manner re-
quires a model that describes the system properties precisely.
Ordinary differential equations (ODEs) are often used to
satisfy this requirement. Besides setting up the corresponding
equation operators, it is also inevitable to identify the real-
valued coefficients that define the characteristics of the
system. Estimating these parameters by using measurements
is termed the “inverse problem” and can be a difficult task,
depending on the system’s complexity. This work presents a
new method that is capable of identifying unknown param-
eters in a nonlinear ODE system, based on noisy measure-
ments by using an extended Kalman-Bucy filter (EKBF) in
a machine learning framework.
For linear systems, the subspace-based state space identi-
fication methods are well established. They aim at finding a
linear state space model by using a regularized least-squares
algorithm [7]. If the system comprises nonlinear behavior,
the most straightforward solution approaches for parame-
ter identification are standard minimization techniques like
gradient-based [9] or gradient-free [1] methods. For a system
with noisy measurements, the problem becomes even more
difficult and requires incorporating stochastic moments in
the optimization. Raue et al. summarize their experiences
of fitting measurements of biological systems to their cor-
responding ODE system by maximizing a log-likelihood
function that comprises a normally distributed measurement
noise [13]. However, these methods require a numerical so-
lution of the ODE, repeatedly for each optimization iteration.
Besides being very time consuming, this approach often fails
Tobias Nagel and Marco F. Huber are with the Fraunhofer Institute for
Manufacturing Engineering and Automation IPA, Center for Cyber Cog-
nitive Intelligence (CCI), 70569 Stuttgart, Germany {tobias.nagel,
marco.huber}@ipa.fraunhofer.de
Marco F. Huber is with the Institute of Industrial Manufacturing
and Management IFF, University of Stuttgart, 70569 Stuttgart, Germany
marco.huber@ieee.org
because of the system’s nonlinearity, the noise influence or
an unstable behavior in the numerical solution [6].
A possibility to circumvent the problem with a machine
learning approach is described in [16], where a neural
network improves a Kalman filter system in order to obtain
a better state estimate. However, this does not give us the
a better state estimate. Though, this does not give us the
actual system parameter values but compensates for model
errors. In 2017, Raissi et. al. presented how physics-informed
neural networks (PINNs) can be trained by using modern
automatic differentiation frameworks [12]. The approach
utilizes deep neural networks to discover and solve nonlinear
differential equation systems. This is achieved by training a
neural network to represent an approximate solution to the
differential equation. The method also enables a parameter
search, by including the unknown parameters as additional
network weights. The concept has been applied in numerous
research fields, e.g., mechanics [8], thermodynamics [10] or
in chemical reaction equations [4]. PINNs also enable the
possibility to include stochastic behavior in the modeling
process. Recently, this has been addressed by O’Leary et al.,
who incorporate the mean value of the state and its covariance
matrix in the framework, extending it to a stochastic physics-
informed neural network (SPINN) [11]. The authors do so by
propagating the first two central moments of a state variable
through the known differential equation systems. Afterwards
a neural network is trained in order to match these estimated
central moments to measured ones. However, the authors
do not address the problem of identifying parameters in the
system. Another option is to use a Bayesian neural network
(BNN) in a PINN environment which allows an embedding
of uncertainty and, hence, the usage of stochastic differential
equations. Yang et al. use a BNN to include noisy data
in a partial differential equation problem in order to both
solve and identify the system [18]. However, BNNs are often
not capable of achieving the same approximation accuracy as
standard neural networks and are significantly more difficult
to train.
In this paper, we present a new physics-informed machine
learning approach that we call Kalman-Bucy-informed neural
network (KBINN). A Kalman-Bucy filter incorporates two
ODEs that describe the temporal evolution of the mean
value and the covariance matrix of the system’s state. In
our method, we include two neural networks that are im-
plemented in a PINN framework in order to approximate
a solution to the Kalman-Bucy equations. This allows an
implicit identification of unknown system parameters by
incorporating them into the network training. The rest of the
paper is organized as follows: In Section II, we give a short
mathematical formulation of the problem. Section III intro-
duces the extended Kalman-Bucy filter (EKBF) and gives
a short summary of neural networks. Section IV describes
the KBINN method, followed by validation experiments in
Section V. We discuss the strengths and limitations of our
method in Section VI and close the paper with a conclusion
in Section VII.

arXiv:2210.03424v1 [eess.SY] 7 Oct 2022
II. PROBLEM FORMULATION
The state space representation of a continuous-time, non-
linear, dynamic and time-variant system of order $n \in \mathbb{N}$ is
defined by means of
$$\dot{x}(t) = f(x(t), u(t), w(t), t, \theta),$$
$$y(t) = g(x(t), u(t), v(t), t), \quad (1)$$
where $f(\cdot)$ is the nonlinear ODE and $g(\cdot)$ is the measurement
function. Both are assumed to be known, except for a set of
unknown parameters. Furthermore, $x(t) \in \mathbb{R}^{n}$ with $t \geq 0$
denotes the state vector, and $u(t) \in \mathbb{R}^{p}$ and $y(t) \in \mathbb{R}^{q}$ denote
the input and output signals with dimensions $p, q \in \mathbb{N}$,
respectively. The vectors $w(t) \in \mathbb{R}^{n}$ and $v(t) \in \mathbb{R}^{q}$ denote
white process noise and white measurement noise, respec-
tively, which are both assumed to be zero-mean Gaussian
with covariance matrices $Q(t) \in \mathbb{R}^{n \times n}$ and $R(t) \in \mathbb{R}^{q \times q}$,
respectively. This induces the state $x(t)$ to be a random
variable as well. $\theta \in \mathbb{R}^{d}$ denotes a vector of $d \in \mathbb{N}$
unknown parameters of the ODE system. If we acquire noisy
measurements $\tilde{y}(t_i)$ of the system at $N$ discrete time steps
$t_i$ with $i = 1, \ldots, N$, the aim of our method is to find a
parameter vector $\theta^{*}$ that minimizes the objective function
$$\theta^{*} = \arg\min_{\theta} \sum_{i=1}^{N} \bigl(\tilde{y}(t_i) - y(t_i)\bigr)^{2}. \quad (2)$$
Standard minimization algorithms require solving the ODE
in Eq. (1) numerically at each iteration in order to minimize
Eq. (2). As mentioned in Section I, these algorithms often
fail due to the small step size imposed by the system’s
nonlinearity or the noise.
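For concreteness, this standard approach can be sketched as follows: one full numerical ODE solve per evaluation of the objective in Eq. (2). The damped oscillator, its true parameter value of 0.3, and the noise level are hypothetical stand-ins for $f$ and $\theta$, not a system from this paper:

```python
# Sketch of the standard approach implied by Eq. (2): repeatedly solve the
# ODE numerically inside a least-squares loop. System and values are
# hypothetical illustration data.
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import minimize

rng = np.random.default_rng(0)

def f(t, x, theta):
    # x = [position, velocity]; theta = unknown damping coefficient
    return [x[1], -x[0] - theta * x[1]]

# Generate noisy measurements y~(t_i) from the "true" system (theta = 0.3)
t_i = np.linspace(0.0, 10.0, 50)
true = solve_ivp(f, (0.0, 10.0), [1.0, 0.0], t_eval=t_i, args=(0.3,))
y_meas = true.y[0] + 0.05 * rng.standard_normal(t_i.size)

def objective(theta):
    # One full numerical ODE solve per objective evaluation, cf. Eq. (2)
    sol = solve_ivp(f, (0.0, 10.0), [1.0, 0.0], t_eval=t_i, args=(theta[0],))
    return np.sum((y_meas - sol.y[0]) ** 2)

result = minimize(objective, x0=[1.0], method="Nelder-Mead")
print(result.x)  # a value near the true 0.3 for this mild noise level
```

On this benign toy problem the loop succeeds, but each objective evaluation costs a full integration, and for stiff or noisy systems the inner solve is exactly where the procedure breaks down.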
III. PRELIMINARIES
In this section, we briefly introduce the EKBF, which
is based on the well-known Kalman filter, applied to the
nonlinear and continuous-time case. Afterwards, we give a
short introduction to artificial neural networks.
A. Extended Kalman-Bucy Filter
It is obvious from Eq. (1) that the state $x$ is not directly
measurable. To compensate for this, Rudolf
E. Kalman introduced in 1960 the Kalman filter first
for discrete-time, linear systems and a year later for the
continuous-time case, together with Richard S. Bucy [5].
This concept allows estimating the mean value and the co-
variance matrix of the system’s state. If the noise is assumed
to be Gaussian, the first two central moments are sufficient
to describe the state’s probability distribution exactly. If a
nonlinear system is considered, it is necessary to perform
linearizations at each time step which leads to the EKBF. It
is composed of two initial value problems that comprise the
state’s estimated mean value
$$\dot{\hat{x}}(t) = f(\hat{x}(t), u(t), 0, t) + K(t)\cdot\bigl(y(t) - g(\hat{x}(t), u(t), 0, t)\bigr) \quad (3)$$
with a known initial value $\hat{x}(0) = \hat{x}_0$ and a Kalman gain
$$K(t) = \hat{P}(t)\cdot\hat{C}^{T}(t)\cdot\hat{R}^{-1}(t) \quad (4)$$
as well as its covariance matrix
$$\dot{\hat{P}}(t) = \hat{A}(t)\hat{P}(t) + \hat{P}(t)\hat{A}^{T}(t) - \hat{P}(t)\hat{C}^{T}(t)\hat{R}^{-1}(t)\hat{C}(t)\hat{P}(t) + \hat{Q}(t) \quad (5)$$
with a known initial value $\hat{P}(0) = \hat{P}_0$. In Eq. (4) and (5),
the involved matrices are obtained by linearization of Eq. (1)
according to
$$\hat{A}(t) = \frac{\partial f(x, u, w, t)}{\partial x(t)}, \quad \hat{C}(t) = \frac{\partial g(x, u, v, t)}{\partial x(t)},$$
$$\hat{G}(t) = \frac{\partial f(x, u, w, t)}{\partial w(t)}, \quad \hat{V}(t) = \frac{\partial g(x, u, v, t)}{\partial v(t)}. \quad (6)$$
The $\hat{\cdot}$-symbol denotes that the linearization is performed
repeatedly for each new mean value $\hat{x}(t)$. This also allows
obtaining the noise covariance matrices
$$\hat{Q}(t) = \hat{G}(t)\cdot Q(t)\cdot\hat{G}^{T}(t), \quad \hat{R}(t) = \hat{V}(t)\cdot R(t)\cdot\hat{V}^{T}(t) \quad (7)$$
in Eq. (4) and (5), respectively. Note that we omitted θ
from Eq. (1) for introducing the EKBF, since the filtering
problem does not aim at identifying parameters in a system,
but only enables us to calculate the state’s mean value and its
covariance matrix in a stochastic environment. The necessity
of performing a linearization for every new state usually
leads to a considerable computing effort which lowers the
attractiveness of the EKBF for many applications. However,
the recent advances in automatic differentiation and its use
in neural networks make this method well suited to our purpose.
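As a minimal illustration of Eqs. (3)-(5), the following sketch Euler-integrates the EKBF for a hypothetical scalar system $\dot{x} = -x^3 + w$, $y = x + v$ (all values invented for the example). For this system the Jacobians of Eq. (6) reduce to $\hat{A} = -3\hat{x}^2$ and $\hat{C} = \hat{G} = \hat{V} = 1$:

```python
# Euler-discretized sketch of the EKBF equations (3)-(5) for a hypothetical
# scalar system dx = -x^3 dt + dw, y = x + v. All constants are
# illustration values, not taken from the paper.
import numpy as np

rng = np.random.default_rng(1)
dt, T = 1e-3, 5.0
Q, R = 0.01, 0.04            # process / measurement noise covariances
x, x_hat, P = 1.5, 0.0, 1.0  # true state, filter mean, filter covariance

for _ in range(int(T / dt)):
    # Simulate the true system and a noisy measurement
    x += -x**3 * dt + np.sqrt(Q * dt) * rng.standard_normal()
    y = x + np.sqrt(R) * rng.standard_normal()

    # Relinearize around the current estimate, Eq. (6): A_hat = -3 x_hat^2
    A_hat = -3.0 * x_hat**2

    # Kalman gain, Eq. (4), with C = 1, then Euler steps of Eqs. (3) and (5)
    K = P / R
    x_hat += (-x_hat**3 + K * (y - x_hat)) * dt
    P += (2.0 * A_hat * P - P**2 / R + Q) * dt

print(x_hat, P)  # estimate approaches the true state; P shrinks toward 0
```

Note how the linearization point is updated at every step, which is the repeated effort mentioned above; in the KBINN this per-step linearization is handed to automatic differentiation.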
B. Neural Networks
A neural network is a type of machine learning algorithm
that maps an input signal of dimension $\iota$ to an output signal
of dimension $\kappa$ by approximating a desired function. In its
simplest form, it consists of three layers: the input layer
comprises $\iota$ neurons and does not perform any transformation
but only distributes the input signal to the successive layer.
The hidden
layer comprises a variable count of neurons. Each performs
a weighted, nonlinear transformation by means of
$$o = \sigma\left(\sum_{k=1}^{l} i_k w_k + w_0\right) = \sigma\bigl(i^{T}\cdot w\bigr). \quad (8)$$
Here, $i = [1, i_1, \ldots, i_l]^{T} \in \mathbb{R}^{l+1}$ denotes the output of the
previous layer with $l$ neurons and $w = [w_0, w_1, \ldots, w_l]^{T} \in \mathbb{R}^{l+1}$
denotes a weighting vector. If the neural network is
used to approximate a nonlinear behavior, the activation
function $\sigma(\cdot): \mathbb{R} \to \mathbb{R}$ needs to be of nonlinear nature as
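The transformation in Eq. (8) translates directly into code: the bias $w_0$ is absorbed into the weight vector by prepending a constant 1 to the previous layer's output. The weights, inputs, and choice of $\tanh$ below are arbitrary illustration values:

```python
# Direct transcription of the neuron transformation in Eq. (8).
# Weights and inputs are arbitrary illustration data.
import numpy as np

def neuron(i_prev, w, sigma=np.tanh):
    # i = [1, i_1, ..., i_l]^T,  w = [w_0, w_1, ..., w_l]^T
    i = np.concatenate(([1.0], i_prev))
    return sigma(i @ w)

i_prev = np.array([0.5, -1.0, 2.0])  # outputs of a layer with l = 3 neurons
w = np.array([0.1, 0.4, -0.2, 0.3])  # [w_0, w_1, w_2, w_3]
o = neuron(i_prev, w)                # tanh(0.1 + 0.2 + 0.2 + 0.6) = tanh(1.1)
```

A hidden layer is simply this map applied with one weight vector per neuron, i.e., a matrix-vector product followed by the elementwise nonlinearity.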