Safe and Efficient Switching Mechanism Design for
Uncertified Linear Controller
Yiwen Lu and Yilin Mo
Abstract—Sustained research efforts have been devoted to learning
optimal controllers for linear stochastic dynamical systems with unknown
parameters, but because the data are corrupted by noise, learned controllers are
usually uncertified in the sense that they may destabilize the system.
To address this potential instability, we propose a “plug-and-play”
modification to the uncertified controller which falls back to a known
stabilizing controller when the norm of the difference between the
uncertified and the fall-back control input exceeds a certain threshold.
We show that the switching strategy is both safe and efficient, in the
sense that: 1) the linear-quadratic cost of the system is always bounded
even if the original uncertified controller is destabilizing; 2) in case the
uncertified controller is stabilizing, the performance loss caused by
switching converges super-exponentially to 0 for Gaussian noise, while
it converges only polynomially for general heavy-tailed noise. Finally, we
demonstrate the effectiveness of the proposed switching strategy via
numerical simulation on the Tennessee Eastman Process.
I. INTRODUCTION
Learning a controller from noisy data for an unknown system
has been a central topic to adaptive control and reinforcement
learning [1], [2], [3], [4] for the past decades. A main challenge
to directly applying the learned controllers to the system is that
they are usually uncertified, in the sense that it can be very difficult
to guarantee the stability of such controllers due to process and
measurement noise. One way to address this challenge is to deploy an
additional safeguard mechanism. In particular, assuming the existence
of a known stabilizing controller, empirically the safeguard may be
implemented by falling back to the stabilizing controller from the
uncertified controller, when a potential safety breach is detected.
Motivated by the above intuition, this paper proposes such a
switching strategy, provides a formal safety guarantee and quan-
tifies the performance loss incurred by the safeguard mechanism,
for discrete-time Linear-Quadratic Regulation (LQR) setting with
independent and identically distributed process noise with bounded
fourth-order moment. We assume the existence of a known stabilizing
linear feedback control law u = K0 x, which can be achieved either
when the system is known to be open-loop stable (in which case
K0 = 0), or through adaptive stabilization methods [5], [6]. Given
an uncertified linear feedback control gain K1, a modification to the
control law u = K1 x is proposed: the controller normally applies
u = K1 x, but falls back to u = K0 x for t consecutive steps once
‖(K1 − K0)x‖ exceeds a threshold M. The proposed strategy is
analyzed from both stability and optimality aspects. In particular, the
main results include:
1) We prove the LQ cost of the proposed controller is always
bounded, even if K1 is destabilizing. This fact implies that
the proposed strategy enhances the safety of the uncertified
controller by preventing the system from being catastrophically
destabilized.
2) Provided K1 is stabilizing and M, t are chosen properly, we
compare the LQ cost of the proposed strategy with that of the
linear feedback control law u = K1 x, and quantify the maximum
increase in LQ cost caused by switching w.r.t. the strategy
hyper-parameters M, t as merely O(t^{1/4} exp(−constant · M^2))
in the case of Gaussian process noise, which decays super-exponentially
as the switching threshold M tends to infinity. We also discuss the
extension to general noise distributions with bounded fourth-order
moments, where the above asymptotic performance gap becomes
O(t^{1/4} M^{−1}).

This work is supported by the National Key Research and Development
Program of China under Grant 2018AAA0101601. The authors are
with the Department of Automation and BNRist, Tsinghua University,
Beijing, P.R. China. Emails: luyw20@mails.tsinghua.edu.cn,
ylmo@tsinghua.edu.cn.
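As a concrete illustration, the switching rule described above can be sketched in a few lines of Python. The system matrices, gains, noise level, and hyper-parameters below are hypothetical placeholders chosen for demonstration, not the parameters of the paper's Tennessee Eastman example.

```python
import numpy as np

# Hypothetical 2-D system for illustration only.
rng = np.random.default_rng(0)
A = np.array([[1.1, 0.2], [0.0, 0.9]])      # open-loop unstable plant
B = np.eye(2)
K0 = np.array([[-0.5, -0.2], [0.0, -0.3]])  # assumed known stabilizing gain
K1 = np.array([[0.3, 0.0], [0.0, 0.2]])     # uncertified gain (here destabilizing)

M, t_fallback = 5.0, 10  # switching threshold and fallback duration

x = np.zeros(2)
fallback_timer = 0
costs = []
for _ in range(500):
    if fallback_timer > 0:
        u = K0 @ x                      # still inside a fallback window
        fallback_timer -= 1
    elif np.linalg.norm((K1 - K0) @ x) > M:
        u = K0 @ x                      # trigger: fall back to K0 for t steps
        fallback_timer = t_fallback - 1
    else:
        u = K1 @ x                      # normal mode: uncertified controller
    costs.append(x @ x + u @ u)         # LQ stage cost with Q = R = I
    x = A @ x + B @ u + 0.1 * rng.standard_normal(2)

print(f"average LQ cost: {np.mean(costs):.3f}")
```

Even though the closed loop under K1 alone is unstable in this toy setup, the trajectory remains bounded because every threshold crossing forces t consecutive steps under the stabilizing gain K0, consistent with the safety claim in result 1).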
The performance of the proposed switching scheme is further vali-
dated by simulation on the Tennessee Eastman Process example. We
envision that the switching framework could be potentially applicable
in a wider range of learning-based control settings, since it may
combine the good empirical performance of learned policies and
the stability guarantees of classical controllers, and the “plug-and-
play” nature of the switching logic may minimize the required
modifications to existing learning schemes.
A preliminary version of this paper [7] has been submitted to IEEE
CDC 2022. The main contributions of the current manuscript over
the conference submission are: i) the switching scheme has been
redesigned, such that the upper bound on LQ cost (Theorem 2) no
longer depends on K1; ii) the conclusions have been extended to
noise distributions with bounded fourth-order moments; iii) proofs
of all theoretical results are included in the current version of the
manuscript.
Related Works
Switched control systems: Supervisory algorithms have been
developed to stabilize switched linear systems [8], [9], [10], and other
nonlinear systems that are difficult to stabilize globally with a single
controller [11], [12], [13]. However, most of these works focus on the
stability of the switched system, while the (near-)optimality of the
controllers is less discussed. Building upon this vein of literature,
the idea of switching between certified and uncertified controllers to
improve performance was proposed in [14], whose scheme guarantees
global stability for general nonlinear systems under mild assumptions.
However, no quantitative analysis of the performance under switching
is provided. In contrast, we specialize our results for linear systems
and prove that switching may induce only negligible performance
loss while ensuring safety.
Adaptive LQR: Adaptive and learned LQR has drawn significant
research attention in recent years, for which high-probability estima-
tion error and regret bounds have been proved for methods including
optimism-in-face-of-uncertainty [15], [16], Thompson sampling [17],
policy gradient [18], robust control based on coarse identification [19]
and certainty equivalence [20], [21], [22], [23]. All the above
approaches, however, involve applying a linear controller learned
from finite noise-corrupted data, which has a nonzero probability
of being destabilizing. Furthermore, given a fixed length of data, the
failure probabilities of the aforementioned methods depend on either
unknown system parameters or statistics of online data, which implies
the failure probability cannot be determined a priori, and hence it
can be challenging to design an algorithm that strictly satisfies a pre-
defined specification of safety. In [24], a “cutoff” method similar to
arXiv:2210.14595v1 [eess.SY] 26 Oct 2022