

Self-Tuning Tube-based Model Predictive Control

Damianos Tranos, Alessio Russo, and Alexandre Proutiere

Abstract: We present Self-Tuning Tube-based Model Predictive Control (STT-MPC), an adaptive robust control algorithm for uncertain linear systems with additive disturbances based on the least-squares estimator and polytopic tubes. Our algorithm leverages concentration results to bound the system uncertainty set with prescribed confidence, and guarantees robust constraint satisfaction for this set, along with recursive feasibility and input-to-state stability. Persistence of excitation is ensured without compromising the algorithm's asymptotic performance or increasing its computational complexity. We demonstrate the performance of our algorithm using numerical experiments.

I. INTRODUCTION

Model Predictive Control (MPC) [1] addresses the infinite-horizon optimal control problem in the presence of input and state constraints by approximating it as a sequence of finite-horizon optimization problems. When the dynamics of the system are uncertain, robust MPC methods [2] can be employed to ensure constraint satisfaction for pre-specified sets of system parameters and disturbances. This robustness comes at the cost of reduced closed-loop performance. To mitigate this performance loss, one may leverage adaptive control techniques [3] to learn the system dynamics online, and in turn reduce the uncertainty causing this loss.

Adaptive control schemes have been developed and studied mainly for unconstrained control problems. The first results [4] concerned their asymptotic convergence properties and established conditions under which the controller derived from these schemes approaches the optimal feedback controller obtained with full knowledge of the system dynamics. More recently (see, e.g., [5]–[7]), researchers have quantified the convergence rate of some classical adaptive control schemes, such as the celebrated self-tuning regulators, as well as the price that one has to pay to learn the system dynamics (in terms of cumulative losses, often captured through the notion of regret). These important recent results are, however, restricted to LQR problems in unconstrained linear systems.

This work was supported by the Wallenberg AI, Autonomous Systems and Software Program (WASP) funded by the Knut and Alice Wallenberg Foundation.

D. Tranos, A. Russo, and A. Proutiere are with the Division of Decision and Control Systems, School of Electrical Engineering and Computer Science, Royal Institute of Technology (KTH), Stockholm, Sweden. Emails: {tranos@kth.se, alessior@kth.se, alepro@kth.se}.

In this paper, we investigate the design and performance analysis of MPC schemes that handle uncertainty in the system dynamics. We propose to combine adaptive and robust control methods. The adaptive control component of our schemes rapidly reduces the uncertainties over time in a controlled and quantifiable manner, whereas the robust control component ensures constraint satisfaction. More precisely, our contributions are as follows.

Contributions. We address the problem of controlling an uncertain linear system with parametric uncertainty and additive disturbances, subject to deterministic constraints. We present Self-Tuning Tube-based Model Predictive Control (STT-MPC), an algorithm combining adaptive and robust control techniques. STT-MPC uses a simple Least Squares Estimator (LSE) to estimate the dynamics, together with a set of parameters compatible with the observations at a prescribed level of certainty. We derive tight concentration results (similar to those presented in [7] for the LQR problem) for this set, and exploit these results to construct fixed-complexity polyhedral approximations of it. In STT-MPC, these approximations are used to build a polytopic tube MPC scheme [8] that ensures robust constraint satisfaction. We establish the recursive feasibility and the input-to-state stability of the proposed scheme.

In contrast to previously proposed robust MPC schemes (refer to §II for details), STT-MPC enjoys the following properties. (i) STT-MPC uses a probabilistic rather than robust estimation scheme, leading to notably faster adaptation rates and smaller parameter sets for which the constraints need to be satisfied. (ii) Persistence of Excitation (PE), required to get performance guarantees for the LSE, is achieved without modifying either the cost or objective function of the MPC. The added excitation is treated as an additive disturbance and can be chosen to decay to zero with time (e.g., at a rate 1/√t). This allows us to asymptotically recover the performance of the standard MPC algorithm with full knowledge of the dynamics. (iii) An input-to-state stability analysis of STT-MPC is possible even when the PE condition is active and without imposing any additional restrictions on the choice of the estimate of the nominal model parameter.

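arXiv:2210.00502v1 [eess.SY] 2 Oct 2022

Notations. For a time-dependent vector $x_t$, we denote by $x_{k|t}$ its prediction at time $k+t$ given information at time $t$. For any two sets $\mathcal{A}$ and $\mathcal{B}$, we define their Minkowski sum as the set $\mathcal{A} \oplus \mathcal{B} := \{a + b : a \in \mathcal{A}, b \in \mathcal{B}\}$. We also define, for any constant $\lambda \ge 0$, the scaled set $\lambda\mathcal{A} := \{\lambda a : a \in \mathcal{A}\}$. For any $d \in \mathbb{N}$, $x \in \mathbb{R}^d$, and $\epsilon > 0$, let $\mathcal{B}(x, \epsilon)$ denote the ball of radius $\epsilon$ in the spectral norm centered on $x$. The unit ball centered at the origin is denoted simply as $\mathcal{B}$. For any set $\mathcal{S}$ and any $\varepsilon > 0$, there exists a polytope $\mathcal{P}$ that is an outer approximation of $\mathcal{S}$, i.e., $\mathcal{S} \subset \mathcal{P} \oplus \varepsilon\mathcal{B}$. We refer to this polytope as the outer polyhedral approximation of $\mathcal{S}$. Finally, a function $\kappa : \mathbb{R}_+ \to \mathbb{R}_+$ is of class $\mathcal{K}$ if it is strictly increasing and $\kappa(0) = 0$, and is of class $\mathcal{K}_\infty$ if, in addition, $\kappa(x) \to \infty$ as $x \to \infty$.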
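The set operations above can be illustrated on finite point sets. The sketch below is only an illustration and not part of the paper: the helper names `minkowski_sum` and `scale` are hypothetical, and sets are represented by finite collections of points (for example, polytope vertices). For two polytopes given by their vertices, the convex hull of the pairwise vertex sums is their Minkowski sum, so working with vertices is enough for small examples.

```python
from itertools import product


def minkowski_sum(A, B):
    """Return {a + b : a in A, b in B} for finite sets of points (tuples)."""
    return {tuple(ai + bi for ai, bi in zip(a, b)) for a, b in product(A, B)}


def scale(lam, A):
    """Return the scaled set {lam * a : a in A}, for lam >= 0."""
    if lam < 0:
        raise ValueError("lam must be non-negative")
    return {tuple(lam * ai for ai in a) for a in A}


# Vertices of the unit square [0, 1]^2 and of the box [-1, 1]^2 (assumed example data).
square = {(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)}
unit_box = {(-1.0, -1.0), (1.0, -1.0), (-1.0, 1.0), (1.0, 1.0)}

# Candidate vertices of the square inflated by 0.1 in every direction (sup-norm box).
inflated = minkowski_sum(square, scale(0.1, unit_box))
print(sorted(inflated)[:3])
```

The pairwise sums may include points that are not vertices of the result; taking a convex hull afterwards would remove them.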

II. RELATED WORK

An essential ingredient of adaptive control is Persistent Excitation (PE) [9] (which is equivalent to the notion of required exploration in reinforcement learning [10]). In the context of MPC, different solutions for PE have been investigated: [11] introduces a sufficient PE condition as an additional constraint in the MPC optimization. On the other hand, [12], [13], and later [14], propose dual-MPC schemes where the predicted parameter covariance matrix is included in the cost function, though these dual-control methods lack feasibility and stability guarantees.

In a different direction, [15] and [16] introduce robust MPC schemes with guaranteed recursive feasibility and constraint satisfaction. These guarantees come at the cost of only adapting the nominal model: PE is not ensured, and robustness is guaranteed for a fixed parameter set that is not updated online. In contrast, our scheme, STT-MPC, updates the parameter set at each step, and this set rapidly concentrates around the true parameter.

In [17], the authors combine set-membership identification, which involves updating a set of parameters compatible with the observed state trajectory, and robust constraint tightening for FIR models. This approach was later extended in [18] to the case of linear state-space models, where online set-membership identification is combined with homothetic tube MPC [19]. Subsequently, [20] proposed an adaptive robust MPC scheme, again using set-membership identification, together with a less conservative polytopic tube MPC scheme [8]. As in [11], they ensure PE in the form of added convex constraints and a modification of the cost function.

III. MODEL, OBJECTIVE, AND APPROACH

A. Model and assumptions

We consider the following discrete-time, linear, time-invariant system:

$$x_{t+1} = A(\theta^\star) x_t + B(\theta^\star) u_t + w_t, \tag{1}$$

where $x_t, w_t \in \mathbb{R}^{d_x}$ and $u_t \in \mathbb{R}^{d_u}$. The state transition and state-action transition matrices $A(\theta^\star)$ and $B(\theta^\star)$ are initially unknown. The set of possible such matrices is parametrized by $\theta \in \mathbb{R}^{d_\theta}$ (here $\theta$ could well parametrize each entry of the matrices, in which case $d_\theta = d_x(d_x + d_u)$). To simplify the notation, for two possible parameters $\theta_1, \theta_2$, we define $\|\theta_1 - \theta_2\| := \max(\|A(\theta_1) - A(\theta_2)\|_2, \|B(\theta_1) - B(\theta_2)\|_2)$. We make the following assumptions.

Assumption 1 (Parameter uncertainty). For some $\epsilon_0 > 0$, $\mathcal{B}(\theta^\star, \epsilon_0) \subset \Theta_0$, where $\Theta_0$ is a known convex polytope.

Assumption 2 (Additive disturbance). The sequence $(w_t)_{t \ge 0}$ is i.i.d., and for each $t \ge 0$, $w_t$ is zero-mean, isotropic, with support in the ball $\mathcal{B}(0, 3\sigma)$. Hence, $w_t$ is $\sigma^2$-sub-Gaussian. Further define $\mathcal{W}$, a convex polytope providing a conservative approximation of $\mathcal{B}(0, 3\sigma)$, i.e., $\mathcal{B}(0, 3\sigma) \subset \mathcal{W}$.
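One way to generate a disturbance with the support and symmetry required by Assumption 2 is to draw a Gaussian vector and reject draws that fall outside the ball of radius 3σ. This is only an assumed example distribution (the paper does not prescribe one), and the names `sample_disturbance` and `in_box` are hypothetical. The box $[-3\sigma, 3\sigma]^{d_x}$ serves as a simple choice for the polytope $\mathcal{W}$, since it contains the Euclidean ball of radius 3σ.

```python
import math
import random


def sample_disturbance(dim, sigma, rng):
    """Draw w ~ N(0, sigma^2 I) conditioned on ||w||_2 <= 3 * sigma.

    Rejection preserves rotational symmetry, so w stays zero-mean and isotropic.
    """
    while True:
        w = [rng.gauss(0.0, sigma) for _ in range(dim)]
        if math.sqrt(sum(wi * wi for wi in w)) <= 3.0 * sigma:
            return w


def in_box(w, half_width):
    """Membership test for the polytope W = [-half_width, half_width]^dim."""
    return all(abs(wi) <= half_width for wi in w)


rng = random.Random(0)   # fixed seed for reproducibility
sigma = 0.1              # assumed noise scale
samples = [sample_disturbance(2, sigma, rng) for _ in range(1000)]

# Every sample lies in the ball B(0, 3 sigma), hence also in the box W.
print(all(in_box(w, 3.0 * sigma) for w in samples))
```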

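The system must also satisfy the following state and input constraints for all $t \ge 0$:

$$F x_t + G u_t \le \mathbf{1}, \tag{2}$$

where $F \in \mathbb{R}^{d_c \times d_x}$ and $G \in \mathbb{R}^{d_c \times d_u}$ define the state and input constraints, respectively (the inequality holds componentwise, with $\mathbf{1}$ the all-ones vector in $\mathbb{R}^{d_c}$).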

Assumption 3 (State and input constraints). The set

$$\mathcal{C} = \{(x, u) \in \mathbb{R}^{d_x} \times \mathbb{R}^{d_u} : F x + G u \le \mathbf{1}\}$$

is compact and contains the origin in its interior.
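As a small illustration of the constraint format in (2), box constraints on the state and the input can be written as rows of $F$ and $G$. The bounds below (|x_1| ≤ 5, |x_2| ≤ 5, |u| ≤ 2) are assumed values chosen for the example, and `satisfies_constraints` is a hypothetical helper. The resulting set is bounded and contains the origin in its interior, in line with Assumption 3.

```python
def satisfies_constraints(F, G, x, u, tol=1e-12):
    """Check F x + G u <= 1 row by row (F and G are lists of rows)."""
    for f_row, g_row in zip(F, G):
        lhs = sum(f * xi for f, xi in zip(f_row, x))
        lhs += sum(g * ui for g, ui in zip(g_row, u))
        if lhs > 1.0 + tol:
            return False
    return True


# Each bound |z| <= b becomes two rows: z / b <= 1 and -z / b <= 1.
F = [[0.2, 0.0], [-0.2, 0.0], [0.0, 0.2], [0.0, -0.2], [0.0, 0.0], [0.0, 0.0]]
G = [[0.0], [0.0], [0.0], [0.0], [0.5], [-0.5]]

print(satisfies_constraints(F, G, [1.0, -2.0], [0.5]))  # inside the box
print(satisfies_constraints(F, G, [6.0, 0.0], [0.0]))   # violates |x_1| <= 5
```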

Finally, we assume that we have access to a stabilizer K:

Assumption 4 (Stabilizing controller). There exists a known, robustly stabilizing feedback gain $K$ such that $A(\theta) + B(\theta)K$ is stable (i.e., $\rho(A(\theta) + B(\theta)K) < 1$) for all $\theta \in \Theta_0$.
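A numerical spot check of the condition in Assumption 4 can be sketched with NumPy as follows. The matrices, the gain `K`, and the helper names are all assumed for illustration. Note the limitation: the spectral radius is not a convex function of the matrix, so passing the test on finitely many models (even all vertices of $\Theta_0$) does not by itself establish stability over the whole polytope; the check is a sanity test, not a proof.

```python
import numpy as np


def spectral_radius(M):
    """Largest eigenvalue magnitude of a square matrix."""
    return float(np.max(np.abs(np.linalg.eigvals(M))))


def is_stabilizing(K, models):
    """True if rho(A + B K) < 1 for every (A, B) pair in `models`."""
    return all(spectral_radius(A + B @ K) < 1.0 for A, B in models)


# Assumed nominal model (a discretized double integrator) and input-gain variations.
A_nom = np.array([[1.0, 0.1], [0.0, 1.0]])
B_nom = np.array([[0.0], [0.1]])
models = [(A_nom, s * B_nom) for s in (0.8, 1.0, 1.2)]

K = np.array([[-5.0, -6.0]])  # assumed candidate gain
print(is_stabilizing(K, models))
```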

B. Objective and MPC

We wish to minimize the long-term cost defined as

$$\limsup_{T \to \infty} \frac{1}{T} \sum_{t=0}^{T-1} \left( x_t^\top Q x_t + u_t^\top R u_t \right),$$

for some positive semi-definite matrices $Q, R$. To this aim, we use MPC with a receding horizon $N$. Specifically, at time $t$, given the current system state $x_t$ and the past observations used to derive an estimator $\theta_t$ of $\theta^\star$, we will identify a control policy $(u_{k|t})_{k=0,\dots,N-1}$ minimizing the cost along a predicted system trajectory $(x_{k|t})_{k=0,\dots,N}$. We use the well-known dual mode prediction paradigm [21] with the following predicted control sequence:

$$u_{k|t} = \begin{cases} K x_{k|t} + v_{k|t}, & \forall k \in \{0, \dots, N-1\}, \\ K x_{k|t}, & \forall k \ge N, \end{cases} \tag{3}$$

where $\{v_{0|t}, \dots, v_{N-1|t}\}$ are the optimization variables to be determined by the MPC. The resulting prediction dynamics are, for $k \in \{0, \dots, N-1\}$,

$$x_{0|t} = x_t, \tag{4a}$$

$$x_{k+1|t} = \Phi(\theta_t) x_{k|t} + B(\theta_t) v_{k|t}, \tag{4b}$$

where $\Phi(\theta) := A(\theta) + B(\theta)K$ for any $\theta \in \Theta_0$.
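The prediction dynamics (4a)-(4b) follow from substituting the first branch of (3) into the nominal model: $A x + B(Kx + v) = (A + BK)x + Bv$. The sketch below rolls out this recursion for a given sequence of $v_{k|t}$ and checks the identity numerically. It assumes a fixed nominal model in place of $A(\theta_t)$, $B(\theta_t)$; the matrices, the gain, and the function name `predict_trajectory` are illustrative choices, and no optimization over $v$ is performed.

```python
import numpy as np


def predict_trajectory(A, B, K, x0, v_seq):
    """Roll out (4a)-(4b): x_{k+1|t} = Phi x_{k|t} + B v_{k|t}, with Phi = A + B K.

    Returns the predicted states (N + 1 rows) and inputs (N rows) for k < N.
    """
    Phi = A + B @ K
    xs = [np.asarray(x0, dtype=float)]   # (4a): x_{0|t} = x_t
    us = []
    for v in v_seq:
        x = xs[-1]
        us.append(K @ x + v)             # predicted input from (3), k < N
        xs.append(Phi @ x + B @ v)       # prediction dynamics (4b)
    return np.array(xs), np.array(us)


# Assumed example data.
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
K = np.array([[-5.0, -6.0]])
v_seq = [np.array([0.1]), np.array([0.0]), np.array([-0.1])]

xs, us = predict_trajectory(A, B, K, [1.0, 0.0], v_seq)
print(xs.shape, us.shape)
```

Beyond the horizon ($k \ge N$), the second branch of (3) corresponds to iterating `Phi @ x` with no additional input term.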

C. General approach

To handle the uncertainty due to both the noise and the fact that $\theta^\star$ is unknown, we apply a tube-based MPC approach. The tube used in step $t$ is essentially constructed from a polytope approximating a ball centered at the LSE $\hat{\theta}_t$ of $\theta^\star$, whose radius corresponds to a prescribed level of confidence $\delta$ that the user wishes to guarantee. Persistent excitation is achieved by adding (bounded) noise to the input. The resulting algorithm is presented in Algorithm 1, and its ingredients are detailed in the next section. Its analysis is given in Section V.
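The two estimation ingredients mentioned here, a least-squares estimate of the dynamics and bounded input noise for excitation, can be sketched as follows. This is not Algorithm 1: there is no MPC optimization, tube, or confidence set, and the input is simply $u_t = K x_t + \eta_t$. The excitation distribution (uniform, scaled by $1/\sqrt{t+1}$), the disturbance sampler, the example matrices, and all function names are assumptions made for illustration.

```python
import numpy as np


def bounded_gaussian(dim, sigma, rng):
    """Gaussian draw rejected until it lies in the ball of radius 3 * sigma."""
    while True:
        w = sigma * rng.standard_normal(dim)
        if np.linalg.norm(w) <= 3.0 * sigma:
            return w


def simulate_with_excitation(A, B, K, T, sigma, rng):
    """Simulate (1) under u_t = K x_t + eta_t, with eta_t bounded and decaying."""
    dx, du = B.shape
    X = np.zeros((T + 1, dx))
    U = np.zeros((T, du))
    for t in range(T):
        eta = rng.uniform(-1.0, 1.0, size=du) / np.sqrt(t + 1.0)
        U[t] = K @ X[t] + eta
        X[t + 1] = A @ X[t] + B @ U[t] + bounded_gaussian(dx, sigma, rng)
    return X, U


def least_squares_estimate(X, U):
    """Regress x_{t+1} on z_t = (x_t, u_t); returns estimates of A and B."""
    Z = np.hstack([X[:-1], U])                        # T x (dx + du) regressors
    Theta, *_ = np.linalg.lstsq(Z, X[1:], rcond=None)  # Theta = [A B]^T
    dx = X.shape[1]
    return Theta[:dx].T, Theta[dx:].T


A = np.array([[1.0, 0.1], [0.0, 1.0]])   # assumed true dynamics
B = np.array([[0.0], [0.1]])
K = np.array([[-5.0, -6.0]])             # assumed stabilizing gain

rng = np.random.default_rng(0)
X, U = simulate_with_excitation(A, B, K, T=500, sigma=0.05, rng=rng)
A_hat, B_hat = least_squares_estimate(X, U)
print(np.linalg.norm(A_hat - A, 2), np.linalg.norm(B_hat - B, 2))
```

Without the excitation term, $u_t$ would be an exact linear function of $x_t$, the regressor matrix would lose rank, and $A$ and $B$ could not be separated; this is the role persistent excitation plays in the estimation step.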
