Self-Tuning Tube-based Model Predictive Control
Damianos Tranos, Alessio Russo, and Alexandre Proutiere
Abstract: We present Self-Tuning Tube-based Model Predictive
Control (STT-MPC), an adaptive robust control algorithm
for uncertain linear systems with additive disturbances based on
the least-squares estimator and polytopic tubes. Our algorithm
leverages concentration results to bound the system uncertainty
set with prescribed confidence, and guarantees robust con-
straint satisfaction for this set, along with recursive feasibility
and input-to-state stability. Persistence of excitation is ensured
without compromising the algorithm’s asymptotic performance
or increasing its computational complexity. We demonstrate the
performance of our algorithm using numerical experiments.
I. INTRODUCTION
Model Predictive Control (MPC) [1] addresses the infinite
horizon optimal control problem in the presence of input
and state constraints by approximating it as a sequence of
finite horizon optimization problems. When the dynamics of
the system are uncertain, robust MPC methods [2] can be
employed to ensure constraint satisfaction for pre-specified
sets of system parameters and disturbances. This robustness
comes at the cost of a reduced closed-loop performance. To
mitigate this performance loss, one may leverage adaptive
control techniques [3] to learn the system dynamics in an
online manner, and in turn reduce the uncertainty causing
this loss.
Adaptive control schemes have been developed and stud-
ied mainly for unconstrained control problems. The first
results [4] concerned their asymptotic convergence properties
and established conditions under which the controller derived
from these schemes actually approaches the optimal feedback
controller obtained by assuming full knowledge of the
system dynamics. More recently, see e.g. [5]–[7], researchers
managed to quantify the convergence rate of some classical
adaptive control schemes, such as the celebrated self-tuning
regulators, as well as the price that one has to pay (in
terms of cumulative losses – often captured through the
notion of regret) to learn the system dynamics. These recent
important results are however restricted to the LQR problems
in unconstrained linear systems.
In this paper, we investigate the design and the performance
analysis of MPC schemes handling system dynamics
uncertainties. We propose to combine adaptive and robust
control methods. The adaptive control component of our
schemes rapidly reduces the uncertainties over time
in a controlled and quantifiable manner, whereas the robust
control component ensures constraint satisfaction. More
precisely, our contributions are as follows.
This work was supported by the Wallenberg AI, Autonomous Systems
and Software Program (WASP) funded by the Knut and Alice Wallenberg
Foundation.
D. Tranos, A. Russo, and A. Proutiere are with the Division of Decision
and Control Systems, School of Electrical Engineering and Computer
Science, Royal Institute of Technology (KTH), Stockholm, Sweden. Emails:
{tranos@kth.se, alessior@kth.se, alepro@kth.se}.
Contributions. We address the problem of controlling an
uncertain linear system with parametric and additive distur-
bances, subject to deterministic constraints. We present Self-
Tuning Tube-based Model Predictive Control (STT-MPC), an
algorithm combining adaptive and robust control techniques.
STT-MPC uses a simple Least Squares Estimator (LSE) for
estimating the dynamics and a parameter set compatible with
the observations with a prescribed level of certainty. We derive
tight concentration results (similar to those presented in [7]
for the LQR problem) for this set, and exploit these results
to construct fixed complexity polyhedral approximations of
it. In STT-MPC, these approximations are used to build a
polytopic tube MPC scheme [8] to ensure robust constraint
satisfaction. We establish the recursive feasibility and the
input-to-state stability of the proposed scheme.
In contrast to previously proposed robust MPC schemes
(refer to §II for details), STT-MPC enjoys the following
properties. (i) STT-MPC uses a probabilistic rather than ro-
bust estimation scheme, leading to notably faster adaptation
rates and smaller parameter sets for which the constraints
need to be satisfied. (ii) Persistence of Excitation (PE),
required to get performance guarantees for the LSE, is
achieved without modifying either the cost or objective
function of the MPC. The added excitation is treated as an
additive disturbance and can be chosen to decay to zero with
time (e.g., at a rate 1/t). This allows us to asymptotically
recover the performance of the standard MPC algorithm
with full knowledge of the dynamics. (iii) An input-to-state
stability analysis of STT-MPC is possible even when the
PE condition is active and without imposing any additional
restrictions on the choice of the estimate of the nominal
model parameter.
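arXiv:2210.00502v1 [eess.SY] 2 Oct 2022
Notations. For a time-dependent vector $x_t$, we denote by
$x_{k|t}$ its prediction at time $k+t$ given information at time
$t$. For any two sets $\mathcal{A}$ and $\mathcal{B}$, we define their Minkowski
sum as the set $\mathcal{A} \oplus \mathcal{B} := \{a + b : a \in \mathcal{A}, b \in \mathcal{B}\}$. We
also define, for any constant $\lambda \ge 0$, the scaled set
$\lambda\mathcal{A} := \{\lambda a : a \in \mathcal{A}\}$. For any $d \in \mathbb{N}$, $x \in \mathbb{R}^d$, and $\epsilon > 0$, let
$\mathcal{B}(x, \epsilon)$ denote the ball of radius $\epsilon$ in the spectral norm centered at $x$. The
unit ball centered at the origin is denoted simply as $\mathcal{B}$. For
any set $\mathcal{S}$ and any $\varepsilon > 0$, there exists a polytope $\mathcal{P}$ that is
an outer approximation of $\mathcal{S}$, i.e. $\mathcal{S} \subseteq \mathcal{P} \subseteq \mathcal{S} \oplus \varepsilon\mathcal{B}$. We refer
to this polytope as the outer polyhedral approximation of $\mathcal{S}$.
Finally, a function $\kappa : \mathbb{R}_+ \to \mathbb{R}_+$ is of class $\mathcal{K}$ if it is strictly
increasing and $\kappa(0) = 0$, and is of class $\mathcal{K}_\infty$ if in addition
$\kappa(x) \to \infty$ as $x \to \infty$.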
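The set operations defined in the notation can be illustrated with a minimal sketch for polytopes given by their vertices. The function names and the vertex representation are assumptions made for this example and are not taken from the paper. For convex polytopes, the convex hull of the pairwise vertex sums equals the Minkowski sum, so the returned points contain all vertices of the result (possibly along with redundant points).

```python
import numpy as np

def minkowski_sum(vertices_a, vertices_b):
    """Return all pairwise sums a + b of two vertex arrays.

    Inputs have shape (n, d) and (m, d). The convex hull of the output
    is the Minkowski sum of the two convex polytopes; the output itself
    may contain points that are not vertices of that hull.
    """
    a = np.asarray(vertices_a, dtype=float)
    b = np.asarray(vertices_b, dtype=float)
    return (a[:, None, :] + b[None, :, :]).reshape(-1, a.shape[1])

def scale_set(lam, vertices):
    """Return the vertices of the scaled set lam * A for lam >= 0."""
    if lam < 0:
        raise ValueError("lam must be nonnegative")
    return lam * np.asarray(vertices, dtype=float)

# Hypothetical example: the square [-1, 1]^2 plus half of itself.
square = np.array([[-1.0, -1.0], [-1.0, 1.0], [1.0, -1.0], [1.0, 1.0]])
summed = minkowski_sum(square, scale_set(0.5, square))
```

In this example the convex hull of `summed` is the square $[-1.5, 1.5]^2$, which matches the definition of $\mathcal{A} \oplus \lambda\mathcal{A}$ for a convex set containing the origin.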
II. RELATED WORK
An essential ingredient of adaptive control is Persistent
Excitation (PE) [9] (which is equivalent to the notion of
required exploration in reinforcement learning [10]). In the
context of MPC, different solutions for PE have been in-
vestigated: [11] introduces a sufficient PE condition as an
additional constraint in their MPC optimization. On the other
hand, [12], [13], and later [14], propose dual-MPC schemes
where the predicted parameter covariance matrix is included
in the cost function, though these dual-control methods lack
feasibility and stability guarantees.
In a different direction, the works of [15] and [16]
introduce robust MPC schemes with guaranteed recursive
feasibility and constraint satisfaction. These schemes come
with the cost of only considering adaptation for the nominal
model, where PE is not ensured, and robustness is guaranteed
for a fixed parameter set that is not updated online. In
contrast, our scheme, STT-MPC, updates the parameter set
at each step, and the latter rapidly concentrates around the true
parameter.
In [17], the authors combine set-membership identifica-
tion, which involves updating a set of parameters compatible
with the observed state-trajectory, and robust constraint tight-
ening for FIR models. This approach was later extended in
[18] to the case of linear state-space models where online
set-membership is combined with homothetic tube MPC
[19]. Subsequently, [20] proposed an adaptive robust MPC
scheme using again set-membership identification and a less
conservative polytopic tube MPC scheme [8]. As in [11],
they ensure PE in the form of added convex constraints and
a modification of the cost function.
III. MODEL, OBJECTIVE, AND APPROACH
A. Model and assumptions
We consider the following discrete-time, linear, time-invariant
system:
$$x_{t+1} = A(\theta^\star) x_t + B(\theta^\star) u_t + w_t, \tag{1}$$
where $x_t, w_t \in \mathbb{R}^{d_x}$ and $u_t \in \mathbb{R}^{d_u}$. The state transition and
state-action transition matrices $A(\theta^\star)$ and $B(\theta^\star)$ are initially
unknown. The set of possible such matrices is parametrized
by $\theta \in \mathbb{R}^{d_\theta}$ (here $\theta$ could well parametrize each entry of
the matrices, in which case $d_\theta = d_x(d_x + d_u)$). To simplify
the notation, for two possible parameters $\theta_1, \theta_2$, we define
$$\|\theta_1 - \theta_2\| := \max\left(\|A(\theta_1) - A(\theta_2)\|_2, \|B(\theta_1) - B(\theta_2)\|_2\right).$$
We make the following assumptions.
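Assumption 1 (Parameter uncertainty). For some $\epsilon_0 > 0$,
$\mathcal{B}(\theta^\star, \epsilon_0) \subset \Theta_0$, where $\Theta_0$ is a known convex polytope.
Assumption 2 (Additive disturbance). The sequence $(w_t)_{t \ge 0}$
is i.i.d., and for each $t \ge 0$, $w_t$ is zero-mean, isotropic,
with support in the ball $\mathcal{B}(0, 3\sigma)$. Hence, $w_t$ is $\sigma^2$-sub-gaussian.
Further define $\mathcal{W}$, a convex polytope providing a
conservative approximation of $\mathcal{B}(0, 3\sigma)$, i.e., $\mathcal{B}(0, 3\sigma) \subset \mathcal{W}$.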
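The system also needs to obey the following state and
input constraints for all $t \ge 0$,
$$F x_t + G u_t \le \mathbf{1}, \tag{2}$$
where $F \in \mathbb{R}^{d_c \times d_x}$ and $G \in \mathbb{R}^{d_c \times d_u}$ define the state and
input constraints respectively.
Assumption 3 (State and input constraints). The set
$$\mathcal{C} = \{(x, u) \in \mathbb{R}^{d_x} \times \mathbb{R}^{d_u} : F x + G u \le \mathbf{1}\}$$
is compact and contains the origin in its interior.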
Finally, we assume that we have access to a stabilizer $K$:
Assumption 4 (Stabilizing Controller). There exists a
known, robustly stabilizing feedback gain $K$ such that $A(\theta) + B(\theta)K$
is stable (i.e., $\rho(A(\theta) + B(\theta)K) < 1$) for all $\theta \in \Theta_0$.
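As an illustration of the model and Assumptions 2 and 4, the following sketch simulates system (1) under the feedback $u_t = K x_t$. The matrices `A`, `B`, the gain `K`, and the noise level are hypothetical values chosen for the example, and the disturbance is drawn from a Gaussian truncated to the ball of radius $3\sigma$, which is only one possible distribution with the required bounded support. The spectral radius is checked for a single parameter value, not for every $\theta \in \Theta_0$.

```python
import numpy as np

def sample_disturbance(rng, dim, sigma):
    """Draw w ~ N(0, sigma^2 I), rejecting samples outside the ball of radius 3*sigma."""
    while True:
        w = sigma * rng.standard_normal(dim)
        if np.linalg.norm(w) <= 3.0 * sigma:
            return w

def spectral_radius(M):
    """Largest eigenvalue modulus of a square matrix."""
    return float(np.max(np.abs(np.linalg.eigvals(M))))

def simulate(A, B, K, x0, T, sigma, rng):
    """Simulate x_{t+1} = A x_t + B u_t + w_t with u_t = K x_t for T steps."""
    xs = [np.asarray(x0, dtype=float)]
    for _ in range(T):
        u = K @ xs[-1]
        w = sample_disturbance(rng, len(x0), sigma)
        xs.append(A @ xs[-1] + B @ u + w)
    return np.array(xs)

# Hypothetical system and gain (not from the paper).
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
K = np.array([[-5.0, -6.0]])

rho = spectral_radius(A + B @ K)  # stability check for this single parameter
rng = np.random.default_rng(0)
trajectory = simulate(A, B, K, x0=[1.0, 0.0], T=50, sigma=0.01, rng=rng)
```

For these example values the closed-loop matrix $A + BK$ has eigenvalues 0.9 and 0.5, so the stability condition of Assumption 4 holds at this parameter.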
B. Objective and MPC
We wish to minimize the long-term cost defined as
$$\limsup_{T \to \infty} \frac{1}{T} \sum_{t=0}^{T-1} \left( x_t^\top Q x_t + u_t^\top R u_t \right),$$
for some positive semi-definite matrices $Q, R$. To this aim, we use MPC,
with a receding horizon $N$. Specifically, at time $t$, given
the current system state $x_t$ and the past observations used to
derive an estimator $\theta_t$ of $\theta^\star$, we will identify a control policy
$(u_{k|t})_{k=0,\dots,N-1}$ minimizing the cost along a predicted system
trajectory $(x_{k|t})_{k=0,\dots,N}$. We use the well-known dual
mode prediction paradigm [21] with the following predicted
control sequence,
$$u_{k|t} = \begin{cases} K x_{k|t} + v_{k|t}, & k \in \{0, \dots, N-1\}, \\ K x_{k|t}, & k \ge N, \end{cases} \tag{3}$$
where $\{v_{0|t}, \dots, v_{N-1|t}\}$ are the optimization variables to be
determined by the MPC. The resulting prediction dynamics
will be, for $k \in \{0, \dots, N-1\}$,
$$x_{0|t} = x_t, \tag{4a}$$
$$x_{k+1|t} = \Phi(\theta_t) x_{k|t} + B(\theta_t) v_{k|t}, \tag{4b}$$
where $\Phi(\theta) := A(\theta) + B(\theta)K$ for any $\theta \in \Theta_0$.
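The dual mode prediction (3) and the prediction dynamics (4a)-(4b) can be sketched as a simple rollout over the first $N$ steps. The numerical values below are hypothetical, `A` and `B` stand in for $A(\theta_t)$ and $B(\theta_t)$, and the cost function sums only the stage costs over the horizon, since this section does not specify a terminal cost.

```python
import numpy as np

def predict_trajectory(A, B, K, x_t, v):
    """Roll out x_{0|t} = x_t, x_{k+1|t} = (A + B K) x_{k|t} + B v_{k|t}.

    v has shape (N, d_u). Returns predicted states of shape (N + 1, d_x)
    and predicted inputs u_{k|t} = K x_{k|t} + v_{k|t} of shape (N, d_u).
    """
    Phi = A + B @ K
    xs = [np.asarray(x_t, dtype=float)]
    us = []
    for k in range(v.shape[0]):
        us.append(K @ xs[-1] + v[k])
        xs.append(Phi @ xs[-1] + B @ v[k])
    return np.array(xs), np.array(us)

def predicted_cost(xs, us, Q, R):
    """Sum of stage costs x' Q x + u' R u over k = 0, ..., N - 1."""
    return float(sum(x @ Q @ x + u @ R @ u for x, u in zip(xs[:-1], us)))

# Hypothetical nominal model, gain, and decision variables (not from the paper).
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
K = np.array([[-5.0, -6.0]])
x_t = np.array([1.0, 0.0])
v = np.array([[0.2], [0.1], [0.0]])  # N = 3

xs, us = predict_trajectory(A, B, K, x_t, v)
cost = predicted_cost(xs, us, Q=np.eye(2), R=np.eye(1))
```

Writing the rollout in terms of $\Phi$ and $v$ rather than $A$ and $u$ is equivalent, since $A x + B(Kx + v) = \Phi x + B v$; the MPC then optimizes over $v$ only.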
C. General approach
To handle the uncertainty due to both the noise and the fact
that $\theta^\star$ is unknown, we apply a tube-based MPC approach.
The tube used at step $t$ is essentially constructed from a
polytope approximating a ball centered at the LSE $\hat{\theta}_t$ of
$\theta^\star$ and whose radius corresponds to a prescribed level of
confidence $\delta$ that the user wishes to guarantee. Persistent
excitation is achieved by adding (bounded) noise to the input.
The resulting algorithm is presented in Algorithm 1, and its
ingredients are detailed in the next section. Its analysis is
given in Section V.
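The estimation step of this approach can be illustrated with a minimal least-squares sketch, assuming that $\theta$ parametrizes every entry of $A$ and $B$ and that the input is $u_t = K x_t$ plus a bounded excitation term. The uniform excitation, its $c/(t+1)$ amplitude schedule, the norm-clipped Gaussian disturbance, and all numerical values are assumptions for the example. This is not the paper's Algorithm 1, which also involves the confidence set and the tube construction.

```python
import numpy as np

def simulate_and_estimate(A, B, K, x0, T, sigma, c, rng):
    """Collect closed-loop data with decaying input excitation and fit [A B] by least squares."""
    dx, du = B.shape
    X = np.zeros((T + 1, dx))
    U = np.zeros((T, du))
    X[0] = x0
    for t in range(T):
        # Bounded excitation with amplitude c / (t + 1) (hypothetical schedule).
        excitation = rng.uniform(-1.0, 1.0, size=du) * c / (t + 1)
        U[t] = K @ X[t] + excitation
        # Gaussian disturbance with its norm clipped to 3 * sigma.
        w = sigma * rng.standard_normal(dx)
        norm = np.linalg.norm(w)
        if norm > 3.0 * sigma:
            w *= 3.0 * sigma / norm
        X[t + 1] = A @ X[t] + B @ U[t] + w
    # Regress x_{t+1} on z_t = [x_t; u_t]; the solution is [A B] transposed.
    Z = np.hstack([X[:-1], U])
    Theta, *_ = np.linalg.lstsq(Z, X[1:], rcond=None)
    return Theta[:dx].T, Theta[dx:].T

# Hypothetical true system and stabilizing gain (not from the paper).
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
K = np.array([[-5.0, -6.0]])

rng = np.random.default_rng(0)
A_hat, B_hat = simulate_and_estimate(A, B, K, x0=[1.0, 0.0], T=200,
                                     sigma=0.01, c=0.5, rng=rng)
```

Without the excitation term the input would be an exact linear function of the state, and the regression could not separate $A$ from $B$; this is the role persistent excitation plays in the estimate.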