Safety Embedded Stochastic Optimal Control of Networked
Multi-Agent Systems via Barrier States
Lin Song1, Pan Zhao1, Neng Wan1, and Naira Hovakimyan1
Abstract: This paper presents a novel approach for achieving safe stochastic optimal control in networked multi-agent systems (MASs). The proposed method incorporates barrier states (BaSs) into the system dynamics to embed safety constraints. To accomplish this, the networked MAS is factorized into multiple subsystems, and each one is augmented with BaSs for the central agent. The optimal control law is obtained by solving the joint Hamilton-Jacobi-Bellman (HJB) equation on the augmented subsystem, which guarantees safety via the boundedness of the BaSs. The BaS-based optimal control technique yields safe control actions while maintaining optimality. The safe optimal control solution is approximated using path integrals. To validate the effectiveness of the proposed approach, numerical simulations are conducted on a cooperative UAV team in two different scenarios.
I. INTRODUCTION
Optimal control has achieved remarkable success in both theory and applications [1], [2]. Obtaining optimal control usually requires solving a nonlinear, second-order partial differential equation (PDE), known as the Hamilton-Jacobi-Bellman (HJB) equation. Stochastic optimal control (SOC) problems involve solving the control problem by minimizing expected costs [3]. By applying an exponential transformation to the value function [4], a linear-form HJB PDE is obtained, enabling related research including linearly-solvable optimal control (LSOC) [5] and path-integral control (PIC) [3], [6]. The benefits of LSOC problems include compositionality [7], [8] and the path-integral representation of the optimal control solution. However, solving SOC problems in large-scale systems is challenging due to the curse of dimensionality [9]. To overcome computational challenges, many approximation-based approaches have been developed, such as the path-integral (PI) formulation [10], value function approximation [11], and policy approximation [12]. In [13], a PI approach is used to approximate optimal control actions on multi-agent systems (MASs), and the optimal path distribution is predicted using the graphical model inference approach. A distributed PIC algorithm is proposed in [14], in which a networked MAS is partitioned into multiple subsystems, and local optimal control actions are determined using local observations. However, these approaches seldom consider safety in the problem formulation, which may limit their real-world applications.
*This work is supported by the Air Force Office of Scientific Research (AFOSR) (award #FA9550-21-1-0411) and the National Aeronautics and Space Administration (NASA) (awards #80NSSC22M0070 and #80NSSC17M0051).
1 Lin Song, Pan Zhao, Neng Wan, and Naira Hovakimyan are with the Department of Mechanical Science and Engineering, University of Illinois at Urbana-Champaign, Urbana, IL 61801 USA {linsong2, panzhao2, nengwan2, nhovakim}@illinois.edu
Safety refers to ensuring that a system's states remain within appropriate regions at all times for deterministic systems, or with a high probability for stochastic systems. Reachability analysis is a formal verification approach used to prove safety and performance guarantees for dynamical systems [15], [16]. Hamilton-Jacobi (HJ) reachability analysis identifies the initial states that the system needs to avoid as well as the associated optimal control for the sake of remaining safe [17]. However, computing the reachable set in reachability analysis is typically expensive, making it challenging to apply to multi-agent and high-dimensional systems. To enable safe optimal control, safety metrics can be incorporated into the optimal control framework, either as objectives or constraints. In [18], temporal logic specifications are used as constraints for safety enforcement in optimal control development. The control barrier function (CBF) is a potent tool that can be used to enforce system safety by solving optimal control with constraints in a minimally invasive fashion [19]. CBF-based methods have also been extended to stochastic systems with high-probability guarantees [20]–[22]. A multi-agent CBF framework that generates collision-free controllers is discussed in [23], [24]. Furthermore, guaranteed safety-constraint satisfaction in the network system is achieved in [25] under a valid assume-guarantee contract, with CBFs implemented onto subsystems. However, implementing CBFs as safety filters on the optimal control inputs may compromise optimality, and such filters are typically reactive to the given constraints. Additionally, the feasibility of the quadratic programming (QP) problem introduced by CBF-based methods was not always guaranteed until the recent work in [26]. The barrier state (BaS) method is a novel methodology studied in [27], where the stability analysis of a BaS-augmented system encodes both stabilization and safety of the original system, and thus potential conflicts between control objectives and safety enforcement are avoided. In [28], discrete BaS (DBaS) is employed with differential dynamic programming (DDP) in trajectory optimization, and it has been shown that bounded DBaS implies the generation of safe trajectories. The DBaSs have also been integrated into importance sampling to improve sample efficiency in safety-constrained sampling-based control problems in [29].
arXiv:2210.03855v2 [eess.SY] 3 Apr 2023
Compared to CBF-based methods that solve constrained optimization problems to determine certified-safe control actions, BaS-based safe control formulates the problem without explicit constraints; the safety notion is embedded in the solution boundedness, which prevents potential conflicts between control performance and safety requirements. However, the methodology of addressing safety issues without sacrificing optimality in networked MASs remains an open problem. In this paper, we propose a safety-embedded SOC framework for networked MASs using the BaSs proposed in [27]. We adopt the MAS framework considered in [30], [31], where each agent computes optimal control based on local observations. However, [31] does not consider system safety, while [30] formulates safety concerns in the CBF framework and is potentially subject to the aforementioned issues. To address the safety-guarantee deficiency in optimal controls, we augment the dynamics of the central agent in each subsystem with BaSs that embed safety constraints and formulate the optimal control problem using the augmented dynamics. Bounded solutions to the revised optimal control problem then ensure safety due to the characteristics of BaSs.
The rest of the paper is structured as follows: Section II introduces the preliminaries of formulating SOC problems and constructing BaS; Section III presents the safety-embedded SOC framework on MASs, along with the path integral formulation to approximate the control solution; and Section IV provides numerical simulations in two scenarios to validate the effectiveness of the proposed approach. Finally, Section V concludes the paper and discusses future research directions.
Several notations used in this paper are defined as follows: $|S|$ denotes the cardinality of set $S$, $\det(X)$ denotes the determinant of matrix $X$, $\mathrm{tr}(X)$ denotes the trace of matrix $X$, $\nabla_x V$ and $\nabla^2_{xx} V$ refer to the gradient and Hessian matrix of scalar-valued function $V$, and $\|v\|^2_M := v^\top M v$ denotes the weighted square norm.
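As a small numerical illustration of the weighted square norm defined above, the following sketch evaluates $v^\top M v$ with NumPy. The function name `weighted_sq_norm` and the example values are assumptions made for illustration only; they do not appear in the paper.

```python
import numpy as np


def weighted_sq_norm(v: np.ndarray, M: np.ndarray) -> float:
    """Weighted square norm ||v||_M^2 := v^T M v."""
    return float(v @ M @ v)


# Illustrative values only (hypothetical, not taken from the paper).
v = np.array([1.0, 2.0])
M = np.diag([2.0, 3.0])
print(weighted_sq_norm(v, M))  # 1*2*1 + 2*3*2 = 14.0
```

The same quadratic form, scaled by one half and weighted by a positive definite matrix, is the kind of term that appears as a control penalty in SOC running costs.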
II. PRELIMINARIES AND PROBLEM FORMULATION
A. Stochastic optimal control problems
1) MASs and factorial subsystems: We consider a MAS with $N$ homogeneous agents indexed by $\{1, 2, \ldots, N\}$. To describe the networked MAS, we use a connected and undirected graph $\mathcal{G} = \{\mathcal{V}, \mathcal{E}\}$, where vertex $v_i \in \mathcal{V}$ represents agent $i$, and undirected edge $(v_i, v_j) \in \mathcal{E}$ indicates that agents $i$ and $j$ can communicate with each other. We define the index set of all agents neighboring agent $i$ as $\mathcal{N}_i$, and factorize the networked MAS into multiple subsystems $\bar{\mathcal{N}}_i = \mathcal{N}_i \cup \{i\}$, where each factorial subsystem consists of a central agent and all its neighboring agents. Figure 1 provides an illustrative example of the factorization scheme, where $\bar{x}_i$ and $\bar{u}_i$ denote the joint states and joint control actions of factorial subsystem $\bar{\mathcal{N}}_i$. The local control action $u_j$ is determined by minimizing a joint cost function on subsystem $\bar{\mathcal{N}}_j$, which depends on the local observation $\bar{x}_j$. The cost of computing optimal control actions and of sampling is therefore related to the size of each subsystem, rather than the entire network, which reduces computational complexity. For more discussions on the distributed control for LSOC problems on MASs, interested readers can refer to [14].
Fig. 1: MAS $\mathcal{G}$ and factorial subsystems $\bar{\mathcal{N}}_1$ and $\bar{\mathcal{N}}_3$.
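The factorization described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function name `factorial_subsystems`, the edge-list input format, and the 4-agent line graph are hypothetical and are not the graph of Fig. 1.

```python
from typing import Dict, List, Set, Tuple


def factorial_subsystems(
    num_agents: int, edges: List[Tuple[int, int]]
) -> Dict[int, List[int]]:
    """Return, for each agent i, the index list of the subsystem N_i ∪ {i}.

    The central agent i is listed first, followed by its neighbors in
    ascending order. Agents are indexed 1..num_agents as in the text.
    """
    neighbors: Dict[int, Set[int]] = {i: set() for i in range(1, num_agents + 1)}
    for a, b in edges:
        if a == b:
            continue  # ignore self-loops; an agent is not its own neighbor
        neighbors[a].add(b)  # undirected edge: record both directions
        neighbors[b].add(a)
    return {i: [i] + sorted(neighbors[i]) for i in neighbors}


# Hypothetical 4-agent line graph 1 - 2 - 3 - 4 (not the graph of Fig. 1).
subsystems = factorial_subsystems(4, [(1, 2), (2, 3), (3, 4)])
print(subsystems)
```

In this example no subsystem contains more than three agents, so each local problem involves at most three agents' states even though the network has four, which is the source of the complexity reduction mentioned in the text.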
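2) Stochastic optimal control of MASs: We use the Itô diffusion process to describe the joint dynamics of subsystem $\bar{\mathcal{N}}_i$ in a networked MAS consisting of $N$ homogeneous agents governed by mutually independent passive dynamics. The process is represented by the following equation:
$$d\bar{x}_i = \bar{g}_i(\bar{x}_i, t)\,dt + \bar{B}_i(\bar{x}_i)\left[\bar{u}_i(\bar{x}_i, t)\,dt + \bar{\sigma}_i\,d\bar{\omega}_i\right], \tag{1}$$
where $\bar{x}_i = [x_i^\top, x_{j \in \mathcal{N}_i}^\top]^\top \in \mathbb{R}^{M \cdot |\bar{\mathcal{N}}_i|}$ is the joint state vector and $M$ represents the state dimension of each individual agent; $\bar{g}_i(\bar{x}_i, t) = [g_i(x_i, t)^\top, g_{j \in \mathcal{N}_i}(x_j, t)^\top]^\top \in \mathbb{R}^{M \cdot |\bar{\mathcal{N}}_i|}$ represents the joint passive dynamics, which includes the passive dynamics of the individual agent $i$ and its neighbors $j \in \mathcal{N}_i$; $\bar{B}_i(\bar{x}_i) = \mathrm{diag}\{B_i(x_i), B_{j \in \mathcal{N}_i}(x_j)\} \in \mathbb{R}^{M \cdot |\bar{\mathcal{N}}_i| \times P \cdot |\bar{\mathcal{N}}_i|}$ is the joint control matrix; $\bar{u}_i(\bar{x}_i, t) = [u_i(\bar{x}_i, t)^\top, u_{j \in \mathcal{N}_i}(\bar{x}_i, t)^\top]^\top \in \mathbb{R}^{P \cdot |\bar{\mathcal{N}}_i|}$ is the joint control action vector; and $\bar{\omega}_i = [\omega_i^\top, \omega_{j \in \mathcal{N}_i}^\top]^\top \in \mathbb{R}^{P \cdot |\bar{\mathcal{N}}_i|}$ is the joint noise vector with covariance matrix $\bar{\sigma}_i = \mathrm{diag}\{\sigma_i, \sigma_{j \in \mathcal{N}_i}\} \in \mathbb{R}^{P \cdot |\bar{\mathcal{N}}_i| \times P \cdot |\bar{\mathcal{N}}_i|}$. To ensure the uniqueness of the solution, we assume that $\bar{g}_i$, $\bar{B}_i$, and $\bar{\sigma}_i$ are locally Lipschitz continuous.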
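To make the structure of the joint dynamics (1) concrete, the following sketch simulates one subsystem with an Euler-Maruyama discretization. Everything specific here is an assumption for illustration: the double-integrator passive dynamics, the dimensions (M = 2, P = 1, two agents), the noise level, the zero control input, and the names `euler_maruyama_step`, `g_joint`, and `B_joint` are hypothetical and are not taken from the paper.

```python
import numpy as np


def euler_maruyama_step(x, u, g, B, sigma, t, dt, rng):
    """One Euler-Maruyama step of dx = g(x, t) dt + B(x) [u dt + sigma dw]."""
    # Brownian increment: independent N(0, dt) entries, one per noise channel.
    dw = rng.normal(0.0, np.sqrt(dt), size=sigma.shape[1])
    return x + g(x, t) * dt + B(x) @ (u * dt + sigma @ dw)


# Hypothetical subsystem: n = 2 homogeneous agents, each a double integrator
# with state (position, velocity), so M = 2, and a scalar control, so P = 1.
M, P, n = 2, 1, 2
B_single = np.array([[0.0], [1.0]])  # control and noise enter the velocity


def g_joint(x, t):
    """Stacked passive dynamics: position rate = velocity, velocity rate = 0."""
    xs = x.reshape(n, M)
    return np.column_stack([xs[:, 1], np.zeros(n)]).reshape(-1)


def B_joint(x):
    """Block-diagonal joint control matrix of shape (M*n, P*n)."""
    return np.kron(np.eye(n), B_single)


sigma_joint = 0.1 * np.eye(P * n)  # block-diagonal joint noise matrix
rng = np.random.default_rng(0)
dt = 0.01
x = np.array([0.0, 1.0, 2.0, -1.0])  # [pos_1, vel_1, pos_2, vel_2]
u = np.zeros(P * n)                  # uncontrolled (passive) rollout

for k in range(100):
    x = euler_maruyama_step(x, u, g_joint, B_joint, sigma_joint, k * dt, dt, rng)
print(x)
```

Because the control and the noise both enter through the joint control matrix, only the velocity components are perturbed directly in this example; the positions are affected through the passive dynamics.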
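We use $\bar{\mathcal{B}}_i$ to denote the set of joint terminal states and $\bar{\mathcal{I}}_i$ to denote the set of joint non-terminal states. The entire allowable joint state space $\bar{\mathcal{S}}_i$ is partitioned into $\bar{\mathcal{I}}_i$ and $\bar{\mathcal{B}}_i$. For $\bar{x}_i \in \bar{\mathcal{I}}_i$, we define the running cost function as
$$c_i(\bar{x}_i, \bar{u}_i) = q_i(\bar{x}_i) + \frac{1}{2}\,\bar{u}_i(\bar{x}_i, t)^\top \bar{R}_i\, \bar{u}_i(\bar{x}_i, t),$$
where $q_i(\bar{x}_i) \in \mathbb{R}_{\geq 0}$ is a joint state cost, and $\bar{u}_i(\bar{x}_i, t)^\top \bar{R}_i\, \bar{u}_i(\bar{x}_i, t)$ is a control-quadratic cost with positive definite matrix $\bar{R}_i \in \mathbb{R}^{P \cdot |\bar{\mathcal{N}}_i| \times P \cdot |\bar{\mathcal{N}}_i|}$. When $\bar{x}_i^{t_f} \in \bar{\mathcal{B}}_i$, the terminal cost function is denoted by $\phi_i(\bar{x}_i^{t_f})$, where $t_f$ is the final time. We also have the terminal cost function $\phi_i(x_i^{t_f})$ defined for $x_i^{t_f} \in \mathcal{B}_i$. In the first exit formulation, $t_f$ is determined online as the first time a joint state $\bar{x} \in \bar{\mathcal{B}}_i$ is reached. The cost-to-go function $J^{\bar{u}_i}(\bar{x}_i^t, t)$ under joint control action $\bar{u}_i$ is defined as
$$J^{\bar{u}_i}(\bar{x}_i^t, t) = \mathbb{E}^{\bar{u}_i}_{\bar{x}_i^t, t}\left[\phi_i(\bar{x}_i^{t_f}) + \int_t^{t_f} c_i(\bar{x}_i(\tau), \bar{u}_i(\tau))\,d\tau\right], \tag{2}$$
where the expectation is taken with respect to the probability measure under which $\bar{x}_i$ satisfies (1) under given joint control $\bar{u}_i$ starting from the initial condition $\bar{x}_i^t$. The optimal cost-to-go function (or value function) is formulated as
$$V_i(\bar{x}_i^t, t) = \min_{\bar{u}_i} J^{\bar{u}_i}(\bar{x}_i^t, t),$$
which is the minimum expected cumulative running cost starting from joint state $\bar{x}_i^t$. For the sake of brevity, we use the notation $\bar{x}_i$ to represent $\bar{x}_i(t)$ and $\bar{x}_i^t$ in the following context.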
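The cost-to-go (2) under a fixed control law can be approximated by averaging simulated rollouts. The sketch below is a plain Monte Carlo estimate in the first-exit setting, not the path-integral solution developed in the paper. The function name `cost_to_go_mc`, the scalar example system, the feedback policy, the goal region, and the step cap `max_steps` are all assumptions made for illustration.

```python
import numpy as np


def cost_to_go_mc(x0, policy, g, B, sigma, q, R, phi, is_terminal,
                  dt=0.01, max_steps=2000, num_samples=500, seed=0):
    """Monte Carlo estimate of the cost-to-go (2) for a fixed policy.

    Each rollout accumulates q(x) + 0.5 u^T R u until the state first enters
    the terminal set, then adds the terminal cost phi. As a simplification,
    rollouts that hit max_steps are truncated and also charged phi there.
    """
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(num_samples):
        x, t, cost = np.array(x0, dtype=float), 0.0, 0.0
        for _ in range(max_steps):
            if is_terminal(x):
                break  # first exit: stop as soon as the terminal set is reached
            u = policy(x, t)
            cost += (q(x) + 0.5 * (u @ R @ u)) * dt  # running cost c_i
            dw = rng.normal(0.0, np.sqrt(dt), size=sigma.shape[1])
            x = x + g(x, t) * dt + B(x) @ (u * dt + sigma @ dw)  # dynamics (1)
            t += dt
        total += cost + phi(x)
    return total / num_samples


# Hypothetical scalar example: dx = u dt + 0.3 dw, terminal set |x| <= 0.05,
# quadratic state cost, zero terminal cost, linear feedback u = -2x.
one = np.ones((1, 1))
J = cost_to_go_mc(
    x0=[1.0],
    policy=lambda x, t: -2.0 * x,
    g=lambda x, t: np.zeros(1),
    B=lambda x: one,
    sigma=0.3 * one,
    q=lambda x: float(x @ x),
    R=one,
    phi=lambda x: 0.0,
    is_terminal=lambda x: abs(x[0]) <= 0.05,
    num_samples=200,
)
print(f"estimated cost-to-go from x0 = 1: {J:.3f}")
```

The value function is the minimum of this quantity over control laws, so an estimate for any particular policy only gives an upper bound on it, up to sampling and discretization error.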
Facilitated by the exponential transformation of the value function, the optimal control action for the stochastic system (1) can be expressed in a linear form. The linear-form optimal control solution was initially proposed for a single-agent system in [32], and later extended to a multi-agent scenario in [31]. Here, we summarize the main results. The desirability function $Z(\bar{x}_i, t) = \exp[-V_i(\bar{x}_i, t)/\lambda_i]$ is