Safety Embedded Stochastic Optimal Control of Networked Multi-Agent Systems via Barrier States

Lin Song1, Pan Zhao1, Neng Wan1, and Naira Hovakimyan1

Abstract: This paper presents a novel approach for achieving safe stochastic optimal control in networked multi-agent systems (MASs). The proposed method incorporates barrier states (BaSs) into the system dynamics to embed safety constraints. To accomplish this, the networked MAS is factorized into multiple subsystems, and each one is augmented with BaSs for the central agent. The optimal control law is obtained by solving the joint Hamilton-Jacobi-Bellman (HJB) equation on the augmented subsystem, which guarantees safety via the boundedness of the BaSs. The BaS-based optimal control technique yields safe control actions while maintaining optimality. The safe optimal control solution is approximated using path integrals. To validate the effectiveness of the proposed approach, numerical simulations are conducted on a cooperative UAV team in two different scenarios.

I. INTRODUCTION

Optimal control has achieved remarkable success in both theory and applications [1], [2]. Obtaining optimal control usually requires solving a nonlinear, second-order partial differential equation (PDE), known as the Hamilton-Jacobi-Bellman (HJB) equation. Stochastic optimal control (SOC) problems involve solving the control problem by minimizing expected costs [3]. By applying an exponential transformation to the value function [4], a linear-form HJB PDE is obtained, enabling related research including linearly-solvable optimal control (LSOC) [5] and path-integral control (PIC) [3], [6]. The benefits of LSOC problems include compositionality [7], [8] and the path-integral representation of the optimal control solution. However, solving SOC problems in large-scale systems is challenging due to the curse of dimensionality [9]. To overcome computational challenges, many approximation-based approaches have been developed, such as the path-integral (PI) formulation [10], value function approximation [11], and policy approximation [12]. In [13], a PI approach is used to approximate optimal control actions on multi-agent systems (MASs), and the optimal path distribution is predicted using the graphical model inference approach. A distributed PIC algorithm is proposed in [14], in which a networked MAS is partitioned into multiple subsystems, and local optimal control actions are determined using local observations. However, these approaches seldom consider safety in the problem formulation, which may limit their real-world applications.

*This work is supported by Air Force Office of Scientific Research (AFOSR) (award #FA9550-21-1-0411) and National Aeronautics and Space Administration (NASA) (awards #80NSSC22M0070 and #80NSSC17M0051).

1Lin Song, Pan Zhao, Neng Wan, and Naira Hovakimyan are with the Department of Mechanical Science and Engineering, University of Illinois at Urbana-Champaign, Urbana, IL 61801 USA {linsong2, panzhao2, nengwan2, nhovakim}@illinois.edu

Safety refers to ensuring that a system's states remain within appropriate regions at all times for deterministic systems, or with a high probability for stochastic systems. Reachability analysis is a formal verification approach used to prove safety and performance guarantees for dynamical systems [15], [16]. Hamilton-Jacobi (HJ) reachability analysis identifies the initial states that the system needs to avoid, as well as the associated optimal control for remaining safe [17]. However, computing the reachable set in reachability analysis is typically expensive, making it challenging to apply to multi-agent and high-dimensional systems. To enable safe optimal control, safety metrics can be incorporated into the optimal control framework, either as objectives or constraints. In [18], temporal logic specifications are used as constraints for safety enforcement in optimal control development. The control barrier function (CBF) is a potent tool that can be used to enforce system safety by solving optimal control with constraints in a minimally invasive fashion [19]. CBF-based methods have also been extended to stochastic systems with high-probability guarantees [20]–[22]. A multi-agent CBF framework that generates collision-free controllers is discussed in [23], [24]. Furthermore, guaranteed safety-constraint satisfaction in the network system is achieved in [25] under a valid assume-guarantee contract, with CBFs implemented on subsystems. However, implementing CBFs as safety filters on the optimal control inputs may hinder ultimate optimality, and such filters are typically reactive to the given constraints. Additionally, the feasibility of the quadratic programming (QP) introduced by CBF-based methods was not always guaranteed until the recent work in [26]. The barrier state (BaS) method is a novel methodology studied in [27], where the stability analysis of a BaS-augmented system encodes both stabilization and safety of the original system, and thus potential conflicts between control objectives and safety enforcement are avoided. In [28], discrete BaS (DBaS) is employed with differential dynamic programming (DDP) in trajectory optimization, and it has been shown that bounded DBaS implies the generation of safe trajectories. The DBaSs have also been integrated into importance sampling to improve sample efficiency in safety-constrained sampling-based control problems in [29].

arXiv:2210.03855v2 [eess.SY] 3 Apr 2023

Compared to CBF-based methods that solve constrained optimization problems to determine certified-safe control actions, BaS-based safe control formulates the problem without explicit constraints; the safety notion is embedded in the solution boundedness, which prevents potential conflicts between control performance and safety requirements. However, the methodology of addressing safety issues without sacrificing optimality in networked MASs remains an open problem. In this paper, we propose a safety-embedded SOC framework for networked MASs using BaSs proposed in [27]. We adopt the MAS framework considered in [30], [31], where each agent computes optimal control based on local observations. However, [31] does not consider system safety, while [30] formulates safety concerns in the CBF framework and is potentially subject to the aforementioned issues. To address the safety-guarantee deficiency in optimal controls, we augment the dynamics of the central agent in each subsystem with BaSs that embed safety constraints and formulate the optimal control problem using the augmented dynamics. Bounded solutions to the revised optimal control problem then ensure safety due to the characteristics of BaSs.

The rest of the paper is structured as follows: Section II introduces the preliminaries of formulating SOC problems and constructing BaS; Section III presents the safety-embedded SOC framework on MASs, along with the path integral formulation to approximate the control solution; and Section IV provides numerical simulations in two scenarios to validate the effectiveness of the proposed approach. Finally, Section V concludes the paper and discusses future research directions.

Several notations used in this paper are defined as follows: $|S|$ denotes the cardinality of set $S$, $\det(X)$ denotes the determinant of matrix $X$, $\operatorname{tr}(X)$ denotes the trace of matrix $X$, $\nabla_x V$ and $\nabla^2_{xx} V$ refer to the gradient and Hessian matrix of scalar-valued function $V$, and $\|v\|_M^2 := v^\top M v$ denotes the weighted square norm.

II. PRELIMINARIES AND PROBLEM FORMULATION

A. Stochastic optimal control problems

1) MASs and factorial subsystems: We consider a MAS with $N$ homogeneous agents indexed by $\{1, 2, \ldots, N\}$. To describe the networked MAS, we use a connected and undirected graph $\mathcal{G} = \{\mathcal{V}, \mathcal{E}\}$, where vertex $v_i \in \mathcal{V}$ represents agent $i$, and undirected edge $(v_i, v_j) \in \mathcal{E}$ indicates that agents $i$ and $j$ can communicate with each other. We define the index set of all agents neighboring agent $i$ as $\mathcal{N}_i$, and factorize the networked MAS into multiple subsystems $\bar{\mathcal{N}}_i = \mathcal{N}_i \cup \{i\}$, where each factorial subsystem consists of a central agent and all its neighboring agents. Figure 1 provides an illustrative example of the factorization scheme, where $\bar{x}_i$ and $\bar{u}_i$ denote the joint states and joint control actions of factorial subsystem $\bar{\mathcal{N}}_i$. The local control action $u_j$ is determined by minimizing a joint cost function on subsystem $\bar{\mathcal{N}}_j$, which depends on the local observation $\bar{x}_j$. The cost of computing optimal control actions and of sampling therefore scales with the size of each subsystem, rather than the entire network, which reduces computational complexity. For more discussion of distributed control for LSOC problems on MASs, interested readers can refer to [14].

Fig. 1: MAS $\mathcal{G}$ and factorial subsystems $\bar{\mathcal{N}}_1$ and $\bar{\mathcal{N}}_3$.
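The factorization into subsystems consisting of a central agent and its neighbors can be illustrated with a minimal Python sketch. The function name `factorial_subsystems` and the four-agent edge list are hypothetical and chosen only for illustration; they are not the topology of Fig. 1 or code from the paper.

```python
from typing import Dict, List, Set, Tuple


def factorial_subsystems(
    num_agents: int, edges: List[Tuple[int, int]]
) -> Dict[int, Set[int]]:
    """Map each agent i to its factorial subsystem: its neighbors plus i itself."""
    neighbors: Dict[int, Set[int]] = {i: set() for i in range(1, num_agents + 1)}
    for i, j in edges:
        # Undirected edge: agents i and j can communicate with each other.
        neighbors[i].add(j)
        neighbors[j].add(i)
    # Each factorial subsystem is the central agent together with its neighbors.
    return {i: nbrs | {i} for i, nbrs in neighbors.items()}


# Hypothetical 4-agent chain network (an assumption for this example).
edges = [(1, 2), (2, 3), (3, 4)]
subsystems = factorial_subsystems(4, edges)
for agent in sorted(subsystems):
    print(agent, sorted(subsystems[agent]))
```

In this toy network the largest subsystem contains three agents, so the joint state handled by any single agent stays smaller than the full four-agent state.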

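2) Stochastic optimal control of MASs: We use the Itô diffusion process to describe the joint dynamics of subsystem $\bar{\mathcal{N}}_i$ in a networked MAS consisting of $N$ homogeneous agents governed by mutually independent passive dynamics. The process is represented by the following equation:

$$d\bar{x}_i = \bar{g}_i(\bar{x}_i, t)\,dt + \bar{B}_i(\bar{x}_i)\left[\bar{u}_i(\bar{x}_i, t)\,dt + \bar{\sigma}_i\,d\bar{\omega}_i\right], \tag{1}$$

where $\bar{x}_i = [x_i^\top, x_{j \in \mathcal{N}_i}^\top]^\top \in \mathbb{R}^{M \cdot |\bar{\mathcal{N}}_i|}$ is the joint state vector and $M$ represents the state dimension of each individual agent, $\bar{g}_i(\bar{x}_i, t) = [g_i(x_i, t)^\top, g_{j \in \mathcal{N}_i}(x_j, t)^\top]^\top \in \mathbb{R}^{M \cdot |\bar{\mathcal{N}}_i|}$ represents the joint passive dynamics, which includes the passive dynamics of the individual agent $i$ and its neighbors $j \in \mathcal{N}_i$. $\bar{B}_i(\bar{x}_i) = \mathrm{diag}\{B_i(x_i), B_{j \in \mathcal{N}_i}(x_j)\} \in \mathbb{R}^{M \cdot |\bar{\mathcal{N}}_i| \times P \cdot |\bar{\mathcal{N}}_i|}$ is the joint control matrix, $\bar{u}_i(\bar{x}_i, t) = [u_i(\bar{x}_i, t)^\top, u_{j \in \mathcal{N}_i}(\bar{x}_i, t)^\top]^\top \in \mathbb{R}^{P \cdot |\bar{\mathcal{N}}_i|}$ is the joint control action vector, and $\bar{\omega}_i = [\omega_i^\top, \omega_{j \in \mathcal{N}_i}^\top]^\top \in \mathbb{R}^{P \cdot |\bar{\mathcal{N}}_i|}$ is the joint noise vector with covariance matrix $\bar{\sigma}_i = \mathrm{diag}\{\sigma_i, \sigma_{j \in \mathcal{N}_i}\} \in \mathbb{R}^{P \cdot |\bar{\mathcal{N}}_i| \times P \cdot |\bar{\mathcal{N}}_i|}$. To ensure the uniqueness of the solution, we assume that $\bar{g}_i$, $\bar{B}_i$, $\bar{\sigma}_i$ are locally Lipschitz continuous.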
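To make the joint dynamics (1) concrete, the following is a minimal Euler-Maruyama simulation sketch. It rests on several simplifying assumptions that are not from the paper: every agent is scalar ($M = P = 1$), the passive dynamics are time-invariant, and all agents share the same noise level. The names `euler_maruyama_step`, `passive_drift`, and `control_gain` are hypothetical. Because the joint control and noise matrices are block-diagonal, the joint update reduces to one update per agent.

```python
import math
import random
from typing import Callable, List


def euler_maruyama_step(
    x: List[float],
    u: List[float],
    g: Callable[[float], float],
    b: Callable[[float], float],
    sigma: float,
    dt: float,
    rng: random.Random,
) -> List[float]:
    """One Euler-Maruyama step of dx = g(x) dt + B(x) [u dt + sigma dw] per agent."""
    x_next = []
    for x_k, u_k in zip(x, u):
        # Brownian increment with zero mean and variance dt.
        dw = rng.gauss(0.0, math.sqrt(dt))
        x_next.append(x_k + g(x_k) * dt + b(x_k) * (u_k * dt + sigma * dw))
    return x_next


def passive_drift(x_k: float) -> float:
    """Hypothetical stable passive dynamics g(x) = -x."""
    return -x_k


def control_gain(x_k: float) -> float:
    """Hypothetical constant control matrix B(x) = 1."""
    return 1.0


rng = random.Random(0)
x_bar = [1.0, -0.5, 2.0]  # joint state of a three-agent subsystem
for _ in range(100):
    # Zero control: the subsystem evolves under passive dynamics and noise only.
    x_bar = euler_maruyama_step(
        x_bar, [0.0, 0.0, 0.0], passive_drift, control_gain, 0.1, 0.01, rng
    )
print(x_bar)
```

Sampling many such uncontrolled trajectories is the basic ingredient of path-integral approximations, although the specific sampling scheme used in the paper is not reproduced here.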

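We use $\bar{\mathcal{B}}_i$ to denote the set of joint terminal states and $\bar{\mathcal{I}}_i$ to denote the set of joint non-terminal states. The entire allowable joint state space $\bar{\mathcal{S}}_i$ is partitioned into $\bar{\mathcal{I}}_i$ and $\bar{\mathcal{B}}_i$. For $\bar{x}_i \in \bar{\mathcal{I}}_i$, we define the running cost function as

$$c_i(\bar{x}_i, \bar{u}_i) = q_i(\bar{x}_i) + \frac{1}{2}\bar{u}_i(\bar{x}_i, t)^\top \bar{R}_i \bar{u}_i(\bar{x}_i, t),$$

where $q_i(\bar{x}_i) \in \mathbb{R}_{\geq 0}$ is a joint state cost, and $\bar{u}_i(\bar{x}_i, t)^\top \bar{R}_i \bar{u}_i(\bar{x}_i, t)$ is a control-quadratic cost with positive definite matrix $\bar{R}_i \in \mathbb{R}^{P \cdot |\bar{\mathcal{N}}_i| \times P \cdot |\bar{\mathcal{N}}_i|}$. When $\bar{x}_i^{t_f} \in \bar{\mathcal{B}}_i$, the terminal cost function is denoted by $\phi_i(\bar{x}_i^{t_f})$, where $t_f$ is the final time. We also have the terminal cost function $\phi_i(x_i^{t_f})$ defined for $x_i^{t_f} \in \mathcal{B}_i$. In the first exit formulation, $t_f$ is determined online as the first time a joint state $\bar{x}_i \in \bar{\mathcal{B}}_i$ is reached. The cost-to-go function $J^{\bar{u}_i}(\bar{x}_i^t, t)$ under joint control action $\bar{u}_i$ is defined as

$$J^{\bar{u}_i}(\bar{x}_i^t, t) = \mathbb{E}^{\bar{u}_i}_{\bar{x}_i^t, t}\left[\phi_i(\bar{x}_i^{t_f}) + \int_t^{t_f} c_i(\bar{x}_i(\tau), \bar{u}_i(\tau))\,d\tau\right], \tag{2}$$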

where the expectation is taken with respect to the probability measure under which $\bar{x}_i$ satisfies (1) under the given joint control $\bar{u}_i$, starting from the initial condition $\bar{x}_i^t$. The optimal cost-to-go function (or value function) is formulated as

$$V_i(\bar{x}_i^t, t) = \min_{\bar{u}_i} J^{\bar{u}_i}(\bar{x}_i^t, t),$$

which is the minimum expected cumulative running cost starting from joint state $\bar{x}_i^t$. For the sake of brevity, we use the notation $\bar{x}_i$ to represent $\bar{x}_i(t)$ and $\bar{x}_i^t$ in the following context.
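The expectation in (2) can be approximated by averaging the accumulated cost over sampled trajectories. The sketch below is a simplified illustration under assumptions that are not from the paper: a fixed finite horizon replaces the first exit time, the passive drift is zero, the control matrix is the identity, every agent is scalar, and the control weight matrix is diagonal. The names `running_cost` and `cost_to_go` are hypothetical.

```python
import math
import random
from typing import Callable, List


def running_cost(
    x: List[float], u: List[float], q: Callable[[List[float]], float], r_diag: List[float]
) -> float:
    """c(x, u) = q(x) + 0.5 * u^T R u with R = diag(r_diag), r_diag > 0."""
    return q(x) + 0.5 * sum(r * u_k * u_k for r, u_k in zip(r_diag, u))


def cost_to_go(
    x0: List[float],
    policy: Callable[[List[float]], List[float]],
    q: Callable[[List[float]], float],
    phi: Callable[[List[float]], float],
    r_diag: List[float],
    sigma: float,
    dt: float,
    steps: int,
    rollouts: int,
    seed: int = 0,
) -> float:
    """Monte Carlo estimate of the cost-to-go over a fixed horizon of steps * dt."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(rollouts):
        x, cost = list(x0), 0.0
        for _ in range(steps):
            u = policy(x)
            # Left Riemann sum of the running-cost integral.
            cost += running_cost(x, u, q, r_diag) * dt
            # Assumed toy dynamics: dx = u dt + sigma dw for each scalar agent.
            x = [
                x_k + u_k * dt + sigma * rng.gauss(0.0, math.sqrt(dt))
                for x_k, u_k in zip(x, u)
            ]
        total += cost + phi(x)  # terminal cost at the final state
    return total / rollouts


def squared_norm(x: List[float]) -> float:
    return sum(x_k * x_k for x_k in x)


# Compare a hypothetical proportional policy against applying no control.
feedback = cost_to_go([1.0, -1.0], lambda x: [-x_k for x_k in x],
                      squared_norm, squared_norm, [1.0, 1.0], 0.2, 0.01, 200, 200)
passive = cost_to_go([1.0, -1.0], lambda x: [0.0 for _ in x],
                     squared_norm, squared_norm, [1.0, 1.0], 0.2, 0.01, 200, 200)
print(feedback, passive)
```

The value function is the minimum of this quantity over all admissible controls; the sketch only evaluates two fixed policies and does not perform that minimization.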

Facilitated by the exponential transformation of the value function, the optimal control action for the stochastic system (1) can be expressed in a linear form. The linear-form optimal control solution was initially proposed for a single-agent system in [32], and later extended to a multi-agent scenario in [31]. Here, we summarize the main results. The desirability function $Z(\bar{x}_i, t) = \exp[-V_i(\bar{x}_i, t)/\lambda_i]$ is
