II. MOTION PLANNING FOR CAVS IN MIXED TRAFFIC
WITH MODEL PREDICTIVE CONTROL
In this section, we present a game-theoretic MPC formulation for motion planning of a CAV while interacting with an HDV, along with the moving horizon IRL technique to learn the objective weights of the HDV from real-time data.
A. Model Predictive Control for Motion Planning
We consider an interactive driving scenario including a CAV and an HDV whose indices are 1 and 2, respectively. The goal of the MPC motion planner is to generate the trajectory and control actions of CAV–1 while considering the real-time driving behavior of HDV–2. To guarantee that CAV–1 has data of HDV–2's real-time trajectories, we make the following assumption:
Assumption 1: A coordinator is available to collect trajectories of HDV–2 and transmit them to CAV–1 without any significant delay or error during communication.
We formulate the problem in the discrete-time domain, in which the dynamic model of each vehicle $i$ is given by
\[
x_{i,k+1} = f_i(x_{i,k}, u_{i,k}), \tag{1}
\]
where $x_{i,k}$ and $u_{i,k}$, $i = 1, 2$, are the vectors of states and control actions, respectively, at time step $k \in \mathbb{N}$. We utilize the control framework presented in [17], in which the interaction between CAV–1 and HDV–2 is modeled as a simultaneous game, i.e., a game without a leader-follower structure, in which the objective of each vehicle includes its individual objective and a shared objective.
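For illustration, the generic dynamics (1) could be instantiated, for example, by a planar double integrator discretized with forward Euler; the Python sketch below uses this model, with an assumed sampling time, as a stand-in for $f_i$, and the later sketches in this section reuse it. None of these choices are prescribed by the formulation above.

```python
import numpy as np

DT = 0.1  # sampling time [s]; illustrative value, not specified in the text

def f_double_integrator(x, u, dt=DT):
    """One possible instance of the dynamics (1): a planar double integrator.

    State x = [px, py, vx, vy], input u = [ax, ay]. This is only an assumed
    example; the formulation leaves f_i generic.
    """
    px, py, vx, vy = x
    ax, ay = u
    return np.array([px + dt * vx,
                     py + dt * vy,
                     vx + dt * ax,
                     vy + dt * ay])
```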
Let $l_1(x_{1,k+1}, u_{1,k})$ and $l_2(x_{2,k+1}, u_{2,k})$ be the individual objective functions of CAV–1 and HDV–2, respectively, and $l_{12}(x_{12,k+1}, u_{12,k})$, where $x_{12,k+1} = [x_{1,k+1}^\top, x_{2,k+1}^\top]^\top$ and $u_{12,k} = [u_{1,k}^\top, u_{2,k}^\top]^\top$, be the cooperative term at time step $k$. We assume that CAV–1 and HDV–2 share the same cooperative objective, e.g., collision avoidance. Those objective functions are usually designed as weighted sums of some features as follows:
\[
l_i(x_{i,k+1}, u_{i,k}) = \omega_i^\top \phi_i(x_{i,k+1}, u_{i,k}), \quad i = 1, 2, \tag{2}
\]
\[
l_{12}(x_{12,k+1}, u_{12,k}) = \omega_{12}^\top \phi_{12}(x_{12,k+1}, u_{12,k}), \tag{3}
\]
where $\phi_i$ and $\phi_{12}$ are vectors of features and $\omega_i \in \mathcal{W}_i$, $\omega_{12} \in \mathcal{W}_{12}$ are the corresponding vectors of weights, where $\mathcal{W}_i$ and $\mathcal{W}_{12}$ are the sets of feasible values. For ease of notation, for each $i \in \{1, 2\}$ we define $-i$ as the index of the vehicle other than vehicle $i$. We consider that, given any control actions $u_{-i,k}$ of the other vehicle, each vehicle $i$ applies the control actions $u^*_{i,k}$ that minimize the sum of its individual objective and the shared objective, i.e.,
\[
u^*_{i,k} = \arg\min_{u_{i,k}} \; l_i(x_{i,k+1}, u_{i,k}) + l_{12}(x_{12,k+1}, u_{12,k}), \quad \forall u_{-i,k}. \tag{4}
\]
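As a concrete, purely illustrative instance of the weighted objectives (2)–(3) and the best response (4), the sketch below assumes hypothetical features: control effort and deviation from a desired speed for the individual terms, and an inverse squared-distance term for the shared collision-avoidance objective. It reuses the f_double_integrator sketch above; the feature choices, desired speed, and solver are not taken from the text.

```python
import numpy as np
from scipy.optimize import minimize

def phi_individual(x_next, u, v_des=15.0):
    """Hypothetical individual features: control effort and squared speed error."""
    speed = np.hypot(x_next[2], x_next[3])
    return np.array([float(u @ u), (speed - v_des) ** 2])

def phi_shared(x12_next, u12):
    """Hypothetical shared feature: inverse squared inter-vehicle distance.

    u12 is kept only to mirror the signature of phi_12 in (3).
    """
    d2 = float(np.sum((x12_next[:2] - x12_next[4:6]) ** 2))
    return np.array([1.0 / (d2 + 1e-3)])

def best_response(i, x1, x2, u_other, w_i, w_12):
    """Best response (4) of vehicle i to fixed actions u_other of the other vehicle."""
    def cost(u_i):
        u1, u2 = (u_i, u_other) if i == 1 else (u_other, u_i)
        x1n, x2n = f_double_integrator(x1, u1), f_double_integrator(x2, u2)
        x_i_next = x1n if i == 1 else x2n
        x12n = np.concatenate([x1n, x2n])
        # individual objective (2) plus shared objective (3)
        return float(w_i @ phi_individual(x_i_next, u_i)
                     + w_12 @ phi_shared(x12n, np.concatenate([u1, u2])))
    return minimize(cost, np.zeros(2), method="BFGS").x
```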
Next, we formulate an MPC problem with a control horizon of length $H \in \mathbb{N}$. Let $t$ be the current time step and $\mathcal{I}_t = \{t, \ldots, t+H-1\}$ be the set of all time steps in the control horizon at time step $t$. We can recast the simultaneous game between CAV–1 and HDV–2 presented above as a potential game [19], i.e., a game in which all players minimize a single global function called the potential function. In a potential game, a Nash equilibrium can be found by minimizing the potential function. The potential function of this game at each time step $k$ is
\[
l_{\mathrm{pot}}(x_{12,k+1}, u_{12,k}) = \sum_{i=1,2} l_i(x_{i,k+1}, u_{i,k}) + l_{12}(x_{12,k+1}, u_{12,k}). \tag{5}
\]
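One can verify that (5) is an exact potential for the game defined by (4): under the dynamics (1), the state $x_{-i,k+1}$, and hence the term $l_{-i}(x_{-i,k+1}, u_{-i,k})$, does not depend on $u_{i,k}$, so for any fixed $u_{-i,k}$ and any two candidate actions $u_{i,k}$ and $u'_{i,k}$,
\[
l_{\mathrm{pot}}(x_{12,k+1}, u_{12,k}) - l_{\mathrm{pot}}(x'_{12,k+1}, u'_{12,k})
= \big[ l_i(x_{i,k+1}, u_{i,k}) + l_{12}(x_{12,k+1}, u_{12,k}) \big]
- \big[ l_i(x'_{i,k+1}, u'_{i,k}) + l_{12}(x'_{12,k+1}, u'_{12,k}) \big],
\]
where the primed quantities are obtained by replacing $u_{i,k}$ with $u'_{i,k}$ while keeping $u_{-i,k}$ fixed. That is, a unilateral deviation changes the potential by exactly the change in the deviating vehicle's own objective in (4).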
Therefore, we propose utilizing the cumulative sum of the potential function over the control horizon as the objective function in the MPC problem, which can be given by
\[
J_{\mathrm{MPC}} = \sum_{k \in \mathcal{I}_t} l_{\mathrm{pot}}(x_{12,k+1}, u_{12,k}). \tag{6}
\]
Hence, the MPC problem for motion planning of CAV–1 is formulated as follows:
\[
\begin{aligned}
\underset{\{u_{12,k}\}_{k \in \mathcal{I}_t}}{\text{minimize}} \quad & J_{\mathrm{MPC}} && \text{(7a)} \\
\text{subject to:} \quad & \text{(1)}, \; i = 1, 2, && \text{(7b)} \\
& g_j(x_{12,k+1}, u_{12,k}) \le 0, \;\; \forall j \in \mathcal{J}_{\mathrm{ieq}}, && \text{(7c)} \\
& h_j(x_{12,k+1}, u_{12,k}) = 0, \;\; \forall j \in \mathcal{J}_{\mathrm{eq}}, && \text{(7d)}
\end{aligned}
\]
where (7b)–(7d) hold for all $k \in \mathcal{I}_t$. The constraints (7c) and (7d) are inequality and equality constraints, respectively, where $\mathcal{J}_{\mathrm{ieq}}$ and $\mathcal{J}_{\mathrm{eq}}$ are the corresponding sets of indices.
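A minimal single-shooting sketch of problem (7) is given below, again reusing the double-integrator dynamics and the hypothetical features from the earlier sketches. The horizon length, the minimum-distance constraint standing in for (7c), and the use of a general-purpose SLSQP solver are assumptions for illustration; a practical implementation would typically rely on a dedicated NLP solver.

```python
import numpy as np
from scipy.optimize import minimize

H = 10    # control horizon length; illustrative value only
NU = 2    # inputs per vehicle in the double-integrator sketch above

def solve_mpc(x1_t, x2_t, w1, w2, w12):
    """Sketch of (7): minimize the cumulative potential (6) over the horizon.

    The decision vector z stacks u_{12,k} = [u_{1,k}; u_{2,k}] for k in I_t, and
    the dynamics (7b) are enforced by forward simulation (single shooting).
    """
    def rollout(z):
        u = z.reshape(H, 2 * NU)
        x1, x2, J, dists = x1_t, x2_t, 0.0, []
        for k in range(H):
            u1, u2 = u[k, :NU], u[k, NU:]
            x1, x2 = f_double_integrator(x1, u1), f_double_integrator(x2, u2)
            x12 = np.concatenate([x1, x2])
            # per-step potential function (5)
            J += float(w1 @ phi_individual(x1, u1) + w2 @ phi_individual(x2, u2)
                       + w12 @ phi_shared(x12, u[k]))
            dists.append(float(np.linalg.norm(x1[:2] - x2[:2])))
        return J, dists

    # Assumed inequality constraint in place of (7c): keep >= 5 m separation.
    cons = [{"type": "ineq", "fun": lambda z: min(rollout(z)[1]) - 5.0}]
    res = minimize(lambda z: rollout(z)[0], np.zeros(H * 2 * NU),
                   constraints=cons, method="SLSQP")
    return res.x.reshape(H, 2 * NU)   # planned joint control actions over I_t
```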
In the objective function of the MPC problem (7), we assume that the features $\phi_i$, $i = 1, 2$, and $\phi_{12}$ can be pre-defined. If we learn online the weights $\omega_2$ and $\omega_{12}$ that best describe the human driving behavior, then the CAV's objective weights $\omega_1$ can be adapted to achieve the desired performance. The optimal strategy for adapting $\omega_1$ can be derived offline using Bayesian optimization, as presented in Section III.
B. Moving Horizon Inverse Reinforcement Learning
To identify the weights $\omega_2$ and $\omega_{12}$ in the individual objective function of HDV–2 and the shared objective, we utilize the feature-based IRL approach [18], [20], a machine learning technique developed to learn the underlying objective or reward of an agent by observing its behavior. We define the vector of all features and the vector of all corresponding weights in HDV–2's objective function as $f = [\phi_2^\top, \phi_{12}^\top]^\top$ and $\theta = [\omega_2^\top, \omega_{12}^\top]^\top$, respectively. Let $\tilde{f}$ be the vector of average observed feature values computed from data and $\mathbb{E}_p[f]$ be the vector of expected feature values under a given probability distribution $p$ over trajectories. With feature-based IRL, the goal is to learn the weight vector $\theta \in \Omega$, where $\Omega = \mathcal{W}_2 \times \mathcal{W}_{12}$, such that the expected feature values match the observed feature values.
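The sketch below illustrates one common way to realize this feature-matching principle, assuming a maximum-entropy trajectory model $p(r) \propto \exp(-\theta^\top f(r))$ approximated over a set of sampled candidate trajectories; the step size, the box projection onto $\Omega$, and the sampling scheme are illustrative choices, not the specific moving-horizon estimator described next.

```python
import numpy as np

def irl_weight_update(theta, f_observed, candidate_features,
                      step=0.05, theta_bounds=(0.0, 10.0)):
    """One feature-matching update for theta = [omega_2; omega_12].

    f_observed: average observed feature vector (tilde f) computed from HDV data.
    candidate_features: (N, dim) array of feature vectors of sampled candidate
        trajectories, used to approximate E_p[f] under p(r) ~ exp(-theta^T f(r)).
    The softmax model, step size, and box projection are assumed for illustration.
    """
    costs = candidate_features @ theta
    w = np.exp(-(costs - costs.min()))       # unnormalized max-entropy weights
    p = w / w.sum()
    expected_f = p @ candidate_features      # approximation of E_p[f]
    # Gradient ascent on the log-likelihood; at the optimum E_p[f] = tilde f.
    theta_new = theta + step * (expected_f - f_observed)
    return np.clip(theta_new, *theta_bounds)  # project onto a feasible box Omega
```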
In moving horizon IRL, at each time step, we utilize the $L \in \mathbb{N}$ most recent trajectory segments to update the weight estimate, where $L$ is the estimation horizon length. Let $t$ be the current time step and $\mathcal{R}_t = \{r_m\}_{m=1,\ldots,L}$ be the set of $L$ sample trajectory segments collected over the estimation horizon at time $t$, in which $r_m = (x_{12,t-m}, x_{12,t-m+1}, u_{12,t-m})$, for $m = 1, \ldots, L$, is the