Two-Player Reconnaissance Game with Half-Planar Target and Retreat Regions

Yoonjae Lee, Efstathios Bakolas

Abstract: This paper is concerned with the reconnaissance game that involves two mobile agents: the Intruder and the Defender. The Intruder is tasked to reconnoiter a territory of interest (target region) and then return to a safe zone (retreat region), where the two regions are disjoint half-planes, while being chased by the faster Defender. This paper focuses on the scenario where the Defender is not guaranteed to capture the Intruder before the latter agent reaches the retreat region. The goal of the Intruder is to minimize its distance to the target region, whereas the Defender’s goal is to maximize the same distance. The game is decomposed into two phases based on the Intruder’s myopic goal. The complete solution of the game corresponding to each phase, namely the Value function and state-feedback equilibrium strategies, is developed in closed form using differential game methods. Numerical simulation results are presented to showcase the efficacy of our solutions.

I. INTRODUCTION

Pursuit-evasion and target defense games have received a significant amount of attention due to their close connection with a wide range of applications in, for example, aerospace, military, and robotics. Few attempts, however, have been made to address hybrid problems that possess both pursuit-evasion and target defense aspects, such as aerial reconnaissance and coast guarding problems. In these problems, the goal of an intruding agent is not necessarily to attack or reach a target, but to perform reconnaissance tasks in its vicinity and then escape to a safe zone before being neutralized by its opponent. This paper shows how to formulate this type of problem as a two-stage differential game and find equilibrium (i.e., worst-case) control policies for the agents involved.

Literature review: The study of pursuit-evasion games was initiated by Isaacs in his seminal work [1]. Recent advances in the field of pursuit-evasion games are summarized in [2]. One remarkable variant of pursuit-evasion games is the so-called target guarding game, the two-player version of which was first discussed in [1]. For the past few years, target guarding games with a stationary target or territory have been extensively studied. The related work in the literature can be categorized based on the shape of the target under consideration, which can be a point [3], [4], a line [5]–[7], a circle [8], or an arbitrary convex set [9]–[12]. One class of target guarding games that has recently attracted the attention of many researchers is the so-called perimeter-defense game [13], [14], which involves a defender(s) whose state is constrained along the boundary or perimeter of a target. The turret-defense game, which is a special case that has a circular target, has recently been studied in [15]. Another remarkable class of target guarding games is the active target defense game, which involves a mobile target [16] or a maneuverable target (i.e., an evader) [17]–[19].

Y. Lee (PhD student) and E. Bakolas (Associate Professor) are with the Department of Aerospace Engineering and Engineering Mechanics, The University of Texas at Austin, Austin, Texas 78712-1221, USA. Emails: yol033@utexas.edu; bakolas@austin.utexas.edu

In target guarding games, an intruder aims at reaching a target or, if capture is unavoidable, approaching the target as closely as possible before the time of capture. This problem formulation is relevant to the problem considered in this paper, namely the reconnaissance game. Similar to target guarding games, the reconnaissance game involves an intruder whose goal is to minimize its distance to the target. The difference, however, is that the intruder in the latter game has an additional goal, which is to enter a safe zone before being captured by the defender. The fact that the intruder has dual goals makes the game difficult to analyze with the classical differential game approach. To the best of our knowledge, [20] was the first work that studied an open-loop solution of the reconnaissance game with two agents and a point target using the method of game decomposition. This method has later been revisited for many other problems that have more than one stage, such as the fixed-course target observation problem [21], the engage or retreat game [22], and the capture-the-flag game [23], [24]. Note that the key difference between the capture-the-flag game and the reconnaissance game is that the latter game always terminates with the intruder entering the safe zone, whereas the former game has no such hard terminal constraint.

Statement of contributions: The main contributions of this paper are as follows. Compared to [20], in which only the point target case was considered and an open-loop solution was developed in part based on iterative search methods, we develop the complete solution for the reconnaissance game involving two agents and half-planar target and retreat regions completely analytically. In particular, our method yields the solution to the game, namely the Value function and state-feedback equilibrium strategies of the game, in closed form instead of relying on algorithmic or numerical methods as in [20]. Furthermore, we rigorously verify the validity of our solution using the Hamilton-Jacobi-Isaacs (HJI) equation, a step that was also absent in [20]. Lastly, using the analytical expression of the Value function, we show how to divide the state space of the game into the winning sets of each agent and also how to characterize the barriers that demarcate these winning sets.

Outline: The rest of the paper is structured as follows. In Section II, the reconnaissance game is formulated and decomposed into two phases. In Sections III and IV, the solution of the game corresponding to each phase is developed based on differential game methods. In Section V, simulation results are presented. Finally, in Section VI, concluding remarks are provided.

arXiv:2210.01364v2 [eess.SY] 21 Mar 2023

II. PROBLEM FORMULATION

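In this section, the two-player reconnaissance game that takes place in $\mathbb{R}^2$ is formulated. The game involves two mobile agents, the Intruder ($I$) and the Defender ($D$), whose equations of motion are given as

\begin{align}
\dot{x}_I &= \cos\phi, & x_I(0) &= x_I^0, \tag{1} \\
\dot{y}_I &= \sin\phi, & y_I(0) &= y_I^0, \tag{2} \\
\dot{x}_D &= \alpha\cos\psi, & x_D(0) &= x_D^0, \tag{3} \\
\dot{y}_D &= \alpha\sin\psi, & y_D(0) &= y_D^0, \tag{4}
\end{align}

where $[x_I, y_I]^\top \in \mathbb{R}^2$, $[x_I^0, y_I^0]^\top \in \mathbb{R}^2 \setminus (\mathcal{T} \cup \mathcal{R})$, and $\phi \in [-\pi, \pi]$ (resp., $[x_D, y_D]^\top \in \mathbb{R}^2$, $[x_D^0, y_D^0]^\top \in \mathbb{R}^2 \setminus \mathcal{R}$, and $\psi \in [-\pi, \pi]$) denote the state/position, initial state/position, and control input/heading angle of the Intruder (resp., Defender), respectively; note that the sets $\mathcal{T}$ and $\mathcal{R}$ will be defined shortly. The scalar $\alpha$ denotes the speed of the Defender, which is also the speed ratio because the Intruder moves at unit speed in (1) and (2). We take $\alpha > 1$ (in other words, the Defender is faster than the Intruder). The dynamics of the game can be written as

\[
\dot{\mathbf{x}} = \mathbf{f}(\mathbf{x}, \phi, \psi), \quad \mathbf{x}(0) = \mathbf{x}^0, \tag{5}
\]

where $\mathbf{x} = [x_I, y_I, x_D, y_D]^\top \in \mathbb{R}^4$ is the game state, $\mathbf{x}^0 = [x_I^0, y_I^0, x_D^0, y_D^0]^\top \in \mathbb{R}^4$ is the initial game state, and $\mathbf{f} : \mathbb{R}^4 \times [-\pi, \pi] \times [-\pi, \pi] \to \mathbb{R}^4$ is the (continuously differentiable) vector field of the game dynamics.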
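As a rough illustration of the dynamics (1)-(5), the following Python sketch evaluates the vector field and advances the game state by one forward-Euler step. The function names, the tuple layout of the state, and the use of forward Euler with a step size `dt` are assumptions made for this example only; they are not part of the paper.

```python
import math

def dynamics(state, phi, psi, alpha):
    """Vector field f(x, phi, psi) of (5) for the state x = (xI, yI, xD, yD).

    The Intruder moves at unit speed with heading phi; the Defender moves
    at speed alpha with heading psi. The state argument is unused because
    the vector field does not depend on position.
    """
    return (
        math.cos(phi),          # (1): xI_dot
        math.sin(phi),          # (2): yI_dot
        alpha * math.cos(psi),  # (3): xD_dot
        alpha * math.sin(psi),  # (4): yD_dot
    )

def euler_step(state, phi, psi, alpha, dt):
    """One forward-Euler step of (5); the integrator choice is an assumption."""
    derivative = dynamics(state, phi, psi, alpha)
    return tuple(s + dt * d for s, d in zip(state, derivative))
```

For instance, `euler_step((0.0, 1.0, 0.0, 2.0), math.pi / 2, -math.pi / 2, 1.5, 0.1)` moves the Intruder up by 0.1 and the Defender down by 0.15, up to floating-point error.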

In this game, the Intruder is interested in reconnoitering (i.e., minimizing its distance to) the target region $\mathcal{T}$:

\[
\mathcal{T} = \{[x, y]^\top \in \mathbb{R}^2 : y \geq l\}, \tag{6}
\]

where $l > 0$. After performing the reconnaissance task, the same agent is required to return to the retreat region $\mathcal{R}$:

\[
\mathcal{R} = \{[x, y]^\top \in \mathbb{R}^2 : y \leq 0\}. \tag{7}
\]

By construction, $\mathcal{T}$ and $\mathcal{R}$ are closed half-spaces in $\mathbb{R}^2$ (i.e., half-planes) that are disjoint. The Intruder is said to win the game if it enters $\mathcal{R}$ or, equivalently, if the game state reaches the terminal manifold $\mathcal{T}_r$:

\[
\mathcal{T}_r = \{\mathbf{x} \in \mathbb{R}^4 : y_I \leq 0\}. \tag{8}
\]

The Defender, on the other hand, is considered to win if it captures (with zero capture radius) the Intruder outside $\mathcal{R}$. The terminal manifold for capture, $\mathcal{T}_c$, is defined as

\[
\mathcal{T}_c = \{\mathbf{x} \in \mathbb{R}^4 \setminus \mathcal{T}_r : (x_I - x_D)^2 + (y_I - y_D)^2 = 0\}. \tag{9}
\]

Note from (8) and (9) that the Intruder is considered to win if it is “captured” on the boundary of $\mathcal{R}$.
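The two terminal conditions can be checked in a few lines of Python. This is a minimal sketch, assuming the state is stored as a tuple `(xI, yI, xD, yD)`; the function name `classify` and the optional tolerance `tol` are hypothetical additions for numerical use, whereas the paper itself works with zero capture radius (the default here).

```python
def classify(state, tol=0.0):
    """Classify a game state against the terminal manifolds (8) and (9).

    The retreat test comes first because the capture manifold excludes the
    retreat manifold, so a coincident pair of agents on the boundary of the
    retreat region counts as a win for the Intruder.
    """
    xI, yI, xD, yD = state
    if yI <= 0.0:
        return "retreat"   # state lies in the retreat manifold (8)
    if (xI - xD) ** 2 + (yI - yD) ** 2 <= tol ** 2:
        return "capture"   # state lies in the capture manifold (9)
    return "ongoing"       # neither terminal condition holds
```

With the default `tol=0.0`, the capture test reduces to the exact equality in (9); a small positive `tol` is only a convenience for sampled trajectories, where exact coincidence is rarely hit.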

Since the game of interest requires that the Intruder return to $\mathcal{R}$ before capture occurs, we will focus on the scenario where the game state ends up in $\mathcal{T}_r$. In this case, the objective of the Intruder is to minimize its distance to $\mathcal{T}$ (at some time instant during the game) as much as possible in the presence of the Defender, which strives to maximize the same distance. The payoff functional of the reconnaissance game can thus be defined by

\[
J = \min_{t \in [0, t_f]} \operatorname{dist}\big([x_I(t), y_I(t)]^\top, \mathcal{T}\big), \tag{10}
\]

where $t_f = \inf\{t \geq 0 : \mathbf{x} \in \mathcal{T}_r\}$ is the final time and the function $\operatorname{dist} : \mathbb{R}^2 \times 2^{\mathbb{R}^2} \to [0, \infty)$ measures the (minimum) distance between a point and a subset of $\mathbb{R}^2$.

Fig. 1: Illustration of the two phases of the reconnaissance game: (a) Phase I (approach phase) and (b) Phase II (retreat phase). During Phase I, the Intruder (red triangle) approaches the target region $\mathcal{T}$ (gray half-plane) to minimize its distance from the latter region as much as possible in the presence of the Defender (teal circle). When the Intruder realizes that it should no longer proceed, since otherwise capture may occur before it enters the retreat region $\mathcal{R}$ (purple half-plane), the game transitions to Phase II, in which the Intruder’s goal is to arrive at the point in $\mathcal{R}$ where the distance between the two agents is at its maximum. Note that the resulting payoff of the illustrated case is $d$.

The Value of the game is

\[
V(\mathbf{x}^0) = \min_{\phi(\cdot)} \max_{\psi(\cdot)} J, \tag{11}
\]

where $\phi(\cdot)$ and $\psi(\cdot)$ denote the state-feedback strategies of each agent.
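For the half-planar target in (6), the distance from a point $[x, y]^\top$ to $\mathcal{T}$ is simply $\max(l - y, 0)$, so the payoff (10) can be approximated on a sampled trajectory as follows. This is a sketch under assumptions: the trajectory is a list of states `(xI, yI, xD, yD)` sampled in time order, the final time is approximated by the first sample with `yI <= 0`, and the function names are hypothetical.

```python
import math

def dist_to_target(yI, l):
    """Distance from the Intruder to the half-plane T = {y >= l} of (6)."""
    return max(l - yI, 0.0)

def payoff(trajectory, l):
    """Discrete approximation of the payoff (10) along a sampled trajectory.

    The minimum is taken over samples up to and including the first one
    that lies in the retreat manifold (8); later samples are ignored.
    """
    best = math.inf
    for xI, yI, xD, yD in trajectory:
        best = min(best, dist_to_target(yI, l))
        if yI <= 0.0:  # the game has terminated in the retreat region
            break
    return best
```

The early `break` mirrors the role of $t_f$ in (10): whatever the Intruder does after entering $\mathcal{R}$ does not affect the payoff.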

The goal of this paper is to find 1) the winning regions of each agent and 2) the Value function of the game, $V$, and the corresponding state-feedback equilibrium strategies, $\phi^\star(\cdot)$ and $\psi^\star(\cdot)$. By definition, the pair of equilibrium strategies must satisfy the saddle-point condition

\[
J(\phi^\star(\cdot), \psi(\cdot); \mathbf{x}^0) \leq J(\phi^\star(\cdot), \psi^\star(\cdot); \mathbf{x}^0) \leq J(\phi(\cdot), \psi^\star(\cdot); \mathbf{x}^0), \tag{12}
\]

for all possible $\phi(\cdot)$ and $\psi(\cdot)$.

Due to the non-integral payoff functional in (10), it is difficult to analyze the game in its current form. For this reason, as suggested in [20], we will decompose the game into two phases based on the Intruder’s myopic goal: approach or retreat. See Figure 1 for the illustration of Phase I (approach phase) and Phase II (retreat phase). In the following sections, we will solve Phase II first and then Phase I, as the solution of Phase I depends partially on that of Phase II.

III. PHASE II: RETREAT PHASE

Suppose at some time $t_s \in [0, t_f]$, the Intruder has arrived at the point closest to $\mathcal{T}$. Simultaneously, the game transitions from Phase I to Phase II. In the latter phase, in order to reduce the chance of capture, the Intruder desires to maximize its distance from the Defender at the moment it reaches a point in $\mathcal{R}$, which we refer to as a retreat point. Conversely, the Defender attempts to increase the chance of capture by minimizing the same distance. The payoff functional of this phase is therefore defined as

\[
J_{II} = \sqrt{(x_I^f - x_D^f)^2 + (y_I^f - y_D^f)^2}, \tag{13}
\]

where $\mathbf{x}^f = [x_I^f, y_I^f, x_D^f, y_D^f]^\top$ is the game state at $t = t_f$.

Note that the game defined by the payoff (13) and terminal
