Two-Player Reconnaissance Game with Half-Planar Target and Retreat Regions Yoonjae Lee Efstathios Bakolas

Abstract: This paper is concerned with the reconnaissance
game that involves two mobile agents: the Intruder and the
Defender. The Intruder is tasked to reconnoiter a territory of
interest (target region) and then return to a safe zone (retreat
region), where the two regions are disjoint half-planes, while
being chased by the faster Defender. This paper focuses on the
scenario where the Defender is not guaranteed to capture the
Intruder before the latter agent reaches the retreat region. The
goal of the Intruder is to minimize its distance to the target
region, whereas the Defender’s goal is to maximize the same
distance. The game is decomposed into two phases based on
the Intruder’s myopic goal. The complete solution of the game
corresponding to each phase, namely the Value function and
state-feedback equilibrium strategies, is developed in closed-
form using differential game methods. Numerical simulation
results are presented to showcase the efficacy of our solutions.
I. INTRODUCTION
Pursuit-evasion and target defense games have received a
significant amount of attention due to their close connection
with a wide range of applications in, for example, aerospace,
military, and robotics. Not many attempts, however, have
yet been made to address hybrid problems that possess
both pursuit-evasion and target defense aspects, such as
aerial reconnaissance and coast guarding problems. In these
problems, the goal of an intruding agent is not necessarily to
attack or reach a target, but to perform reconnaissance tasks
in the vicinity of it and then escape to a safe zone before
being neutralized by its opponent. This paper shows how to
formulate this type of problem as a two-stage differential
game and how to find equilibrium (i.e., worst-case) control
policies for the agents involved.
Literature review: The study of pursuit-evasion games was
initiated by Isaacs in his seminal work [1]. Recent advances
in the field of pursuit-evasion games are summarized in [2].
One remarkable variant of pursuit-evasion games is the so-
called target guarding game, the two-player version of which
was first discussed in [1]. For the past few years, target
guarding games with a stationary target or territory have
been extensively studied. The related work in the literature
can be categorized based on the shape of the target under
consideration, which can be a point [3], [4], a line [5]–
[7], a circle [8], or an arbitrary convex set [9]–[12]. One
class of target guarding games that has recently attracted
the attention of many researchers is the so-called perimeter-
defense game [13], [14], which involves a defender(s) whose
state is constrained along the boundary or perimeter of a
target. The turret-defense game, which is a special case
that has a circular target, has recently been studied in [15].
Y. Lee (PhD student) and E. Bakolas (Associate Professor) are with the
Department of Aerospace Engineering and Engineering Mechanics, The
University of Texas at Austin, Austin, Texas 78712-1221, USA, Emails:
yol033@utexas.edu; bakolas@austin.utexas.edu
Another remarkable class of target guarding games is the
active target defense game, which involves a mobile target
[16] or a maneuverable target (i.e., an evader) [17]–[19].
In target guarding games, an intruder aims to reach a
target or, if capture is unavoidable, to approach the target as
closely as possible before the time of capture. This problem
formulation is relevant to the problem considered in this
paper, namely the reconnaissance game. Similar to target
guarding games, the reconnaissance game involves an intruder
whose goal is to minimize its distance to the target.
The difference, however, is that the intruder in the latter game
has an additional goal which is to enter a safe zone before
being captured by the defender. The fact that the intruder
has dual goals makes the game difficult to analyze with the
classical differential game approach. To the best of our knowledge,
[20] was the first work that studied an open-loop solution of
the reconnaissance game with two agents and a point target
using the method of game decomposition. This method has
since been revisited for many other problems that have more
than one stage, such as the fixed-course target observation
problem [21], the engage or retreat game [22], and the
capture-the-flag game [23], [24]. Note that the key difference
between the capture-the-flag game and the reconnaissance
game is that the latter game always terminates with the
intruder entering the safe zone, whereas the former game
has no such hard terminal constraint.
Statement of contributions: The main contributions of this
paper are as follows. Compared to [20] in which only the
point target case was considered and an open-loop solution
was developed in part based on iterative search methods,
we develop the complete solution for the reconnaissance
game involving two agents and half-planar target and retreat
regions completely analytically. In particular, our method
yields the solution to the game, namely the Value function
and state-feedback equilibrium strategies of the game, in
closed-form instead of relying on algorithmic or numerical
methods as in [20]. Furthermore, we rigorously verify the
validity of our solution using the Hamilton-Jacobi-Isaacs
(HJI) equation, a step that is absent from [20].
Lastly, using the analytical expression of the Value function,
we show how to divide the state space of the game into the
winning sets of each agent and also how to characterize the
barriers that demarcate these winning sets.
Outline: The rest of the paper is structured as follows.
In Section II, the reconnaissance game is formulated and
decomposed into two phases. In Sections III and IV, the
solution of the game corresponding to each phase is developed
based on differential game methods. In Section
V, simulation results are presented. Finally, in Section VI,
concluding remarks are provided.
arXiv:2210.01364v2 [eess.SY] 21 Mar 2023
II. PROBLEM FORMULATION
In this section, the two-player reconnaissance game that
takes place in $\mathbb{R}^2$ is formulated. The game involves two
mobile agents, the Intruder (I) and the Defender (D), whose
equations of motion are given as
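$$\dot{x}_I = \cos\phi, \quad x_I(0) = x_I^0, \tag{1}$$
$$\dot{y}_I = \sin\phi, \quad y_I(0) = y_I^0, \tag{2}$$
$$\dot{x}_D = \alpha\cos\psi, \quad x_D(0) = x_D^0, \tag{3}$$
$$\dot{y}_D = \alpha\sin\psi, \quad y_D(0) = y_D^0, \tag{4}$$
where $[x_I, y_I]^\top \in \mathbb{R}^2$, $[x_I^0, y_I^0]^\top \in \mathbb{R}^2 \setminus (\mathcal{T} \cup \mathcal{R})$, and
$\phi \in [-\pi, \pi]$ (resp., $[x_D, y_D]^\top \in \mathbb{R}^2$, $[x_D^0, y_D^0]^\top \in \mathbb{R}^2 \setminus \mathcal{R}$, and
$\psi \in [-\pi, \pi]$) denote the state/position, initial state/position,
and control input/heading angle of the Intruder (resp., Defender),
respectively; note that the sets $\mathcal{T}$ and $\mathcal{R}$ will be
defined shortly. The scalar $\alpha$ denotes the speed of the
Defender or the speed ratio, where $\alpha > 1$ (in other words,
the Defender is faster than the Intruder). The dynamics of
the game can be written as
$$\dot{\mathbf{x}} = f(\mathbf{x}, \phi, \psi), \quad \mathbf{x}(0) = \mathbf{x}_0, \tag{5}$$
where $\mathbf{x} = [x_I, y_I, x_D, y_D]^\top \in \mathbb{R}^4$ is the game state,
$\mathbf{x}_0 = [x_I^0, y_I^0, x_D^0, y_D^0]^\top \in \mathbb{R}^4$ is the initial game state, and
$f : \mathbb{R}^4 \times [-\pi, \pi] \times [-\pi, \pi] \to \mathbb{R}^4$ is the (continuously differentiable)
vector field of the game dynamics.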
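The dynamics (1)-(5) can be sketched numerically as follows. This is only a minimal illustration, not part of the original paper: the function names `vector_field` and `euler_step`, the tuple ordering of the state, and the use of a forward-Euler step are all assumptions made here for the example.

```python
import math


def vector_field(phi, psi, alpha):
    """Right-hand side of (5) for the state [x_I, y_I, x_D, y_D].

    The Intruder moves with unit speed along heading phi and the Defender
    with speed alpha along heading psi; the rates do not depend on position.
    """
    if alpha <= 1.0:
        raise ValueError("the formulation assumes a faster Defender (alpha > 1)")
    return (math.cos(phi), math.sin(phi),
            alpha * math.cos(psi), alpha * math.sin(psi))


def euler_step(state, phi, psi, alpha, dt):
    """Advance the game state by dt with both headings held constant."""
    rates = vector_field(phi, psi, alpha)
    return tuple(s + dt * r for s, r in zip(state, rates))


# Example: both agents head straight down (heading -pi/2) for half a time unit.
state = euler_step((0.0, 1.0, 0.0, 2.0), -math.pi / 2, -math.pi / 2, 2.0, 0.5)
```

Because the rates depend only on the headings, a step with constant headings is exact up to floating-point rounding; a feedback strategy would simply recompute `phi` and `psi` from the current state before each step.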
In this game, the Intruder is interested in reconnoitering
(i.e., minimizing its distance to) the target region $\mathcal{T}$:
$$\mathcal{T} = \{[x, y]^\top \in \mathbb{R}^2 : y \geq l\}, \tag{6}$$
where $l > 0$. After performing the reconnaissance task, the
same agent is required to return to the retreat region $\mathcal{R}$:
$$\mathcal{R} = \{[x, y]^\top \in \mathbb{R}^2 : y \leq 0\}. \tag{7}$$
By construction, $\mathcal{T}$ and $\mathcal{R}$ are closed half-spaces in $\mathbb{R}^2$ (i.e.,
half-planes) that are disjoint. The Intruder is said to win the
game if it enters $\mathcal{R}$ or, equivalently, if the game
state reaches the terminal manifold $\mathcal{T}_r$:
$$\mathcal{T}_r = \{\mathbf{x} \in \mathbb{R}^4 : y_I \leq 0\}. \tag{8}$$
The Defender, on the other hand, is considered to win if it
captures (with zero capture radius) the Intruder outside $\mathcal{R}$.
The terminal manifold for capture, $\mathcal{T}_c$, is defined as
$$\mathcal{T}_c = \{\mathbf{x} \in \mathbb{R}^4 \setminus \mathcal{T}_r : (x_I - x_D)^2 + (y_I - y_D)^2 = 0\}. \tag{9}$$
Note from (8) and (9) that the Intruder is considered to win
if it is “captured” on the boundary of $\mathcal{R}$.
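The two terminal conditions can be checked with a few lines of code. The sketch below assumes that the target region is the half-plane y >= l and the retreat region is the half-plane y <= 0, consistent with Fig. 1; the function names and the `tol` parameter are hypothetical and introduced only for illustration (the game itself uses zero capture radius, i.e. `tol = 0`).

```python
import math


def in_target(y, l):
    """Membership test for the target half-plane (assumed to be y >= l)."""
    return y >= l


def in_retreat(y):
    """Membership test for the retreat half-plane (assumed to be y <= 0)."""
    return y <= 0.0


def outcome(state, tol=0.0):
    """Return 'intruder', 'defender', or None if the game is still running.

    Retreat is tested first, so coinciding positions on the boundary of the
    retreat region count as an Intruder win, as noted for (8) and (9).
    """
    x_i, y_i, x_d, y_d = state
    if in_retreat(y_i):
        return "intruder"
    if math.hypot(x_i - x_d, y_i - y_d) <= tol:
        return "defender"
    return None
```

In a sampled simulation a small positive `tol` is usually needed, since two floating-point trajectories rarely coincide exactly.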
Since the game of interest requires that the Intruder
return to $\mathcal{R}$ before capture occurs, we will focus on the
scenario where the game state ends up in $\mathcal{T}_r$. In this case,
the objective of the Intruder is to minimize its distance to $\mathcal{T}$
(at some time instant during the game) as much as possible
in the presence of the Defender that strives to maximize the
same distance. The payoff functional of the reconnaissance
game can thus be defined by
$$J = \min_{t \in [0, t_f]} \operatorname{dist}([x_I(t), y_I(t)]^\top, \mathcal{T}), \tag{10}$$
where $t_f = \inf\{t \geq 0 : \mathbf{x} \in \mathcal{T}_r\}$ is the final time and the
function $\operatorname{dist} : \mathbb{R}^2 \times 2^{\mathbb{R}^2} \to [0, \infty)$ measures the (minimum)
distance between a point and a subset of $\mathbb{R}^2$. The Value of
the game is
$$V(\mathbf{x}_0) = \min_{\phi(\cdot)} \max_{\psi(\cdot)} J, \tag{11}$$
where $\phi(\cdot)$ and $\psi(\cdot)$ denote the state-feedback strategies of
each agent.

[Figure 1, panels (a) Phase I (Approach Phase) and (b) Phase II (Retreat Phase): the Intruder I and the Defender D in the x-y plane, with the target region T, the retreat region R, the offset l and, in panel (b), the distance d.]
Fig. 1: Illustration of the two phases of the reconnaissance game.
During Phase I, the Intruder (red triangle) approaches the target
region $\mathcal{T}$ (gray half-plane) to minimize its distance from the latter
region as much as possible in the presence of the Defender (teal
circle). When the Intruder realizes it should no longer proceed or
otherwise capture may occur before entering the retreat region $\mathcal{R}$
(purple half-plane), the game transitions to Phase II, in which the
Intruder’s goal is to arrive at the point in $\mathcal{R}$ where the distance
between the two agents is at its maximum. Note that the resulting
payoff of the illustrated case is d.
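For a sampled Intruder trajectory, the payoff (10) can be approximated as below. This is a sketch under stated assumptions rather than the authors' procedure: it assumes the target region is the half-plane y >= l, so that the distance of a point (x, y) to it is max(l - y, 0), and that the retreat region is y <= 0. The names `dist_to_target` and `payoff` are hypothetical.

```python
import math


def dist_to_target(y, l):
    """Distance from a point with ordinate y to the half-plane y >= l."""
    return max(l - y, 0.0)


def payoff(intruder_path, l):
    """Approximate (10): the smallest sampled distance to the target region
    up to and including the first sample that lies in the retreat region."""
    best = math.inf
    for _x, y in intruder_path:
        best = min(best, dist_to_target(y, l))
        if y <= 0.0:  # game state has reached the retreat terminal manifold
            return best
    raise ValueError("the sampled path never enters the retreat region")


# Example: the Intruder climbs to y = 2, then retreats to y = 0, with l = 3.
print(payoff([(0.0, 1.0), (0.0, 2.0), (0.0, 1.0), (0.0, 0.0)], 3.0))  # 1.0
```

The function raises an error when the path never reaches the retreat region, because (10) is only defined for plays that end with the Intruder retreating.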
The goal of this paper is to find 1) the winning regions of
each agent and 2) the Value function of the game, $V$, and the
corresponding state-feedback equilibrium strategies, $\phi^\star(\cdot)$
and $\psi^\star(\cdot)$. By definition, the pair of equilibrium strategies
must satisfy the saddle-point condition
$$J(\phi^\star(\cdot), \psi(\cdot); \mathbf{x}_0) \leq J(\phi^\star(\cdot), \psi^\star(\cdot); \mathbf{x}_0) \leq J(\phi(\cdot), \psi^\star(\cdot); \mathbf{x}_0), \tag{12}$$
for all possible $\phi(\cdot)$ and $\psi(\cdot)$.
Due to the non-integral payoff functional in (10), it is
difficult to analyze the game in its current form. For this rea-
son, as suggested in [20], we will decompose the game into
two phases based on the Intruder’s myopic goal: approach or
retreat. See Figure 1 for the illustration of Phase I (approach
phase) and Phase II (retreat phase). In the following sections,
we will solve Phase II first and then Phase I, as the solution
of Phase I depends partially on that of Phase II.
III. PHASE II: RETREAT PHASE
Suppose at some time $t_s \in [0, t_f]$, the Intruder has
arrived at the point closest to $\mathcal{T}$. Simultaneously, the game
transitions from Phase I to Phase II. In the latter phase, in
order to reduce the chance of capture, the Intruder desires
to maximize its distance from the Defender at the moment
it reaches a point in $\mathcal{R}$, which we refer to as a retreat point.
Conversely, the Defender attempts to increase the chance
of capture by minimizing the same distance. The payoff
functional of this phase is therefore defined as
$$J_{II} = \sqrt{(x_I^f - x_D^f)^2 + (y_I^f - y_D^f)^2}, \tag{13}$$
where $\mathbf{x}_f = [x_I^f, y_I^f, x_D^f, y_D^f]^\top$ is the game state at $t = t_f$.
Note that the game defined by the payoff (13) and terminal