
target set, and the weighted potential loss as the measurement.
In this work, we focus on a class of Mirror Descent
Non-Equilibrium learning algorithms and elaborate on its
role in the resilience of traffic networks under adversarial
environments. We first establish a high probability bound
on the distance between the output and MWE for generic
Mirror Descent (MD) algorithms without assumptions on the
boundedness of the latency function. This high probability
bound can be transformed into the resilience metrics, showing
that after the attack, a WANES with weighted potential loss
that is sublinear in time can be recovered through learning.
Next, we develop a learning-based resiliency mechanism
based on an MD algorithm as the and two classes of
flow disturbance attacks. We demonstrate the performance
resilience under MD using an evacuation case study to
illustrate the process of learning-based recovery. A schematic
illustration of our non-equilibrium learning approach is
provided in Figure 1.
Outline of this paper.
We briefly discuss the related
works in Section II. In Section III, we introduce the repeated
stochastic congestion game and MWE as the solution concept,
based on which we introduce the notion of Non-Equilibrium
learning and the formalism of resilience. In Section IV, we
establish several finite-time results for the Non-Equilibrium
learning dynamics and elaborate on the numerical experiments
to illustrate the attack and resilience in Section V.
II. RELATED WORK
Our work bridges the gap between the online-learning and
resilience in traffic assignment. Traffic assignment naturally
fits in the online-learning framework, see [
8
], [
9
], [
10
], where
the convergence in Ces
´
aro sense is shown as interests. Vu
et al. in [
11
] improved the rate of Ces
´
aro convergence to
O(1/T 2)and provided a last-iterate convergence guarantee.
The existent studies on the resilience of traffic assignment
have targeted on event-based disruptions, (see, e.g., [
12
]),
the system impedance to misinformation disturbance has
been rarely studied. To the best of our knowledge, the first
informational equilibrium-poisoning concept was proposed by
Pan et al. [
2
], such a phenomenon occurs when the sensors,
GPS devices in the INS are under attacks [13].
III. PROBLEM FORMULATION
A. Preliminary: Mean Wardrop Equilirbium
We are given a traffic network represented as a directed,
finite, and connected graph
G= (V,E)
without self-loops.
The vertices
V
represent road junctions, and the edges
E
represent road segments. The set of distinct origin-destination
(OD) pairs is
W ⊆ V ×V
, indexed by
w
, with cardinality
W
.
Let
P:= Sw∈W Pw
be the set of all directed paths between
origins and destinations, where
Pw⊆ P(E)
is the path set
between the pair w.
We assume that there is a set of infinitesimal players
over
G
, denoted by a measurable space
(X,M, m)
. The
players are non-atomic, i.e.,
m(x) = 0 ∀x∈ X
; they
are split into distinct populations indexed by the OD pairs,
i.e.,
X=Sw∈W Xw
and
XwTXw0=∅,∀w, w0∈ W
.
For each OD pair
w∈ W
, let
mw=m(Xw)
represent
the traffic demand. Let
¯
M=Pw∈W mw
For each player
x∈ Xw
, we assume that their travel path
a∈ Pw
is fixed
right after the path selection. The action profile of all the
players
X
induces an edge flow vector
q∈R|E|
≥0
, where
qe:= RX1{e∈a}m(dx), e ∈ E
, and a path flow vector
µ∈∆ := {(µp)p∈∪w∈W Pw|µp:= RXw
1{a=p}m(dx)}.
We define the edge-path incident matrix of graph
G
as
Λ =
[Λ1|,...,|Λ|W|]∈R|E|×|P|
such that
Λw
e,p =1{e∈p},∀e∈
E, w ∈ W, p ∈ Pw
. Hence the compact form of edge-path
flow relation is q= Λµ.
Let
(Ω,F,P)
be the probability space for the formalism,
le:R≥0×Ω7→ R+
be the cost/latency functions, measuring
the travel delay of the edge
e∈ E
determined by its edge flow
qe
and a state variable
ω∈Ω
that is universal for the entire
traffic network, e.g.,
ω
can represent the weather condition,
road incidents or anything that affects the congestion level. Let
l:R|E|
≥0×Ω7→ R|E|
+
denote the vector-valued latency function.
For an instance
ω∈Ω
, the latency of path
p
is defined as
`p:= Pe∈ple(qe, ω)=Λ>
pl(Λµ, ω)
, which can be seen as
a function of
µ
and
ω
, written as
`p=`p(µ, ω)
. We write
the vector-valued path latency function as
`: ∆ ×Ω7→ R|P|
+
.
Each instance
ω
determines a congestion game, captured by
the tuple
Gω
c= (G,W,X,P, `(·, ω))
. Each path flow profile
µ∈∆
induces a probability measure associated with the
positive random vector `(µ, ·):Ω7→ R|P|
+.
Assumption 1.
For all
e∈ E
, the latency functions
le
are
ω
-measurable, for all
ω∈Ω
,
le
are
L0
-Lipschitz continuous
and differentiable in qewith ∂le(qe, ω)
∂qe
>0for all qe≥0.
Remark.
The continuity assumption reflects the fact that
adding a small amount of traffic does not drastically affect
the travel latency; the monotonicity implies the increments
of traffic does not decrease the latency.
We adopt a stochastic alternative definition of Wardrop
equilibrium. In doing so, we consider a “meta” version of the
congestion game,
Gc= (G,W,X,P,Eω[`(·, ω)])
, with the
utility functions replaced by the expected latency function.
This “meta” congestion game gives rise to a solution concept
corresponding to Definition 1.
Definition 1
(Mean Wardrop Equilibrium [
1
])
.
A path flow
µ∈∆
is said to be a Mean Wardrop Equilibrium (MWE) if
∀w∈ W
,
µp>0
indicates
E[`p]≤E[`p0]
for all
p0∈ Pw
.
The set of all MWE is denoted by
µ∗
. Equivalently,
µ∈µ∗
if and only if the following variational inequality is satisfied:
Eω[hµ−µ0, `(µ, ω)i]≤0,∀µ0∈∆
, where
Eω[·]
is the
expectation operator with respect to ω.
The MWE
µ∗
is in general not a singleton, but a convex set
given the strict monotonicity of latency in Assumption 1. The
seeking of MWE can be cast as a minimization problem of
the expectation of the Stochastic Beckmann Potential (SBP)
[
14
], defined as
φ(µ, ω) = Pe∈E R(Λµ)e
0le(z, ω)dz
, where
(Λµ)e
is the
e
th element of
Λµ
. We refer to the expectation