ROAD TRAFFIC ESTIMATION AND DISTRIBUTION-BASED ROUTE SELECTION RENS KAMPHUIS MICHEL MANDJES AND PAULO SERRA

2025-05-04 0 0 979.54KB 47 页 10玖币

侵权投诉

ROAD TRAFFIC ESTIMATION AND

DISTRIBUTION-BASED ROUTE SELECTION

RENS KAMPHUIS, MICHEL MANDJES, AND PAULO SERRA

Abstract. In route selection problems, the driver’s personal preferences will determine

whether she prefers a route with a travel time that has a relatively low mean and high

variance over one that has relatively high mean and low variance. In practice, however,

such risk aversion issues are often ignored, in that a route is selected based on a single-

criterion Dijkstra-type algorithm. In addition, the routing decision typically does not take

into account the uncertainty in the estimates of the travel time’s mean and variance. This

paper aims at resolving both issues by setting up a framework for travel time estimation.

In our framework, the underlying road network is represented as a graph. Each edge is

subdivided into multiple smaller pieces, so as to naturally model the statistical similarity

between road pieces that are spatially nearby. Relying on a Bayesian approach, we construct

an estimator for the joint per-edge travel time distribution, thus also providing us with

an uncertainty quantiﬁcation of our estimates. Our machinery relies on establishing limit

theorems, making the resulting estimation procedure robust in the sense that it eﬀectively

does not assume any distributional properties. We present an extensive set of numerical

experiments that demonstrate the validity of the estimation procedure and the use of the

distributional estimates in the context of data-driven route selection.

Keywords. Road traﬃc network ◦estimation ◦shortest-path problems ◦route selection

Affiliations. Rens Kamphuis and Michel Mandjes are with the Korteweg-de Vries Insti-

tute for Mathematics, University of Amsterdam, Science Park 904, 1098 XH Amsterdam, the

Netherlands. MM is also with Eurandom, Eindhoven University of Technology, Eindhoven,

the Netherlands, and Amsterdam Business School, Faculty of Economics and Business, Uni-

versity of Amsterdam, Amsterdam, the Netherlands. Their research is partly funded by the

NWO Gravitation project Networks, grant number 024.002.003.

Paulo Serra is with the Department of Mathematics, Vrije Universiteit, De Boelelaan 1111,

1081 HV Amsterdam, the Netherlands.

Date: October 5, 2022.

arXiv:2210.01469v1 [math.OC] 4 Oct 2022

2 RENS KAMPHUIS, MICHEL MANDJES, AND PAULO SERRA

1. Introduction

A central problem drivers in a road network are faced with concerns the choice between mul-

tiple possible routes in order to travel from their current location to some desired destination.

The analysis of such shortest-path problems on a network has a long tradition in operations

research. A typical procedure is to consider the per-edge mean travel times, and to apply a

Dijkstra-type [8] algorithm to ﬁnd the fastest route from origin to destination, i.e., the route

that minimizes the expected travel time. A conceptual drawback of this approach, however,

is that travel times are inherently stochastic. This means that the route that has the shortest

expected travel time could also have a substantial standard deviation – in fact, there may be

a route with a higher expected travel time with virtually no variability. In such a situation

it is up to the driver to make a choice: depending on her personal preferences (in terms of

risk aversion) and the importance of the planned trip, she will choose the best alternative. A

convenient framework facilitating making such decision uses the concept of utility functions

[32]; see also e.g., [25, 33]. Such a utility function could encompass both mean and standard

deviation of the travel time, but in principle any distribution-based quantity. A risk averse

driver could for instance pick the route that minimizes the 95%-quantile of the travel time.

A second conceptual diﬃculty concerns the way statistical uncertainty is dealt with. If one

would aim at identifying the route that optimizes the utility, expressed in terms of a given

distribution-based feature of the travel time, it is implicitly assumed that one knows the

underlying distribution with certainty. In reality, however, the travel times pertaining to

the various routes have to be estimated from historic data, necessarily leaving us with some

amount of uncertainty. Ignoring this uncertainty, the objective would be to ﬁnd the route the

optimizes the chosen utility function. In a framework accounting for parameter uncertainty,

however, the ambition would be to add an uncertainty quantiﬁcation to this claim. In this

context a meaningful statement could be of the type ‘The probability is x% that the travel

time distribution of route A corresponds to a higher utility than the one of route B.’

The main contribution of this paper lies in the development of a broadly applicable framework

for travel time estimation in any road traﬃc network, which is rich enough to also assess the

inherent estimation uncertainty. In addition, the performance of our estimation procedure is

quantiﬁed through a series of numerical experiments, some of them featuring (data-driven)

route selection. Evidently, to determine the optimal route, one has to have a good description

of the current state of the road network in terms of the congestion level. As is commonly done,

we will treat the network as an undirected graph, where the vertices denote intersections

and where the edges connecting these vertices imply the existence of a road between these

intersections. In this case, describing the state of the network amounts to estimating the

joint per-edge travel time distribution. It is clear that one should not assume that on an edge

the level of congestion is evenly spread. Instead, within edges one expects a strong similarity

between the congestion levels of spatially nearby road pieces. Moreover, as drivers typically

slow down when approaching an intersection, it is to be expected that the velocities near

an intersection will be similar for all roads that cross at this intersection but may otherwise

vary along the roads represented by those edges. It is part of our approach to incorporate

these basic features into our estimation procedure, so as to obtain more accurate estimates

of the travel time distribution.

ROAD TRAFFIC ESTIMATION AND DISTRIBUTION-BASED ROUTE SELECTION 3

There is a vast body of literature on estimation techniques for the travel time distribution.

Clearly, mean travel times (of a path in the graph, that is) can be derived directly from the

mean travel times of the constituent edges, but a major complication is that this property

does not carry over to the full travel time distribution or to higher moments. As a con-

sequence, one cannot straightforwardly use techniques for per-edge travel time distribution

estimation, such as those discussed in e.g. [19, 20, 38], to develop an estimation procedure for

the path-level travel time distribution. This issue has been resolved in e.g. [22, 30, 36], but

typically at the expense of imposing relatively ﬁrm assumptions on the functional form of

the per-edge travel time distributions as well as the underlying correlation structure. In [26]

a generalized Markov chain approach has been proposed that estimates the path-level travel

time distribution incorporating correlations in time and space. In [29] a non-parametric

method is developed that is particularly suited to scenarios in which the travel time dis-

tributions vary over time. Ideally, one would like to have a method of (i) relatively low

computational complexity, that is (ii) robust in the sense that it does not rely on heavy

distributional assumptions, that is (iii) applicable to graphs of any form, also exploiting ev-

ident intrinsic properties (such as the ones discussed in the preceding paragraph), and that

(iv) provides us with an uncertainty quantiﬁcation of the resulting estimates.

Shortest-path problems have a long history in the operations research and combinatorics

literature, with Dijkstra’s seminal contribution [8] as an important landmark. Various ex-

tensions followed. Without aiming at providing an exhaustive overview, we mention a few

important contributions; for an in-depth account see e.g. [1]. A notable generalization, due

to Bellman and Ford [4, 10], concerns graphs with negative edge weights (assuming, for

obvious reasons, no negative cycle can be reached from the source vertex). The so-called

A?algorithm aims at reducing the subgraph that must be explored [17]. In e.g. [16, 28] the

focus is on networks in which the edges have a time-dependent length. Variants in which

the edge lengths attain random values can be found in for instance [5, 18]. In [3] the focus

is on adapting Dijkstra’s algorithm to the setting of so-called and-or graphs.

We proceed with a more detailed account of our contributions. In our modeling framework we

represent the road network as a graph, but in our estimation approach we use a version of this

graph that is endowed with a higher resolution, i.e., a graph in which each edge is broken up

into multiple smaller pieces, each representing a segment of a road. The idea behind working

with this high-resolution graph is that it allows us to naturally model the statistical similarity

between road pieces that are spatially nearby. Following a Bayesian approach, we construct

an estimator for the joint per-edge travel time distribution, thus also providing us with

an uncertainty quantiﬁcation of our estimates. The framework used relies on establishing

various limit theorems, making the estimation procedure robust (in the sense that it only

very mildly relies on distributional assumptions). The underlying numerics involve basic

computational algorithms, predominantly standard routines stemming from linear algebra.

Our proposed estimation procedure thus fulﬁls the desirable properties (i)–(iv). The paper

also includes an extensive set of numerical experiments by which we thoroughly validate our

approach. In addition, we demonstrate the use of the distributional estimates in the context

of data-driven route selection: in a series of examples we determine the optimal route from

a set of given potential routes, and illustrate how this route is aﬀected by the choice of the

utility function, and hence by the driver’s preferences.

4 RENS KAMPHUIS, MICHEL MANDJES, AND PAULO SERRA

The remainder of this paper has been organized as follows. Section 2 introduces our nota-

tion and model, and deﬁnes what our dataset is. Then, in Section 3 we detail our inference

procedure, subsequently considering the mean, covariance structure, and a smoothing pa-

rameter λ. As pointed out in Section 4 assumptions on the mean, covariance, and graph

Laplacian need to be imposed to make sure that the procedure of Section 3 is consistent.

Section 5 discusses an extensive set of numerical experiments that have been set up so as

to validate the estimation procedure. Then in Section 6 it is pointed out how our approach

can be applied in the context of route selection. Finally, Section 7 includes a discussion and

concluding remarks. Technical proofs are collected in an appendix.

2. Notation, model, observations

In this section we introduce the road traﬃc network considered, including the notation that

we use throughout this paper. In addition, we provide a model for the data collected from

this network.

2.1. Some notation. In this subsection we introduce the graph representation of our road

network, including its high-resolution version.

2.1.1. Notation for the traﬃc network. We represent the road network by an undirected

graph, consisting of vertices that are connected by edges. This graph is, as usual, denoted by

G= (V, E) with V={v1, . . . , vp}being the set of p=|V| ∈ Nvertices and E={e1, . . . , eq}

the set of q=|E| ∈ Nedges, where | · | denotes the cardinality of the underlying set. For

obvious reasons, we throughout assume that the graph Gis connected. We order the vertices

and edges so that we can also identify each vertex viand edge ejwith their indices iand j,

respectively, with i∈ {1, . . . , p}and j∈ {1, . . . , q}.If this is convenient we sometimes write

V={1, . . . , p}so that E⊆ {{i, j} ∈ V2}, but we also use the notation E={1, . . . , q}.

For an edge e={i, j} ∈ Ewe write va(e) = min{i, j}and vo(e) = max{i, j}for the

corresponding vertices of the edge.

We denote by A∈ {0,1}p×pthe adjacency matrix of the graph G, i.e., a p×pmatrix

whose entries indicate whether the corresponding pair of vertices is adjacent or not. More

concretely, for i, j = 1, . . . , p,

Ai,j =1 if {i, j} ∈ E,

0 otherwise. (1)

We write d={d1, . . . , dp}={dv:v∈V} ∈ Npto represent the degrees of the vertices in

V, so that

di=

j=1

Ai,j,(2)

and deﬁne D= diag{d}. The Laplacian matrix of the graph Gis deﬁned by L=D−A∈

Zp×p. The diagonal of this matrix consists of the vertices’ degrees, and the (i, j)-th non-

diagonal entry is −1 if vertices iand jare adjacent and 0 otherwise. At several occasions we

want to emphasize in the notation the dependence of certain objects on the underlying graph

in the notation. We then write V=V(G), E=E(G), A=A(G), d=d(G), D=D(G)

or D= diag{d(G)}, and L=L(G) = D(G)−A(G).

ROAD TRAFFIC ESTIMATION AND DISTRIBUTION-BASED ROUTE SELECTION 5

For a graph Gwe deﬁne the line graph of Gas another graph ¯

G= ( ¯

V , ¯

E), where each vertex

in ¯

Vnow corresponds to an edge in G, and where two vertices in ¯

Gare connected by an

edge if, and only if, the corresponding edges in Gare incident. Quantities relating to the

line graph of Gare denoted with a bar above the quantities; for instance, in our approach

we intensively make use of the Laplacian matrix of the line graph of Gwhich we denote as

The graph Grepresents a traﬃc network across which particles, to be thought of as cars,

are ﬂowing. In this network, each particle enters the system at some vertex v∈V, follows

a path to some v0∈V, and then leaves the system. Each particle takes a certain amount of

time to cross each edge e∈Eon its path, reﬂecting the current congestion level of that edge.

As pointed out in the introduction it is our objective to infer the travel time distribution

pertaining to a given route. Importantly, we wish to do so without a priori assuming that

all edges have the same level of congestion and that particles traverse edges at a constant

velocity. In addition we wish to work in a framework by which we can naturally model

the statistical similarity between road pieces that are spatially nearby. To facilitate these

requirements it is convenient to subdivide the edges in the traﬃc network into smaller pieces.

For this we consider a higher resolution version of the traﬃc network.

2.1.2. The traﬃc network in higher resolution. We proceed by pointing out how we increase

the resolution of the edges. To this end, for the graph G= (V, E) we consider a collection

r={re:e∈E}={r1, . . . , rq} ∈ Nq

0of user speciﬁed resolution parameters. For each such

collection r, consider the graph Gr= (Vr, Er) where

Vr:= V∪[

i:ei∈Evi,1, . . . , vi,riand Er:= [

i:ei∈Eei,1, . . . , ei,ri+1,(3)

where the edges ei,j in Erare given by, for j= 2, . . . , riand i= 1, . . . , q,

ei,1=va(ei), vi,1, ei,j =vi,j−1, vi,j , ei,ri+1 =vi,ri, vo(ei).(4)

We also denote the vertices of Grby Vr={v1, . . . , vp}∪{vi,j :j= 1, . . . , ri, i = 1, . . . , p}

and its edges by Er={ei,j :j= 1, . . . , ri+ 1, i = 1,...q}. In the graph Grwe deﬁne the

cardinalities

pr:= |Vr|=|V|+X

e∈E

re=p+

i=1

ri,

qr:= |Er|=X

e∈E

(re+ 1) = |E|+X

e∈E

re=q+

i=1

ri.

(5)

We think of the graph Gras a higher resolution version of G:Gris constructed from Gby

replacing each edge e∈Efrom Gby a path graph with re+ 1 new edges connecting the

original vertices va(e) and vo(e) from G. We also assume that ris such that the lengths

of the road segments corresponding to any edge in Gris (approximately) the same; it will

become clear in Section 4.1 why we impose this requirement. Figure 1 provides a conceptual

illustration of the graph Gand its high-resolution version Gr. The higher resolution traﬃc

network Grthus allows us to model the time that a particle takes to traverse each edge ein

Gin more detail by breaking it down into re+ 1 smaller travel times. The number re+ 1

encodes the number of measurements we can collect while the particle moves along e, and

so we can think of it as a resolution parameter.

文档加载中……请稍候！
如果长时间未打开，您也可以点击刷新试试。

下载文档到电脑，查找使用更方便

10 玖币 0人已下载

立即下载

摘要：

ROADTRAFFICESTIMATIONANDDISTRIBUTION-BASEDROUTESELECTIONRENSKAMPHUIS,MICHELMANDJES,ANDPAULOSERRAAbstract.Inrouteselectionproblems,thedriver'spersonalpreferenceswilldeterminewhethersheprefersaroutewithatraveltimethathasarelativelylowmeanandhighvarianceoveronethathasrelativelyhighmeanandlowvariance.In...

展开>> 收起<<

ROAD TRAFFIC ESTIMATION AND DISTRIBUTION-BASED ROUTE SELECTION RENS KAMPHUIS MICHEL MANDJES AND PAULO SERRA.pdf

共47页,预览5页

还剩页未读，继续阅读

声明：本站为文档C2C交易模式，即用户上传的文档直接被用户下载，本站只是中间服务平台，本站所有文档下载所得的收益归上传人(含作者)所有。玖贝云文库仅提供信息存储空间，仅对用户上传内容的表现方式做保护处理，对上载内容本身不做任何修改或编辑。若文档所含内容侵犯了您的版权或隐私，请立即通知玖贝云文库，我们立即给予删除！

ROAD TRAFFIC ESTIMATION AND DISTRIBUTION-BASED ROUTE SELECTION RENS KAMPHUIS MICHEL MANDJES AND PAULO SERRA

相关推荐

开通VIP享超值会员特权

作者详情

相关内容

热门标签

举报选择: