ROAD TRAFFIC ESTIMATION AND DISTRIBUTION-BASED ROUTE SELECTION 3
There is a vast body of literature on estimation techniques for the travel time distribution.
Clearly, mean travel times (of a path in the graph, that is) can be derived directly from the
mean travel times of the constituent edges, but a major complication is that this property
does not carry over to the full travel time distribution or to higher moments. As a con-
sequence, one cannot straightforwardly use techniques for per-edge travel time distribution
estimation, such as those discussed in e.g. [19, 20, 38], to develop an estimation procedure for
the path-level travel time distribution. This issue has been resolved in e.g. [22, 30, 36], but
typically at the expense of imposing relatively firm assumptions on the functional form of
the per-edge travel time distributions as well as the underlying correlation structure. In [26]
a generalized Markov chain approach has been proposed that estimates the path-level travel
time distribution incorporating correlations in time and space. In [29] a non-parametric
method is developed that is particularly suited to scenarios in which the travel time dis-
tributions vary over time. Ideally, one would like to have a method of (i) relatively low
computational complexity, that is (ii) robust in the sense that it does not rely on heavy
distributional assumptions, that is (iii) applicable to graphs of any form, also exploiting ev-
ident intrinsic properties (such as the ones discussed in the preceding paragraph), and that
(iv) provides us with an uncertainty quantification of the resulting estimates.
Shortest-path problems have a long history in the operations research and combinatorics
literature, with Dijkstra’s seminal contribution [8] as an important landmark. Various ex-
tensions followed. Without aiming at providing an exhaustive overview, we mention a few
important contributions; for an in-depth account see e.g. [1]. A notable generalization, due
to Bellman and Ford [4, 10], concerns graphs with negative edge weights (assuming, for
obvious reasons, no negative cycle can be reached from the source vertex). The so-called
A?algorithm aims at reducing the subgraph that must be explored [17]. In e.g. [16, 28] the
focus is on networks in which the edges have a time-dependent length. Variants in which
the edge lengths attain random values can be found in for instance [5, 18]. In [3] the focus
is on adapting Dijkstra’s algorithm to the setting of so-called and-or graphs.
We proceed with a more detailed account of our contributions. In our modeling framework we
represent the road network as a graph, but in our estimation approach we use a version of this
graph that is endowed with a higher resolution, i.e., a graph in which each edge is broken up
into multiple smaller pieces, each representing a segment of a road. The idea behind working
with this high-resolution graph is that it allows us to naturally model the statistical similarity
between road pieces that are spatially nearby. Following a Bayesian approach, we construct
an estimator for the joint per-edge travel time distribution, thus also providing us with
an uncertainty quantification of our estimates. The framework used relies on establishing
various limit theorems, making the estimation procedure robust (in the sense that it only
very mildly relies on distributional assumptions). The underlying numerics involve basic
computational algorithms, predominantly standard routines stemming from linear algebra.
Our proposed estimation procedure thus fulfils the desirable properties (i)–(iv). The paper
also includes an extensive set of numerical experiments by which we thoroughly validate our
approach. In addition, we demonstrate the use of the distributional estimates in the context
of data-driven route selection: in a series of examples we determine the optimal route from
a set of given potential routes, and illustrate how this route is affected by the choice of the
utility function, and hence by the driver’s preferences.