IEEE 1 Encoder-Decoder RNNs for Bus Arrival Time Prediction

Nancy Bhutani§, Soumen Pachal§, and Avinash Achar*§
Abstract—Arrival/Travel times for public transit exhibit vari-
ability due to factors like seasonality, traffic signals, travel
demand fluctuation, etc. The developing world in particular is
plagued by additional factors such as lack of lane discipline,
excess vehicles, diverse modes of transport, and unreliable
schedules. This makes bus arrival time prediction (BATP) a
challenging problem, especially in the developing world. A
data-driven model based on a novel variant of Encoder-Decoder
(ED, also known as Seq2Seq) recurrent neural networks (RNNs) is
proposed for real-time BATP. The model incorporates
spatio-temporal (ST) correlations in a non-linear fashion
distinct from existing approaches. Existing ED approaches for
BATP map time directly to the sequential aspect of ED, while
ignoring crucial data characteristics and making some
restrictive model assumptions. Our approach, in contrast, is
less obvious and tackles these issues. We exploit the geometry
of the dynamic real-time BATP problem to enable a novel fit with
the ED structure. Further, motivated by the need to accurately
model past congestion influences from downstream sections, we
additionally propose a bidirectional layer at the decoder
(something unexplored in other time-series based ED application
contexts). The effectiveness of the proposed architecture is
demonstrated on real field data collected under challenging
traffic conditions, while benchmarking against state-of-the-art
baselines. The proposed architecture is not limited to
transportation; it can also be employed for multi-step
time-series forecasting (for example, sales or demand
forecasting under exogenous inputs like price).
Index Terms—Encoder-Decoder, Nonlinear Predictive Mod-
elling, Recurrent Neural Networks, Travel-Time Prediction.
I. INTRODUCTION
A public transit system is a crucial component of the overall
transport system in cities across the world. A quality system
attracts commuters and can in turn mitigate mounting traffic
volumes and congestion levels, a universal problem in urban
areas. A quality system would entail sticking to well-designed
schedules to the extent feasible while providing quality
predictions in real time. Accurate estimates help commuters
plan their arrival at the bus stop and avoid unnecessary
waiting. Quality bus travel time predictions can also help
commuters decide between taking a bus and some alternate mode
of transport, and can help transit administrators take
corrective action when bus schedules are violated.
* Avinash Achar is the corresponding author.
§All authors contributed equally.
Nancy Bhutani, Soumen Pachal, Avinash Achar are with TCS Research,
Chennai, INDIA. E-mail: {nancy.9,s.pachal,achar.avinash}@tcs.com.
The manuscript was first submitted in Aug 2023 for review.
The BATP literature is more than one and a half decades old,
yet BATP continues to be a challenging research problem, in the
developing world in particular. Contributing factors include
(1) absence of lane discipline and (2) inhomogeneity of traffic
(transport modes can include bicycles, two-wheelers,
four-wheelers, and heavy vehicles like trucks and buses), with
no dedicated lanes for specific modes of transport. We refer to
this as mixed traffic conditions. Another issue is the
unreliability of bus schedules [42], especially in India, where
most cities experience mixed traffic conditions. Bus schedules,
even if available, tend to become outdated due to constant
changes in traffic conditions and infrastructure. This makes
timetables extremely unreliable, leading to unpredictable
waiting times for passengers. Hence providing accurate arrival
time predictions becomes even more important in such cases.
The real data considered in this paper is from a bus route in
Delhi [42], the capital of India, which experiences mixed
traffic conditions. Until recently, a Google Maps bus arrival
time query (in most cities in India, including Delhi) mostly
returned a constant prediction independent of the date or time
of query. These constant estimates seem to be based on
pre-fixed schedules that cannot be followed given the complex
traffic conditions explained above. Further, ETA (expected time
of arrival) solutions are still being continuously improved by
Google at a network level across the world for different modes
of urban transport [1], [2]. On account of the above factors,
BATP continues to be a challenging research problem [3],
especially under mixed traffic conditions [4], [5]. Over the
years, diverse approaches have been proposed for BATP, with
data-driven approaches forming a dominant class.
In most of these approaches, an entire route is segmented into
smaller sections (or segments), either uniformly [6] or into
non-uniform segments connecting consecutive bus stops.
Depending on the method and the installed sensing
infrastructure, the data input can be quantities like speed,
density, flow, or travel time. In this work, we consider
scenarios where input data comes only from travel times
experienced across these sections. AVL (automatic vehicle
location) data captured by GPS sensing can readily provide such
travel times.
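As an illustration of how section travel times might be derived from AVL data, the sketch below differences the times at which a bus crosses consecutive section boundaries. The function name and the sample timestamps are hypothetical, and matching raw GPS pings to section boundaries is assumed to have been done upstream.

```python
from typing import List


def section_travel_times(crossing_times: List[float]) -> List[float]:
    """Travel time on each section, given the times (in seconds) at which
    the bus crossed consecutive section boundaries."""
    travel_times = []
    for start, end in zip(crossing_times, crossing_times[1:]):
        if end < start:
            raise ValueError("crossing times must be non-decreasing")
        travel_times.append(end - start)
    return travel_times


# Hypothetical trip: boundary crossings in seconds since the trip started.
crossings = [0.0, 95.0, 210.0, 260.0, 430.0]
print(section_travel_times(crossings))  # [95.0, 115.0, 50.0, 170.0]
```

A route with N sections yields N travel times from N+1 boundary crossings, which is the only kind of input assumed in the rest of the discussion.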
Gaps and Contributions: Over the years, researchers have
explored a wide spectrum of methods under the broad umbrella of
data-driven approaches. These include ARIMA models [7], linear
statistical models like Kalman filters [4], [8], support vector
machines [9], [10], [11], feed-forward ANNs [12], [2],
recurrent neural networks [13], [14], CNNs [5], temporal
difference learning [15], and so on. Most of the existing
methods suffer from one or more of the following issues:
(i) insufficient utilization of historical data for model
calibration [16], [17], [6]; (ii) ignoring spatial correlations
[8], [18], [10]; (iii) not exploiting temporal correlations
[12], [19]; (iv) not exploiting real-time information enough
[20]; (v) segmenting the time axis into uniform bins, which can
lead to inaccurate predictions [13], [14]. There has also been
some recent work on exploiting spatio-temporal correlations
[4], [11], [5], [13], [14].
arXiv:2210.01655v2 [cs.LG] 31 Aug 2024
From an RNN literature perspective, there has been recent work
exploring ED (also known as Seq2Seq) architectures for
real-valued data [21], [22], time-series (TS) in particular. A
natural way to employ Seq2Seq for BATP would be to segment the
time axis into uniform bins [13], [14] and learn a sequential
model in time. This approach ignores the continuous nature of
the data in the temporal dimension and makes an unrealistic
assumption that section travel times are constant across time
bins (see Remark 2). Our approach addresses this drawback while
also respecting the temporal continuity and space-time
asymmetry in BATP data. Further, the proposed approach is not
obvious, because the structure of the available data differs
between BATP and traditional time-series.
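One way to see what uniform time binning discards is to map section entry times to bin indices. The sketch below assumes 15-minute bins (an illustrative width, not one taken from [13], [14]) and uses hypothetical helpers `time_bin` and `hms`. It reflects one reading of the binning assumption, namely that every entry time in a bin is treated alike.

```python
def time_bin(seconds_since_midnight: float, bin_width_s: float = 900.0) -> int:
    """Index of the uniform time bin containing the given time of day."""
    if bin_width_s <= 0:
        raise ValueError("bin width must be positive")
    return int(seconds_since_midnight // bin_width_s)


def hms(hours: int, minutes: int) -> float:
    """Time of day expressed in seconds since midnight."""
    return hours * 3600.0 + minutes * 60.0


# Hypothetical section entry times: 08:01, 08:14 and 08:16.
entries = [hms(8, 1), hms(8, 14), hms(8, 16)]
print([time_bin(t) for t in entries])  # [32, 32, 33]
```

Entries 13 minutes apart (08:01 and 08:14) share a bin, while entries only 2 minutes apart on either side of a boundary (08:14 and 08:16) do not, which is the kind of temporal discontinuity a binned model introduces.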
The current work proposes a novel ED architecture (different
from the classical machine translation architecture and from
existing ED approaches for time-series), which is also distinct
from all existing BATP approaches, including the ED-based BATP
approaches [13], [14]. It exploits current real-time
spatio-temporal correlations and historical seasonal
correlations for nonlinear modelling and prediction.
Specifically, our contributions are as follows.
• We recast the real-time BATP problem as a novel variant of
the Encoder-Decoder architecture, with real-time
spatio-temporal inputs and historical seasonal inputs carefully
placed in the architecture. The key is recognizing that BATP
inherently involves sequential training data with
variable-length input-output pairs, which the ED framework can
handle. The proposed ED model’s sequential aspect is mapped to
the spatial dimension of BATP, while the temporal aspects of
BATP are captured by feeding the associated inputs as decoder
inputs in a space-synchronized fashion.
• Travel times from the just-traversed sections of the current
bus (constituting the spatial correlations) are fed as inputs
to the encoder, while real-time information from the closest
previous bus’s travel times across subsequent sections
(constituting the temporal correlations) is fed in a
synchronized sequential fashion into the decoder as additional
inputs. Note that these synchronized inputs at the decoder are
absent in the classic ED application for machine translation
[23], [24]. The section travel times of the current bus across
subsequent sections are the prediction targets, which map
sequentially to the decoder outputs. Weekly seasonal
correlations are also incorporated via additional inputs from
the closest trip of the previous week.
• We propose a bidirectional layer at the decoder, as this can
capture (for a given section) the possible upstream influence
of past congestion (in time) from subsequent sections,
propagating backward in space along the bus route. To the best
of our knowledge, this feature of our ED variant is unexplored
in both traditional ED and other time-series based variants of
ED.
• The effectiveness of the proposed approach is illustrated on
actual field data collected from a route in Delhi, where mixed
traffic conditions are very common. The results demonstrate
superior performance of our approach (across sub-routes of
diverse lengths) in comparison to four recent state-of-the-art
baselines.
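The input/output layout described in the contributions above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `build_example` and `bidirectional_scan` are hypothetical, the previous-week value is paired with each decoder step purely as an assumption about where the seasonal input goes, and the recurrence is a toy scalar tanh cell with arbitrary weights rather than a trained recurrent layer.

```python
import math
from typing import Dict, List, Tuple


def build_example(current_trip: List[float], previous_bus: List[float],
                  previous_week: List[float], m: int) -> Dict[str, list]:
    """Split per-section travel times at the current position of the bus,
    which has just traversed the first m sections of the route."""
    n = len(current_trip)
    if len(previous_bus) != n or len(previous_week) != n:
        raise ValueError("all trips must cover the same sections")
    if not 0 < m < n:
        raise ValueError("m must leave at least one section on each side")
    return {
        # Spatial correlations: sections the current bus already traversed.
        "encoder_inputs": current_trip[:m],
        # One decoder step per remaining section, synchronized in space.
        # Attaching the previous-week value to each step is an assumption.
        "decoder_inputs": list(zip(previous_bus[m:], previous_week[m:])),
        # Targets: the current bus's travel times on the remaining sections.
        "targets": current_trip[m:],
    }


def bidirectional_scan(inputs: List[float], w_in: float = 0.5,
                       w_rec: float = 0.3) -> List[Tuple[float, float]]:
    """Toy scalar tanh recurrence run in both directions over the steps."""
    def scan(seq: List[float]) -> List[float]:
        h, states = 0.0, []
        for x in seq:
            h = math.tanh(w_in * x + w_rec * h)
            states.append(h)
        return states

    forward = scan(inputs)
    backward = scan(inputs[::-1])[::-1]  # re-align to section order
    return list(zip(forward, backward))


# Hypothetical travel times (in minutes) on a five-section route.
example = build_example([1.0, 1.2, 0.9, 1.5, 1.1],
                        [1.1, 1.0, 1.0, 1.4, 1.3],
                        [0.9, 1.1, 0.8, 1.6, 1.0], m=2)
print(len(example["encoder_inputs"]), len(example["targets"]))  # 2 3
states = bidirectional_scan([pb for pb, _ in example["decoder_inputs"]])
print(len(states))  # 3
```

Because m changes as the bus advances, the encoder and decoder sequence lengths vary from example to example, which is the variable-length property the ED framework accommodates. In the bidirectional pass, the backward state at a given section depends on inputs from the sections after it, which is how downstream conditions can inform an upstream prediction.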
The rest of the paper is organized as follows. Sec. II
describes the related work in detail from the perspective of
both (i) BATP literature and (ii) ED-based RNN approaches.
Sec. III explains the technical contribution in detail; in
particular, it describes the proposed architecture and
technically motivates how the architecture can be derived to
solve BATP. Sec. IV demonstrates the effectiveness of the
proposed architecture on one route from Delhi by benchmarking
against four carefully chosen state-of-the-art baselines.
Sec. V provides (i) a brief discussion of how our proposed
architecture can also be used in other applications and
(ii) concluding remarks.
II. LITERATURE REVIEW AND RELATED WORK
The BATP literature has seen diversity not only in the range of
techniques used, but also in the kind of data input employed
for prediction. A range of data inputs like speed, travel
times, weather, flow information, crowd-sourced data [26],
scheduled timetables [25], and so on have been considered for
BATP. One can broadly categorize the approaches into two
classes: (i) traffic-theory based and (ii) data-driven. Given
that the proposed approach is data-driven, we restrict
ourselves to reviewing related data-driven approaches. While
Sec. II-A describes data-driven approaches for BATP, Sec. II-B
discusses related ED-based RNN approaches. Finally, Sec. II-C
places the proposed architecture in perspective of all related
work.
A. Data-driven methods for BATP
Unlike the traffic-theory based approaches which model
the physics of the traffic, data-driven methods employ a
coarse model (based on measurable entities) that is sufficient
for predictive purposes. Most approaches learn to estimate
necessary parameters of a suitable predictive model from past
historical data, which is further employed for real-time BATP.
There are a few approaches which employ a data-based model
but do not perform full-fledged learning based on historical
data.
Without Learning: One of the earliest approaches without
learning was proposed in [27] using a Kalman filter. As inputs,
it used previous bus travel times and travel times from the
previous day (same time). It makes an arbitrary parameter
choice in its state-space model and captures only temporal
dependencies. Subsequent approaches consider a linear
state-space model involving travel times and calibrate (or fix)
the data-based model parameters in real time. They choose their
parameters based on either (a) data from the previous bus [16],
[6] or (b) an appropriate optimal travel-time data vector from
the historical database [28].
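For readers unfamiliar with the filtering step these methods rely on, the sketch below shows a generic scalar Kalman filter with a random-walk state model. It is not the specific state-space model of [27], [16], [6], or [28]; the function names and the noise variances `q` and `r` are assumptions chosen for illustration.

```python
from typing import List, Tuple


def kalman_step(x_est: float, p_est: float, z: float,
                q: float, r: float) -> Tuple[float, float]:
    """One predict/update cycle of a scalar Kalman filter."""
    # Predict with a random-walk model: x_k = x_{k-1} + w, Var(w) = q.
    x_pred = x_est
    p_pred = p_est + q
    # Update with a noisy measurement: z_k = x_k + v, Var(v) = r.
    gain = p_pred / (p_pred + r)
    x_new = x_pred + gain * (z - x_pred)
    p_new = (1.0 - gain) * p_pred
    return x_new, p_new


def filter_travel_times(observations: List[float], x0: float, p0: float,
                        q: float, r: float) -> List[float]:
    """Run the filter over a sequence of observed travel times."""
    x, p = x0, p0
    estimates = []
    for z in observations:
        x, p = kalman_step(x, p, z, q, r)
        estimates.append(x)
    return estimates


# Prior estimate of 100 s with variance 1; a new observation of 110 s.
print(kalman_step(100.0, 1.0, 110.0, q=1.0, r=2.0))  # (105.0, 1.0)
```

With fixed `q` and `r` no parameters are learned from historical data, which mirrors the "without learning" character of this family of methods.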
Explicit Learning: As mentioned in the introduction, a variety
of approaches learn from historical data. For instance, support
vector regression [10] and feed-forward ANNs [18] were employed
to capture temporal correlations via multiple previous bus
travel times. Using link length (a static input) together with
rate of road usage and speed (dynamic inputs), [20] proposes an
SVR-based prediction. However, neither current bus position nor
previous bus inputs are considered there. A speed-based
prediction scheme is proposed in [29], which uses a weighted
average of current bus speed and historically averaged section
speed as inputs. Like the previous method, it ignores
information from the previous bus. A dynamic SVR-based
prediction scheme is proposed in [9], which exploits
spatio-temporal (ST) correlations in a minimal manner: it
considers the current bus travel time at the previous section
and the previous bus travel time at the current section.
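The weighted-average idea attributed to [29] above can be illustrated with a few lines of arithmetic. The sketch is a hedged reading, not the scheme of [29] itself: the weight value, the units, and the conversion from blended speed to section travel time are all assumptions, and `predicted_section_time` is a hypothetical name.

```python
def predicted_section_time(length_m: float, current_speed_mps: float,
                           historical_speed_mps: float,
                           weight: float = 0.5) -> float:
    """Travel time (s) over a section from a blend of two speed estimates."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    blended = weight * current_speed_mps + (1.0 - weight) * historical_speed_mps
    if blended <= 0.0:
        raise ValueError("blended speed must be positive")
    return length_m / blended


# 600 m section, bus currently at 5 m/s, historical average of 3 m/s.
print(predicted_section_time(600.0, 5.0, 3.0, weight=0.5))  # 150.0
```

Nothing in this computation refers to the bus ahead, which is the limitation noted in the text.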
A single feed-forward ANN model is built in [12] to predict
travel times between any two bus stops on the route. As a
result, the dynamic range of the target travel time variable
would be very large, which can lead to poor predictions for
very short and very long routes. An approach using
(non-stationary) linear statistical models that captures ST
correlations was proposed in [4]. It uses a linear Kalman
filter for prediction. Linear models are used there to capture
spatial correlations, while the temporal correlations come from
the section travel time of the (currently plying) previous bus.
Another approach using linear statistical models and exploiting
real-time temporal correlations (from previous buses) was
proposed in [8]. A nonlinear generalization of [4] using
support vector function approximators capturing ST correlations
was proposed in [11]. Recently, a CNN approach capturing ST
correlations was proposed in [5]. It uses masked CNNs to
parameterize the predictive distribution, with quantized travel
times as the CNN outputs.
[13] proposes a novel approach that combines CNNs and RNNs. In
particular, spatial correlations from the adjacent sections of
the 1-D route are captured by the convolutional layer, while
the recurrent structure captures the temporal correlations. It
employs a convolutional-RNN based ED architecture to make
multi-step predictions in time. [14] considers an
attention-based extension of [13]. [30] employs a simplified
RNN with attention but no state feedback (even though weight
sharing is present across time-steps). It makes only single
time-step predictions. A common feature of all these RNN
approaches is that the time axis is uniformly partitioned into
time bins of a fixed width.
A recent, computationally interesting approach [15] recasts
BATP as a value function estimation problem under a suitably
constructed Markov reward process. This enables exploring a
family of value-function predictors using temporal-difference
(TD) learning.
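To make the value-function view concrete, the sketch below runs tabular TD(0) on a toy Markov reward process in which the state is simply the index of the section the bus is about to enter, the reward is that section's travel time, and the value is the expected remaining travel time to the end of the route. This state construction is an assumption for illustration; the source does not describe how [15] builds its Markov reward process, and `td0_remaining_time` is a hypothetical name.

```python
from typing import List


def td0_remaining_time(trips: List[List[float]], n_sections: int,
                       alpha: float = 0.1, sweeps: int = 1) -> List[float]:
    """Tabular, undiscounted TD(0) estimate of remaining travel time.

    values[s] estimates the time from the start of section s to the end
    of the route; values[n_sections] is the terminal state and stays 0.
    """
    values = [0.0] * (n_sections + 1)
    for _ in range(sweeps):
        for trip in trips:
            if len(trip) != n_sections:
                raise ValueError("each trip must cover every section")
            for s, reward in enumerate(trip):
                td_error = reward + values[s + 1] - values[s]
                values[s] += alpha * td_error
    return values


# Hypothetical two-section route observed on two historical trips.
history = [[10.0, 20.0], [14.0, 22.0]]
estimates = td0_remaining_time(history, n_sections=2, alpha=0.1, sweeps=200)
print([round(v, 1) for v in estimates])
```

The predicted arrival time at the final stop from section s is then the current time plus `values[s]`. A practical predictor would replace the table with a function approximator over richer state features.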
B. Related ED based RNN approaches
The ED architecture was first successfully proposed for
language translation applications [23], [24]. The architecture
proposed there was relatively simple, with the context from the
last time-step of the encoder fed as the initial state and as
an explicit input for each time-step of the decoder. Over the
years, the machine translation literature has seen improvements
over this base structure, such as attention layers and
bidirectional layers in the encoder. Further, the ED framework
has been successfully applied to many other tasks like speech
recognition [31], image captioning, etc.
Given its variable-length Seq2Seq mapping ability, the ED
framework can naturally be utilized for multi-step (target)
time-series prediction, where the raw data is real-valued and
the target vector length can be independent of the input vector
length. An attention-based ED approach (with a bidirectional
layer in the encoder) for multi-step TS prediction was proposed
in [22], which could potentially capture seasonal correlations
as well. However, this architecture does not consider exogenous
inputs. An approach to incorporate exogenous inputs into the
predictive model was proposed in [21], where the exogenous
inputs in the forecast horizon are fed in a synchronized
fashion at the decoder steps. Our approach is close to the
above TS approaches.
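The mechanics described above (an encoder summarizing the input sequence into a context, and a decoder that consumes one exogenous input per forecast step) can be sketched with a toy scalar recurrence. The weights are arbitrary untrained constants and the function names `encode` and `decode` are hypothetical; this shows only the data flow, not the models of [21] or [22].

```python
import math
from typing import List


def encode(inputs: List[float], w_in: float = 0.4, w_rec: float = 0.6) -> float:
    """Scan the input sequence and return the final hidden state (context)."""
    h = 0.0
    for x in inputs:
        h = math.tanh(w_in * x + w_rec * h)
    return h


def decode(context: float, exogenous: List[float], w_exo: float = 0.3,
           w_rec: float = 0.6, w_out: float = 2.0) -> List[float]:
    """Emit one output per decoder step, each step fed its own exogenous input."""
    h = context  # the encoder context initializes the decoder state
    outputs = []
    for x in exogenous:
        h = math.tanh(w_exo * x + w_rec * h)
        outputs.append(w_out * h)
    return outputs


# Three observed values in, two forecast steps out (lengths are independent).
context = encode([0.1, 0.2, 0.3])
forecast = decode(context, exogenous=[0.5, 0.4])
print(len(forecast))  # 2
```

The number of outputs is set by the number of decoder steps, not by the encoder length, which is what makes the framework suitable for variable-length multi-step targets.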
C. Proposed approach in perspective of related approaches
From the prior discussion, one can summarize that many existing
approaches either fail to exploit historical data sufficiently
or fail to capture spatial or temporal correlations. The
remaining approaches do exploit spatio-temporal correlations in
different ways [9], [4], [11], [5], [13], [14], but have their
own drawbacks. For instance, while [9] exploits the previous
bus travel time at the current section (temporal correlation),
it completely ignores when (time of day) the traversal
happened. The spatial correlation there comes from the current
bus travel time of only one previous section. [4] (denoted as
LNKF in our experiments) addresses the issues of [9] as
follows. To better capture spatial correlations, it considers
current bus travel time measurements from multiple previous
sections. The temporal correlations also take into account the
previous bus’s proximity by assuming a functional
(parameterized) form dependent on current section travel time
and start time difference. It adopts a predominantly linear
modelling approach culminating in a linear Kalman filter for
prediction. As explained earlier, a support-vector based
nonlinear generalization of [4] is considered in [11] (referred
to as SVKF in our experiments). It learns the potentially
non-linear spatial and temporal correlations at a single-step
level and then employs an extended Kalman filter for spatial
multi-step prediction. Compared to our non-linear ED (Seq2Seq)
approach, [4] adopts mainly linear modelling. While [11] adopts
non-linear modelling, model training happens with single-step
targets in both [4] and [11]. Moreover, both these KF
approaches adopt a recursive sequential multi-step prediction,
which can be prone to error accumulation. Our ED approach, on
the other hand, circumvents this issue by training with vector
targets, where the predictions across all subsequent sections
are padded together into one target vector.
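The error-accumulation concern with recursive multi-step prediction can be seen in a toy setting. The sketch below is not the LNKF or SVKF model: it assumes a hypothetical scalar process in which each step's value is a fixed multiple of the previous one, and a one-step model whose coefficient is slightly off. Feeding each prediction back as the next input compounds that one-step error over the horizon.

```python
from typing import List


def recursive_forecast(x0: float, coefficient: float, horizon: int) -> List[float]:
    """Roll a one-step model forward, feeding each output back as input."""
    path, x = [], x0
    for _ in range(horizon):
        x = coefficient * x
        path.append(x)
    return path


# Hypothetical true dynamics (0.90) versus a slightly biased model (0.95).
true_path = recursive_forecast(100.0, 0.90, horizon=5)
model_path = recursive_forecast(100.0, 0.95, horizon=5)
errors = [abs(p - t) for p, t in zip(model_path, true_path)]
print([round(e, 1) for e in errors])
```

Over this five-step horizon the absolute error grows at every step, even though the one-step model is only mildly wrong. Training directly against the full target vector, as the ED approach does, avoids chaining one-step predictions in this way.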
The CNN approach of [5] models travel time targets as
categorical values via a softmax output layer. Hence it is
sensitive to the quantization level. A coarse quantization can
lead to high errors when the true target value is exactly
between two