Explicit Learning: As mentioned in the introduction, there
are a variety of approaches to learning from historical data.
For instance, support vector regression [10] and feed-forward
ANNs [18] were employed to capture temporal correlations via
multiple previous bus travel times. Employing link length (a
static input) together with the rate of road usage and speed
(dynamic inputs), [20] proposes an SVR-based prediction.
However, neither the current bus position nor inputs from the
previous bus are considered there.
A speed-based prediction scheme is proposed in [29], which
uses a weighted average of the current bus speed and the
historically averaged section speed as inputs. Like the previous
method, it ignores information from the previous bus. A
dynamic SVR-based prediction scheme is proposed in [9],
which exploits spatio-temporal (ST) correlations in a minimal
manner. In particular, it considers the current bus's travel time
at the previous section and the previous bus's travel time at
the current section.
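The minimal ST inputs of [9] can be sketched as a feature construction (the toy travel-time table below is illustrative; in [9] these features would feed an SVR, which is omitted here):

```python
# Toy travel-time table: T[bus][section] in seconds,
# with buses ordered by departure time (illustrative values).
T = [[60.0, 75.0, 90.0],
     [62.0, 80.0, 95.0],
     [58.0, 72.0, 88.0],
     [65.0, 85.0, 99.0]]

def st_features(T, bus, section):
    """The two spatio-temporal inputs used in [9]: the current bus's
    travel time on the previous section (spatial correlation) and the
    previous bus's travel time on the current section (temporal)."""
    return [T[bus][section - 1], T[bus - 1][section]]

# Training pairs for predicting section-2 travel times.
X = [st_features(T, b, 2) for b in range(1, len(T))]
y = [T[b][2] for b in range(1, len(T))]
```

Each feature vector pairs one spatial and one temporal measurement, which is why the scheme is described as exploiting ST correlations only in a minimal manner.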
A single feed-forward ANN model is built in [12] to predict
travel times between any two bus stops on the route. On
account of this, the dynamic range of the target travel-time
variable becomes very large, which can lead to poor predictions
for very short and very long routes. An approach using (non-
stationary) linear statistical models which captures ST correlations
was proposed in [4]. It uses a linear Kalman filter for prediction.
Linear models here are used to capture spatial correlations.
The temporal correlations come from the (currently plying)
previous bus section travel time. Another approach using
linear statistical models and exploiting real-time temporal
correlations (from previous buses) was proposed in [8]. A
nonlinear generalization of [4] using support vector function
approximators capturing ST correlations was proposed in [11].
Recently, a CNN approach capturing ST correlations was
proposed in [5]. It uses masked-CNNs to parameterize the
predictive distribution, with quantized travel times as the
CNN outputs.
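The quantized-output idea can be sketched as follows (the bin width and bin count are illustrative choices, not the settings of [5]):

```python
BIN_W = 30.0    # bin width in seconds (illustrative)
N_BINS = 20     # number of travel-time classes (illustrative)

def quantize(t):
    """Map a real-valued travel time to its class index,
    clipping to the last bin."""
    return min(int(t // BIN_W), N_BINS - 1)

def dequantize(k):
    """Recover a point estimate from a class index: the bin centre."""
    return (k + 0.5) * BIN_W

# A 94 s travel time falls in bin 3 and is reconstructed as 105 s,
# illustrating the sensitivity to the quantization level.
```

The softmax layer of [5] would predict a distribution over these class indices; the round-trip error above is the price of coarse quantization discussed later.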
[13] proposes a novel approach that combines CNNs and
RNNs. In particular, spatial correlations from the adjacent
sections of the 1-D route are captured
by the convolutional layer, while the recurrent structure cap-
tures the temporal correlations. It employs a convolutional-
RNN based ED architecture to make multi-step predictions in
time. [14] considers an attention-based extension of [13]. [30]
employs a simplified RNN with attention but no state feedback
(even though weight sharing is present across time-steps). It
makes only single time-step predictions. A common feature
of all these RNN approaches is that the time axis is uniformly
partitioned into time bins of a fixed width.
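This uniform partitioning can be sketched as follows (the 600 s bin width is an illustrative choice, not taken from the cited works):

```python
def time_bin(timestamp_s, bin_width_s=600):
    """Map a time-of-day timestamp (seconds since midnight) to the
    index of its fixed-width time bin, as in the uniformly
    partitioned time axis of the RNN approaches above."""
    return int(timestamp_s // bin_width_s)

# With 10-minute bins, 08:05 and 08:09 share a bin; 08:15 is the next.
```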
A recent computationally interesting approach where BATP
is recast as a value function estimation problem under a suit-
ably constructed Markov reward process is proposed in [15].
This enables exploring a family of value-function predictors
using temporal-difference (TD) learning.
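A hedged sketch of this value-function view (not the exact construction of [15]): treating each section's travel time as a reward, TD(0) learns V[s], the expected remaining travel time from section s to the end of the route.

```python
def td0_value_estimates(trips, n_sections, alpha=0.1):
    """TD(0) on a Markov reward process over route sections.
    Each trip is a list of per-section travel times; V[s] estimates
    the remaining travel time from section s to the route's end.
    An illustrative sketch, not the exact model of [15]."""
    V = [0.0] * (n_sections + 1)      # V[n_sections] = 0 at route end
    for trip in trips:
        for s, r in enumerate(trip):  # r = travel time of section s
            V[s] += alpha * (r + V[s + 1] - V[s])
    return V[:n_sections]

# 400 historical trips over a 2-section route (times in seconds).
trips = [[60.0, 70.0], [62.0, 72.0]] * 200
V = td0_value_estimates(trips, 2)
```

Here V[0] converges toward the mean full-route time and V[1] toward the mean last-section time, which is exactly the arrival-time quantity BATP needs.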
B. Related ED based RNN approaches
The ED architecture was first successfully proposed for
language translation applications [23], [24]. The proposed ar-
chitecture was relatively simple, with the context from the last
time-step of the encoder fed as the initial state and as an explicit
input at each time-step of the decoder. Over the years, the
machine translation literature has seen intelligent improvements
over this base structure, e.g., by employing attention layers and
bidirectional layers in the encoder. Further, the ED framework
has been successfully applied to many other tasks such as
speech recognition [31] and image captioning.
Given its variable-length Seq2Seq mapping ability, the ED
framework can naturally be utilized for multi-step (target)
time-series prediction, where the raw data is real-valued and
the target vector length can be independent of the input vector
length. An attention-based ED approach (with a bidirectional
layer in the encoder) for multi-step TS prediction was proposed
in [22], which could potentially capture seasonal correlations
as well. However, this architecture does not consider exogenous
inputs.
An approach to incorporate exogenous inputs into the predictive
model was proposed in [21], where the exogenous inputs
in the forecast horizon are fed in a synchronized fashion at
the decoder steps. Our approach is close to the above TS
approaches.
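A minimal, untrained sketch of this ED pattern (random weights and an illustrative hidden size; not the actual architectures of [21], [22]): the encoder's last state initializes the decoder, and the known exogenous inputs over the forecast horizon are fed one per decoder step.

```python
import math
import random

random.seed(0)
H = 8  # hidden size (illustrative)

def rand_matrix(rows, cols):
    return [[random.gauss(0, 0.1) for _ in range(cols)] for _ in range(rows)]

W_enc = rand_matrix(H, H + 1)  # encoder cell: [state; scalar input]
W_dec = rand_matrix(H, H + 1)  # decoder cell: [state; exogenous input]
w_out = [random.gauss(0, 0.1) for _ in range(H)]  # scalar readout

def step(W, h, x):
    """One vanilla-RNN step on the concatenated [state; input] vector."""
    v = h + [x]
    return [math.tanh(sum(wij * vj for wij, vj in zip(row, v))) for row in W]

def encode(past_values):
    h = [0.0] * H
    for x in past_values:          # observed series values
        h = step(W_enc, h, x)
    return h                       # context = last encoder state

def decode(context, exogenous):
    h, preds = context, []
    for u in exogenous:            # known future inputs, one per step
        h = step(W_dec, h, u)
        preds.append(sum(wi * hi for wi, hi in zip(w_out, h)))
    return preds

# 3 observed values in, 4 predictions out: the target length is
# independent of the input length.
preds = decode(encode([60.0, 62.0, 58.0]), exogenous=[0.3, 0.4, 0.5, 0.6])
```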
C. Proposed approach in perspective of related approaches
From the prior discussion, one can summarize that many
existing approaches either fail to exploit historical data suffi-
ciently or fail to capture spatial or temporal correlations. The
rest of the approaches do exploit spatio-temporal correlations
in different ways [9], [4], [11], [5], [13], [14], but suffer from
their own drawbacks. For instance, while [9] exploits the
previous bus's travel time at the current section (temporal
correlation), it completely ignores when (the time of day) the
traversal happened.
The spatial correlation here comes from the current bus's
travel time on only one previous section. [4] (denoted as LNKF
in our experiments) addresses the issues of [9] as follows.
To better capture spatial correlations, it considers current bus
travel time measurements from multiple previous sections. The
temporal correlations here also take into account the previous
bus’s proximity by assuming a functional (parameterized) form
dependent on the current section travel time and the start time
difference. It adopts a predominantly linear modelling approach
culminating in a linear Kalman filter for prediction. As ex-
plained earlier, a support-vector based nonlinear generalization
of [4] is considered in [11] (referred to as SVKF in our
experiments). It learns the potentially non-linear spatial and
temporal correlations at a single-step level and then employs
an extended Kalman filter for spatial multi-step prediction.
Compared to our non-linear ED (Seq2Seq) approach, [4]
adopts a mainly linear model. While [11] adopts non-linear
modelling, the training happens with single-step targets in
both [4] and [11]. Moreover, both these KF approaches
adopt recursive sequential multi-step prediction, which can
be prone to error accumulation. Our ED approach, on the
other hand, circumvents this issue of both these KFs by training
with vector targets, where the predictions across all subsequent
sections are stacked together into one target vector.
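The contrast between the recursive KF-style rollout and direct vector-target prediction can be illustrated with a toy one-step model (the coefficients a, b and the bias eps below are illustrative assumptions, not values from any of the cited works):

```python
# Toy ground truth: next section's time = a * current section's time + b.
a, b = 1.1, 5.0

def rollout_errors(t0, horizon, eps=1.0):
    """Recursive (KF-style) multi-step prediction with a slightly biased
    one-step model: each prediction is fed back as the next input,
    so the bias compounds over the horizon."""
    truth, pred, errs = t0, t0, []
    for _ in range(horizon):
        truth = a * truth + b
        pred = a * pred + b + eps   # bias eps applied to its own output
        errs.append(abs(pred - truth))
    return errs

def direct_errors(t0, horizon, eps=1.0):
    """Direct vector-target prediction: each horizon step is predicted
    from the true t0 (here via the exact k-step map plus the same bias),
    so the bias is incurred once per step with no feedback."""
    truth, errs = t0, []
    for _ in range(horizon):
        truth = a * truth + b
        errs.append(abs((truth + eps) - truth))
    return errs

rec_errs = rollout_errors(60.0, 5)
dir_errs = direct_errors(60.0, 5)
```

In the recursive case the error grows geometrically with the horizon, while the direct (vector-target) errors stay flat, which is the advantage claimed for the ED training scheme above.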
The CNN approach of [5] models travel time targets as categor-
ical values via a soft-max output layer. Hence it is sensitive
to the quantization level. A coarse quantization can lead to
high errors when the true target value is exactly between two