dependency to prevent the periodic information of different
nodes from interfering with each other.
To solve the above problems, we propose Spatial-Temporal
Graph Convolutional Gated Recurrent Network (STGCGRN)
to improve the accuracy of traffic forecasting. Our contribu-
tions are as follows:
•
We propose a new framework to better exploit temporal
and spatial dependency: We capture long-term temporal
dependency by explicitly introducing daily and weekly
periodic data, which are fed into the attention module
with a sliding window. The DGCGRU layer is designed
to extract the distance-based spatial dependency and mul-
tiple hidden spatial dependencies, which are captured by
the predefined distance-based adjacency matrix and the
multi-head self-adaptive adjacency matrix, respectively.
•
We explore the model structure and find that mining the
periodic information of a single node, and then performing
spatial dependency modeling can reduce the interference
of periodic information among different nodes.
•
We conduct experiments on four datasets. The results
show that our model outperforms the baseline methods.
We also conduct ablation experiments to evaluate the im-
pact of each component of our model on the performance.
Related Work
Graph Convolutional Neural Network
Graph convolutional neural network is a special form of CNN
generalized to non-Euclidean data, which can be applied
to different tasks and domains, such as node classification
(Kipf and Welling 2017), graph classification (Ying et al.
2018), link prediction (Zhang and Chen 2018), node cluster-
ing (Wang et al. 2017), spatial-temporal graph forecasting,
etc. Bruna et al.(Bruna et al. 2014) is the first work to gen-
eralize the convolution operation to non-Euclidean data, and
proposes a spatial method and a spectral method.
Spatial-based approaches aggregate neighborhood feature
informations in the spatial domain to extract a node’s high-
level representation. DCNNs(Atwood and Towsley 2016) ex-
tend convolutional neural networks (CNNs) to general graph-
structured data by introducing a ‘diffusion-convolution’ op-
eration. Message Passing Neural Network (MPNN)(Gilmer
et al. 2017) describes a general framework for supervised
learning on graphs, applying graph convolution to super-
vised learning of molecules. GraphSage(Hamilton, Ying, and
Leskovec 2017) proposes a general inductive framework to
efficiently generate node embeddings for previously unseen
data, sampling and aggregating features from the node’s local
neighborhood.
Spectral-based approaches implement the convolution op-
eration in the spectral domain with graph Fourier transforms.
ChebyNet(Defferrard, Bresson, and Vandergheynst 2016)
achieves fast localized spectral filtering by introducing C
order Chebyshev polynomials parametrization. GCN(Kipf
and Welling 2017) further reduces computational complexity
of ChebNet by limiting C=2.
Traffic Forecasting
As an important component of ITS, traffic forecasting has
received wide attention for decades. Most of the early traf-
fic forecasting works are based on some classic time se-
ries analysis models. ARIMA (Ahmed and Cook 1979) is
used to predict the short-term freeway traffic data. Subse-
quently, many variants of ARIMA are applied to this task,
such as KARIMA (Van Der Voort, Dougherty, and Watson
1996), subset ARIMA (Lee and b. Fambro 1999), ARIMAX
(Williams 2001), etc. Traffic data contains complex spatial-
temporal dependency. However, the aforementioned works
only consider the linear relationship in the temporal dimen-
sion, which is obviously insufficient. Researchers apply tra-
ditional machine learning algorithms to traffic forecasting,
including SVR (Wu, Ho, and Lee 2004), KNN (Davis and
Nihan 1991), and so on. These methods need handcrafted
features, which require expert experience in related fields
Due to the advantages of deep learning in solving nonlin-
ear problems and automatically extracting features, scholars
pay more attention to deep learning. RNN (Fu, Zhang, and Li
2016; Yu et al. 2017) and CNN (Zhang, Zheng, and Qi 2017;
Yu, Yin, and Zhu 2018; Wu et al. 2019) are adopted to capture
temporal dependency. For example, Graph Wavenet (Wu et al.
2019) uses dilated casual convolutions to model the tempo-
ral dependency to increase the receptive field. To model the
spatial dependency, Some works (Fang et al. 2019; Lin et al.
2019; Yao et al. 2019a) divide the traffic data into grid-based
map, and extract spatial information by CNN. However, due
to natural limitations, CNN can not handle non-Euclidean
data. For this reason, researchers apply GCN to the field
of traffic forecasting. DCRNN (Li et al. 2018) captures the
spatial dependency using bidirectional random walks on the
graph. STGCN (Yu, Yin, and Zhu 2018) employs a gener-
alization of Chebnet to capture the spatial dependency of
traffic data. These methods only utilize the predefined graph
structures, which is not sufficient. Some works (Wu et al.
2019; Wang et al. 2020; Bai et al. 2020) calculate similari-
ties among learnable node embeddings to capture the hidden
spatial dependency in the data. DGCRN (Li et al. 2021a)
filters the node embeddings and then uses them to generate
dynamic graph at each time step. However, these works do
not adequately model spatial dependency and ignore periodic
information.
Preliminaries
•
Definition 1: Traffic Network. The traffic network is an
undirected graph
G= (V, E, A)
, where
V
is a set consists
of
N
nodes,
E
is a set of edges and
A∈RN×N
is the
adjacency matrix representing the nodes distance-based
proximity.
•
Definition 2: Traffic Signal Matrix. Traffic signal matrix
can be denoted as a tensor
Xt∈RN×C
, where
C
is the
number of traffic features of each node (e.g., the speed,
volume), tdenotes the time step.
The problem of traffic forecasting can be described as:
given the traffic network
G= (V, E, A)
and
P
steps traf-
fic signal matrix
X(t−P):t= (Xt−P, Xt−P+1, ..., Xt−1)∈
RP×N×C
, learning a function
f
which maps
X(t−P):t