Bottleneck Analysis of Dynamic Graph
Neural Network Inference on CPU and GPU
Hanqiu Chen1, Yahya Alhinai1§, Yihan Jiang1§, Eunjee Na2§, Cong Hao1
{hchen799,yhinai3,yjiang400}@gatech.edu, jasmin7907@kaist.ac.kr, callie.hao@gatech.edu
1School of Electrical and Computer Engineering, Georgia Institute of Technology
2Korea Advanced Institute of Science & Technology
Abstract
Dynamic graph neural networks (DGNNs) are becoming increasingly popular because of their widespread use in capturing dynamic features of the real world. A variety of dynamic graph neural networks designed from algorithmic perspectives have succeeded in incorporating temporal information into graph processing. Despite the promising algorithmic performance, deploying DGNNs on hardware presents additional challenges due to model complexity, model diversity, and the inherent time dependency. Meanwhile,
the differences between DGNNs and static graph neural networks make hardware-related optimizations for static graph neural networks unsuitable for DGNNs. In this paper, we select eight prevailing DGNNs with different characteristics and profile them on both CPU and GPU. The profiling results are summarized and analyzed, providing in-depth insights into the bottlenecks of DGNNs on hardware and identifying potential optimization opportunities for future
DGNN acceleration. Following a comprehensive survey, we provide a detailed analysis of DGNN performance bottlenecks on hardware, including temporal data dependency, workload imbalance, data movement, and GPU warm-up. We suggest several optimizations from both software and hardware perspectives. This paper is the first to provide an in-depth analysis of the hardware performance of DGNNs.∗ Code is available at https://github.com/sharc-lab/DGNN_analysis.

∗Hanqiu Chen profiled the EvolveGCN and TGAT models and led the paper writing. Yahya Alhinai profiled the JODIE, TGN, and MolDGNN models. Yihan Jiang profiled the DyRep and LDG models. Eunjee Na profiled the ASTGNN model and prepared the codebase; work done during her internship at Georgia Tech. §Equal contribution.
1. Introduction
Deep neural networks (DNNs) have made tremendous
breakthroughs in various domains such as speech [1–
3], image [4–7], and natural language processing [8–10].
While DNNs can effectively capture hidden patterns in Euclidean data, they do not perform well on non-Euclidean data presented in graph format, such as social networks, recommendation systems, and knowledge graphs. Facing these limitations of DNNs, researchers have shown increasing interest in graph data processing. The Graph Neural Network (GNN) is a powerful tool for processing graphs, and GNNs have demonstrated strong expressive power in many research areas, including social networks [11], physical systems [12]
and new drug discovery [13]. Fig. 1(a-b) demonstrates how
a social network can be represented as a graph, and how
GNNs can be applied to the graph. This is called a static GNN, in which neither the graph topology nor the model parameters change.
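For concreteness, the following is a minimal sketch of one static GNN message-passing layer in PyTorch (a plain mean-neighbor aggregation; the function name, shapes, and aggregation choice are illustrative assumptions, not taken from any of the profiled models):

```python
import torch

def static_gnn_layer(h, edge_index, weight):
    """One illustrative message-passing layer: mean-aggregate neighbor
    features, then apply a learned linear transform and a nonlinearity.

    h:          [num_nodes, dim]  node feature matrix
    edge_index: [2, num_edges]    (src, dst) index pairs
    weight:     [dim, dim]        learnable weight matrix
    """
    src, dst = edge_index
    agg = torch.zeros_like(h)
    agg.index_add_(0, dst, h[src])              # sum neighbor features per node
    deg = torch.zeros(h.size(0), device=h.device)
    deg.index_add_(0, dst, torch.ones(src.size(0), device=h.device))
    agg = agg / deg.clamp(min=1).unsqueeze(-1)  # mean over in-neighbors
    return torch.relu(agg @ weight)             # transform + activation
```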
Although GNNs have strong representation power on static graphs, in real life most graphs change over time. For instance, as shown in Fig. 1(c),
social networks are constantly growing as more people
are joining, and the edge weights are also changing as
the relationships between people evolve. In this case, the
graph topology changes over time, with nodes and edges
appearing and disappearing as time progresses. To better represent complex graph structures that change over time, dynamic graph neural networks (DGNNs) are a promising solution. Fig. 1(c) illustrates an example of a dynamically
changing graph, and Fig. 1(d) is an example of a dynamic
GNN, which uses a graph structure encoder to capture
the spatial information and a time encoder to encode the
temporal information.
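As a rough illustration of this encoder pairing, the sketch below combines a simplified one-layer structure encoder with a GRU-based time encoder that carries a per-node hidden state across updates. This is a hypothetical composition assuming a recurrent temporal design; it is not the architecture of any specific model profiled in this paper.

```python
import torch
import torch.nn as nn

class DGNNCell(nn.Module):
    """Hypothetical DGNN step: a graph structure encoder produces spatial
    embeddings for the current graph state, and a recurrent time encoder
    folds them into each node's temporal hidden state."""

    def __init__(self, dim):
        super().__init__()
        self.spatial = nn.Linear(dim, dim)    # stand-in for a full GNN layer
        self.temporal = nn.GRUCell(dim, dim)  # time encoder

    def forward(self, h, edge_index, state):
        src, dst = edge_index
        msg = torch.zeros_like(h)
        msg.index_add_(0, dst, h[src])         # aggregate neighbor features
        z = torch.relu(self.spatial(msg + h))  # spatial embedding (self + neighbors)
        return self.temporal(z, state)         # update per-node temporal state
```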
According to a recent survey [14], DGNNs can be divided into two categories based on the temporal granularity of the graph representation: (i) discrete time dynamic graph neural networks (DTDG) and (ii) continuous time dynamic graph neural networks (CTDG). A DTDG captures the status of a changing graph at different time steps and uses the information from multiple snapshots of the dynamic graph to gain a deeper understanding of the graph features; accounting for this temporal information helps the algorithm better characterize how the graph evolves. A CTDG is an event-based neural network that updates node and edge embeddings whenever an edge appears between two nodes. This update scheme is more realistic because it is closer to how real-world graphs evolve.
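To make the distinction concrete, the toy data layouts below contrast the two representations (hypothetical Python structures, assuming edge-only updates for brevity): a DTDG stores one full snapshot per discrete time step, while a CTDG stores a stream of timestamped edge events that can arrive at arbitrary times.

```python
from typing import List, Tuple

Edge = Tuple[int, int]          # (src, dst)
Event = Tuple[int, int, float]  # (src, dst, timestamp)

# DTDG: a sequence of snapshots at discrete time steps; the model is
# re-run (or recurrently updated) once per snapshot.
dtdg_snapshots: List[List[Edge]] = [
    [(0, 1)],          # t = 0
    [(0, 1), (1, 2)],  # t = 1: edge (1, 2) appeared between snapshots
]

# CTDG: a stream of timestamped edge events; embeddings are updated
# as soon as each event arrives, at arbitrary real-valued times.
ctdg_events: List[Event] = [
    (0, 1, 0.0),
    (1, 2, 0.73),  # an update can fire at any time, not only at fixed steps
]
```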
Although DGNNs have achieved great success from an algorithmic perspective, their hardware performance is poor due to the severe lack of hardware-oriented optimization, which results in high latency and a low degree of parallelism.
Motivated by the software-hardware performance gap of
DGNNs, this paper seeks to provide a quantitative analysis
of eight representative models with different characteristics.