would introduce additional computation overhead, which contradicts our original objective. Thus, decoupling the update process from the model depth remains crucial for efficiently learning from high-order information in dynamic graphs.
While leveraging high-order information has proven useful in GNNs, we still face another essential problem—over-smoothing. Over-smoothing refers to the phenomenon where increasing the model depth smooths the node embeddings, rendering nodes indistinguishable and degrading performance. This contradicts the intuition that deeper models provide better expressiveness. Over-smoothing arises from the large receptive field of deep models, which can cover all nodes in the network. As a result, the learned node representations fail to focus on the low-order neighbors that may provide more discriminative information, yielding similar node embeddings. Consequently, the downstream classifier, which predicts labels based on node embeddings, misclassifies nodes with similar representations yet different labels. Thus, alleviating the over-smoothing problem by emphasizing the importance of low-order neighbors is essential for learning robust and discriminative node representations in dynamic graphs, particularly when incorporating high-order information.
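To see the effect numerically, the following minimal sketch (a toy illustration of over-smoothing in general, not part of TPGNN) applies repeated mean aggregation on a small synthetic graph; the graph, features, and depths are arbitrary illustrative choices. The mean pairwise distance between node embeddings shrinks as depth grows, i.e., nodes become indistinguishable.

```python
import numpy as np

# Toy undirected graph: 6 nodes in two loosely connected clusters.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
n = 6
A = np.zeros((n, n))
for u, v in edges:
    A[u, v] = A[v, u] = 1.0
A += np.eye(n)                        # self-loops
P = A / A.sum(axis=1, keepdims=True)  # row-normalized (mean) aggregation

rng = np.random.default_rng(0)
X = rng.normal(size=(n, 4))           # random initial node features

for depth in (1, 2, 4, 8, 16):
    # A depth-k mean-aggregation GNN (without nonlinearities) applies P^k.
    H = np.linalg.matrix_power(P, depth) @ X
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    d = np.mean([np.linalg.norm(H[i] - H[j]) for i, j in pairs])
    print(f"depth={depth:2d}  mean pairwise distance={d:.4f}")
```

Since P is row-stochastic on a connected graph with self-loops, P^k converges to a rank-one projection, so the distances collapse toward zero with depth.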
To address the aforementioned deficiencies in CTDG embedding, we propose a novel temporal propagation-based graph neural network (TPGNN). Inspired by APAN [16], TPGNN leverages two key components to learn node representations. The propagator propagates messages from anchor nodes to their k-hop neighbors and updates the states of these neighbors in parallel. The node-wise encoder incorporates a layer-aware transformer to model the importance of messages from different layers, mitigating the over-smoothing problem by identifying the most influential layer. Additionally, the node-wise encoder aggregates node representations based on node-preserving memories, decoupling inference time from model depth (a minimal sketch of this design follows the contribution list). Our extensive experiments on three real-world datasets demonstrate the effectiveness and robustness of TPGNN, which outperforms state-of-the-art (SOTA) algorithms. We highlight our contributions as follows:
• We propose a novel temporal propagation-based graph neural network (TPGNN) to effectively and efficiently learn high-order information in dynamic graphs while addressing the issue of over-smoothing.
• We propose two distinct components, namely the propagator and the node-wise encoder. The propagator updates the states of neighbors at different layers in parallel, and the node-wise encoder aggregates node representations based on node-preserving memories to prevent over-smoothing.
• We conduct experiments on three real-world datasets to demonstrate the efficiency and robustness of TPGNN. Extensive experiments also show that our model is not sensitive to crucial hyper-parameters such as model depth and batch size.
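As promised above, the following is a minimal sketch of the node-wise encoder idea: each node preserves one memory vector per propagation layer, and a layer-aware attention weights those memories when producing the node representation. The module name (LayerAwareEncoder), the single-head attention, and all dimensions are our illustrative assumptions, not the exact TPGNN implementation.

```python
import torch
import torch.nn as nn

class LayerAwareEncoder(nn.Module):
    """Illustrative node-wise encoder attending over per-layer node memories.

    Each node preserves one memory per propagation layer; attention over
    these memories lets the model emphasize the most informative layer
    (typically a low-order one), which is the intuition behind mitigating
    over-smoothing. A simplified sketch, not the paper's exact module.
    """

    def __init__(self, n_layers: int, dim: int):
        super().__init__()
        # Learnable embedding identifying the k-th layer's memory.
        self.layer_emb = nn.Parameter(torch.randn(n_layers, dim) * 0.02)
        self.query = nn.Parameter(torch.randn(1, 1, dim) * 0.02)
        self.attn = nn.MultiheadAttention(dim, num_heads=1, batch_first=True)

    def forward(self, memories: torch.Tensor) -> torch.Tensor:
        # memories: (batch, n_layers, dim) -- node-preserving per-layer states.
        kv = memories + self.layer_emb                # inject layer identity
        q = self.query.expand(memories.size(0), -1, -1)
        out, attn_w = self.attn(q, kv, kv)            # attn_w shows the dominant layer
        return out.squeeze(1)

# Usage: 32 nodes, 3-layer memories of width 64.
enc = LayerAwareEncoder(n_layers=3, dim=64)
reps = enc(torch.randn(32, 3, 64))
print(reps.shape)  # torch.Size([32, 64])
```

Because the encoder reads fixed-size per-layer memories instead of re-traversing the neighborhood at inference time, its cost does not grow with the number of propagation hops—the decoupling of inference time and model depth mentioned above.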
II. RELATED WORK
In recent years, graph representation learning has gained
significant attention, particularly for static graphs [6]–[8],
[17]–[20]. To extend these algorithms to continuously evolving
scenarios, researchers have focused on developing temporal
graph representation learning methods. One intuitive approach
is to split a temporal graph into a sequence of chronological
snapshots, and various methods have been proposed to cap-
ture the evolving information across all snapshots, including
matrix factorization [21], triadic closure process [22], and
random walk [23]. Recurrent neural networks (RNNs) [24] and transformers [1] have also been utilized to model the chronological dependencies across snapshots and enhance expressiveness. For example, DynGEM [25] updates snapshot representations based on the node representations of the previous snapshot, while DySAT [26] proposes a hierarchical attention mechanism that preserves structural and temporal properties, overcoming the sequential dependency inherent in RNNs.
The previous methods for temporal graph representation learning have mostly focused on discrete approaches, where temporal information is captured by dividing a temporal graph into a series of snapshots. However, this approach overlooks the dynamics within each snapshot. To address this limitation, some recent works aim to model the temporal process of continuously evolving events. For instance, DyRep [27] and
its variants [28], [29] use temporal Hawkes processes, while
CTDNE [30], FiGTNE [31], and CAW-N [11] employ time-
respect random walks, time-reinforced random walks, and
causal anonymous walks, respectively. GNN-based methods
[12], [32], [33] leverage time encoding to preserve temporal
information. TGAT [9] uses a GAT-like [8] architecture with
continuous time encoding to encode node representations,
TGN [10] leverages GRU [34] to update node representations,
and HVGNN [35] extends temporal GNN to hyperbolic space.
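To make the time-encoding idea concrete, here is a minimal sketch in the spirit of TGAT-style functional time encoding, which maps a time gap to trigonometric features with learnable frequencies; the class name, dimension, and initialization are our illustrative choices rather than any paper's exact module.

```python
import torch
import torch.nn as nn

class TimeEncoder(nn.Module):
    """Illustrative continuous time encoding: cos(w * dt + b) features.

    Learnable frequencies let aggregators distinguish recent neighbors
    from stale ones, preserving temporal information through aggregation.
    """

    def __init__(self, dim: int):
        super().__init__()
        self.freq = nn.Linear(1, dim)  # learnable frequencies and phases

    def forward(self, dt: torch.Tensor) -> torch.Tensor:
        # dt: (batch,) time gaps between an event and its neighbors.
        return torch.cos(self.freq(dt.unsqueeze(-1)))

enc = TimeEncoder(dim=8)
print(enc(torch.tensor([0.0, 1.5, 10.0])).shape)  # torch.Size([3, 8])
```

Such features are typically concatenated with node states before attention-based aggregation, so the model can weight neighbors by recency.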
However, these methods generally aggregate messages from 1-hop or 2-hop neighbors, failing to model high-order information. HIT [28] constructs temporal hypergraphs to predict high-order patterns, which introduces extra computational cost. APAN [16] reverses the direction of message passing to reduce inference time: messages are propagated from the anchor node to its multi-hop neighbors, and each node maintains a mailbox queue. Our proposed
TPGNN aims to overcome the limitations of previous methods
by effectively and efficiently learning high-order information
while mitigating over-smoothing.
III. PROPOSED MODEL
In this paper, we present the TPGNN model, which consists of a propagator and a node-wise encoder, to efficiently gain knowledge from high-order neighbors. The overall architecture of TPGNN is illustrated in Figure 1.
A. Preliminary
A CTDG G = {V, E} consists of a node set V and a time-sensitive edge set E. The edge set is represented as a series of