cannot directly aggregate information from other graphs. As
such, we seek an approach tailored to graph-level clustering that can learn clustering-friendly representations while also capturing discriminative clustering information from neighboring graphs.
Towards this end, this work proposes a general framework called Graph-Level Contrastive Clustering (GLCC) for clustering multiple graphs. The key idea of GLCC is to exploit multi-granularity information to provide a global characterization of graph instances. To achieve this goal, GLCC first constructs an adaptive affinity graph to incorporate neighbor knowledge, and then introduces two-level contrastive learning (CL) on top of the affinity graph, i.e., instance-level CL and cluster-level CL.
On the one hand, instance-level CL leverages a graph Laplacian-based contrastive loss to learn clustering-friendly representations for effective cluster assignments. On the other hand, cluster-level CL incorporates the neighbor information of each sample to capture compact cluster representations and achieve cluster-level consistency. Furthermore, neighbor-aware pseudo-labels are generated and fed back into representation learning, so that clustering and representation learning are optimized alternately and mutually enhance each other (a minimal sketch of the two-level objective follows the contribution list below). Our experiments show that incorporating this multi-granularity information yields large improvements over existing state-of-the-art approaches on multiple real-life datasets. To summarize, the main contributions of this work are as follows:
• General Aspects: To the best of our knowledge, this is the first work to investigate deep graph-level clustering, i.e., clustering over multiple graphs, in contrast to existing works that study clustering among the nodes of a single graph.
• Novel Methodologies: We propose a general framework that performs instance- and cluster-level contrastive learning on an affinity graph. Instance-level CL aims to learn clustering-friendly representations, while cluster-level CL captures discriminative cluster representations.
• Multifaceted Experiments: We conduct comprehensive experiments on various well-known datasets to demonstrate the effectiveness of the proposed approach.
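To make the two-level objective concrete, the following is a minimal PyTorch sketch of how instance- and cluster-level contrastive terms over an affinity graph could look. All names (`z`, `p`, `affinity`) and the exact weighting are illustrative assumptions, not the paper's official formulation.

```python
import torch

def instance_level_loss(z, affinity, tau=0.5):
    """Hypothetical instance-level contrastive term: graphs linked in
    the affinity graph act as positives for each other.
    z:        (N, d) L2-normalized graph-level embeddings.
    affinity: (N, N) nonnegative affinity matrix (zero diagonal)."""
    sim = torch.exp(z @ z.t() / tau)                 # pairwise similarity
    sim = sim * (1 - torch.eye(len(z)))              # mask self-pairs
    pos = (sim * affinity).sum(dim=1)                # neighbor-weighted positives
    return -torch.log(pos.clamp_min(1e-12) / sim.sum(dim=1)).mean()

def cluster_level_loss(p, affinity):
    """Hypothetical cluster-level term: neighboring graphs are pushed
    toward agreeing soft cluster assignments.
    p: (N, K) cluster-assignment probabilities (rows sum to 1)."""
    agreement = (p @ p.t()).clamp_min(1e-12)         # assignment agreement
    return -(affinity * agreement.log()).sum() / affinity.sum()
```

In this reading, the instance-level term pulls affinity-graph neighbors together in embedding space, while the cluster-level term enforces assignment consistency among neighbors; neighbor-aware pseudo-labels would then refresh `affinity` between training rounds.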
Related Work
Graph Neural Networks. GNNs were originally introduced by Gori, Monfardini, and Scarselli (2005); they are a class of deep neural networks that combine the topological structure of a graph with its associated node features, giving them a powerful capability to process graph-structured data. The basic idea is to learn low-dimensional graph
representations through a recursive neighborhood aggrega-
tion scheme (Gilmer et al. 2017; Ju et al. 2022b; Luo et al.
2022a). The derived graph representations can be used to
serve various downstream tasks, such as node classifica-
tion (Kipf and Welling 2017), graph classification (Ju et al.
2022a), and graph clustering (Bo et al. 2020).
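As an illustration of such a neighborhood aggregation scheme (a simplified GraphSAGE-style update, not any specific cited model), one round of message passing followed by a pooling readout can be sketched as:

```python
import torch
import torch.nn as nn

class MeanAggregationLayer(nn.Module):
    """One round of message passing: each node averages its neighbors'
    features and mixes them with its own state."""

    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.linear = nn.Linear(2 * in_dim, out_dim)

    def forward(self, x, adj):
        # x: (|V|, d) node features; adj: (|V|, |V|) adjacency matrix
        deg = adj.sum(dim=1, keepdim=True).clamp_min(1.0)
        neigh = (adj @ x) / deg                      # mean over neighbors
        return torch.relu(self.linear(torch.cat([x, neigh], dim=1)))

def readout(x):
    # Sum-pool node embeddings into one graph-level representation.
    return x.sum(dim=0)
```

Stacking such layers and applying the readout yields the graph-level representations used by the downstream tasks above.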
Deep Clustering. Our work is related to deep clustering, which has achieved impressive performance by building on breakthroughs in deep learning. There has been a surge of interest in employing deep neural networks to enhance clustering; existing methods can be divided into two main categories: (i) reconstruction-based methods and (ii) self-augmentation-based methods. The first category leverages the auto-encoder paradigm (Rumelhart, Hinton, and Williams 1985) to impose desired constraints on feature learning in the latent space. For example, Deep Embedded Clustering (DEC) (Xie, Girshick, and Farhadi 2016) simultaneously learns feature representations and cluster assignments by minimizing the Kullback-Leibler divergence between its soft assignments and a sharpened target distribution. IDEC (Guo et al. 2017) improves clustering by preserving the local structure of the data-generating distribution. The second category trains the network to achieve consistency between original samples and their augmented counterparts. For instance, IIC (Ji, Henriques, and Vedaldi 2019) maximizes the mutual information between paired samples to keep assignment probabilities consistent. However, these methods are not tailored to graph-level clustering and cannot readily handle complex data structures such as graphs.
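For concreteness, the following is a condensed PyTorch sketch of DEC's objective: Student's-t soft assignments are sharpened into a target distribution and matched under KL divergence (variable names are ours).

```python
import torch

def soft_assignment(z, centers, alpha=1.0):
    """Student's-t similarity q_ij between embedding z_i and cluster
    center mu_j, as in DEC (Xie, Girshick, and Farhadi 2016)."""
    dist2 = torch.cdist(z, centers) ** 2
    q = (1.0 + dist2 / alpha) ** (-(alpha + 1.0) / 2.0)
    return q / q.sum(dim=1, keepdim=True)

def target_distribution(q):
    """Sharpened target p_ij proportional to q_ij^2 / f_j, where f_j is
    the soft cluster frequency."""
    p = q ** 2 / q.sum(dim=0)
    return p / p.sum(dim=1, keepdim=True)

def dec_loss(z, centers):
    # KL(P || Q): pull assignments toward their high-confidence targets.
    q = soft_assignment(z, centers)
    p = target_distribution(q).detach()
    return (p * (p / q.clamp_min(1e-12)).log()).sum(dim=1).mean()
```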
Graph Clustering. Another category of related work is
graph clustering. Benefiting from their strong capability to incorporate both node attributes and graph structure, GNNs have emerged as a powerful approach for graph clustering, which aims to reveal the underlying graph structure and divide the nodes into several disjoint groups.
Similarly, most existing graph clustering approaches (Wang
et al. 2017, 2019; Pan et al. 2019; Fan et al. 2020; Bo
et al. 2020) also follow the auto-encoder framework, in which the graph auto-encoder (GAE) and the variational graph auto-encoder (VGAE) are used to model graph-structured data. For example, DAEGC (Wang et al. 2019) utilizes an attention network to capture the importance of neighboring nodes, and further encodes the topological structure and node content into a compact representation based on GAE. The adversarially regularized graph autoencoder (ARGA) (Pan et al. 2019) enhances clustering by introducing an adversarial learning scheme to learn the graph embedding. Compared with existing methods for node-level clustering, our work goes further and studies the under-explored yet important problem of graph-level clustering.
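As a reference point for this auto-encoder family, a minimal GAE-style encoder with an inner-product decoder might look as follows (a simplified sketch; layer sizes and normalization details are illustrative):

```python
import torch
import torch.nn as nn

class GAE(nn.Module):
    """Minimal graph auto-encoder: a two-layer GCN-style encoder with an
    inner-product decoder that reconstructs the adjacency matrix."""

    def __init__(self, in_dim, hid_dim, z_dim):
        super().__init__()
        self.w1 = nn.Linear(in_dim, hid_dim, bias=False)
        self.w2 = nn.Linear(hid_dim, z_dim, bias=False)

    def encode(self, x, adj_norm):
        # adj_norm: symmetrically normalized adjacency with self-loops
        h = torch.relu(adj_norm @ self.w1(x))
        return adj_norm @ self.w2(h)

    def forward(self, x, adj_norm):
        z = self.encode(x, adj_norm)
        return torch.sigmoid(z @ z.t())              # reconstructed adjacency

def reconstruction_loss(adj_hat, adj):
    # Binary cross-entropy between reconstructed and observed edges.
    return nn.functional.binary_cross_entropy(adj_hat, adj)
```

The learned embeddings are then typically clustered, e.g., with k-means, or refined with a self-training objective as in DAEGC.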
Problem Definition & Preliminary
Definition: Graph-level Clustering. A graph is denoted as a topological graph $G = (\mathcal{V}, \mathcal{E}, X)$, where $\mathcal{V}$ is the set of nodes and $\mathcal{E}$ is the set of edges. We use $X \in \mathbb{R}^{|\mathcal{V}| \times d}$ to denote the node feature matrix, where $d$ is the dimension of the features. Let $\mathcal{G} = \{G_1, \cdots, G_N\}$ denote a set of unlabeled graphs from $K$ different categories. The goal of graph-level clustering is to separate these graphs into $K$ different clusters such that graphs with the same semantic labels are grouped into the same cluster.
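In code, this setup amounts to mapping a list of attributed graphs to $K$ cluster indices. A naive hypothetical baseline (mean-pooled node features fed to k-means) illustrates the interface that learned graph-level representations are meant to improve upon:

```python
from dataclasses import dataclass

import numpy as np
from sklearn.cluster import KMeans

@dataclass
class Graph:
    """An attributed graph G = (V, E, X)."""
    edges: list          # edge set E as (u, v) pairs over node indices
    x: np.ndarray        # node feature matrix X of shape (|V|, d)

def cluster_graphs(graphs, k):
    """Naive baseline: mean-pool each graph's node features into one
    vector, then run k-means into K clusters. Learned graph-level
    representations are meant to replace the pooled features."""
    feats = np.stack([g.x.mean(axis=0) for g in graphs])
    return KMeans(n_clusters=k, n_init=10).fit_predict(feats)
```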
Graph Neural Networks
We build upon graph neural networks (GNNs) to learn ef-
fective graph-level representations. GNNs are powerful architectures that iteratively aggregate information over local node neighborhoods via message-passing mechanisms (Gilmer et al.