
a GNN model is treated as an unknown, a purified graph structure with a parameterized form is "learned" by optimizing the supervised GNN training loss (Zhu et al., 2022). For example, in Franceschi et al. (2019), the graph adjacency matrix is learned jointly with a GNN via bilevel optimization, where a full parametrization of the adjacency matrix is adopted. Moreover, under this full parametrization setting, Jin et al. (2020) and Luo et al. (2021) augment the training loss with structural regularizers to promote certain properties of the purified graph. Besides the full parametrization approach, a multi-head weighted cosine similarity metric function (Chen et al., 2020) and a GNN model (Yu et al., 2020) have also been used to parameterize the graph adjacency matrix for structure learning.
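As an illustration of the similarity-metric approach, the following NumPy sketch builds a learned adjacency matrix from node features via a multi-head weighted cosine similarity, in the spirit of Chen et al. (2020). The function name, the pruning threshold `eps`, and the head-wise averaging are illustrative choices, not the exact formulation of that work.

```python
import numpy as np

def multi_head_cosine_adjacency(X, W, eps=0.3):
    """Toy sketch: learn an adjacency from node features X (N x F) using
    H learnable weight vectors W (H x F), one per head. Each head rescales
    the features and computes pairwise cosine similarities; the head-wise
    similarities are averaged and weak entries are pruned."""
    H = W.shape[0]
    S = np.zeros((X.shape[0], X.shape[0]))
    for h in range(H):
        Xh = X * W[h]                                      # element-wise reweighting
        Xh = Xh / (np.linalg.norm(Xh, axis=1, keepdims=True) + 1e-8)
        S += Xh @ Xh.T                                     # cosine similarity for head h
    S /= H
    S[S < eps] = 0.0                                       # sparsify weak connections
    return S

# Usage: X = np.random.randn(5, 8); W = np.random.rand(4, 8)
# A_learned = multi_head_cosine_adjacency(X, W)
```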
Going beyond purifying the graph structure to robustify GNN models, there are also efforts on designing robust GNN architectures by directly modifying the feature aggregation scheme. Observing that aggregation functions such as the sum, weighted mean, or max can be arbitrarily distorted by even a single outlier node, Geisler et al. (2020); Wang et al. (2020); Zhang and Lu (2020) build robust GNN models by designing robust aggregation functions. Moreover, some works apply the attention mechanism (Veličković et al., 2018) to mitigate the influence of adversarial perturbations. For example, Zhu et al. (2019) assume the node features follow a Gaussian distribution and use the variance information to determine the attention scores. Tang et al. (2020) use clean graphs and their adversarial counterparts to train an attention mechanism that learns to assign small attention scores to perturbed edges. In Zhang and Zitnik (2020), the authors define an attention mechanism based on the similarity of neighboring nodes.
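To make the outlier-sensitivity argument concrete, the toy NumPy example below contrasts mean aggregation with a coordinate-wise median, one simple instance of a robust aggregation; it is only a sketch and not the specific aggregation proposed in the works cited above.

```python
import numpy as np

def mean_aggregate(neighbor_feats):
    # Standard mean aggregation: a single outlier neighbor can shift
    # the result arbitrarily far.
    return neighbor_feats.mean(axis=0)

def median_aggregate(neighbor_feats):
    # Coordinate-wise median: the output stays bounded as long as fewer
    # than half of the neighbors are corrupted.
    return np.median(neighbor_feats, axis=0)

# Example: 4 clean neighbors plus one adversarial outlier
clean = np.ones((4, 3))
outlier = np.full((1, 3), 100.0)
feats = np.vstack([clean, outlier])

print(mean_aggregate(feats))    # ~[20.8, 20.8, 20.8] -> heavily distorted
print(median_aggregate(feats))  # [1., 1., 1.]        -> unaffected
```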
Different from existing approaches to robustify GNNs, in this work, we propose a novel robust and
interpretable message passing scheme with adaptive structure (ASMP). Based on ASMP, a family
of GNN models with adaptive structure (ASGNN) can be designed. Prior works have revealed
that the message passing processes in a class of GNNs are actually (unrolled) gradient steps for
solving a graph signal denoising (GSD) problem (Zhu et al., 2021; Ma et al., 2021; Zhang and Zhao,
2022). ASMP itself is generated by an alternating (proximal) gradient descent algorithm for
simultaneously denoising the graph signal and the graph structure. Designed in such a principled
way, ASMP is not only friendly to back-propagation training but also achieves the desired structure
adaptivity with a theoretical convergence guarantee. Once trained, ASMP can be naturally interpreted
as a parameter-optimized iterative algorithm. This work falls into the category of GNN architecture designs. Conceptually different from existing robustified GNNs that operate on a fixed graph structure, ASGNN interweaves the graph purification process and the message passing process, which makes it possible to conduct message passing over different graph structures at different layers, i.e., in an adaptive graph structure fashion. Thus, an edge might be excluded in some layers but included in other layers, depending on the dynamic structure learning process. This property allows more fine-grained handling of perturbations than existing graph purification methods that use a single graph
in the entire GNN. To be more specific, the major contributions of this work are highlighted in the
following.
• We propose a novel message passing scheme over graphs called ASMP with convergence guarantee and specifications. To the best of our knowledge, ASMP is the first message passing scheme with adaptive structure that is designed based on an optimization problem.
• Based on ASMP, a family of GNN models with adaptive structure, named ASGNN, is further introduced. The adaptive structure in ASGNN allows more fine-grained handling of noisy graph structures and strengthens the model robustness against adversarial attacks.
• Extensive experiments under various adversarial attack scenarios showcase the superiority of the proposed ASGNN. The numerical results corroborate that the adaptive structure property inherent in ASGNN can help mitigate the impact of perturbed graph structures.
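To convey the intuition behind adaptive-structure message passing, the Python sketch below mimics one layer that alternates between a feature (graph signal) update and a structure update. It is only an illustrative toy under made-up update rules and parameter names (`asmp_like_layer`, `alpha`, `beta`, `tau`); the actual ASMP updates are derived from the optimization problem introduced later in the paper.

```python
import numpy as np

def soft_threshold(X, tau):
    # Proximal operator of the entry-wise L1 norm (promotes sparsity).
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def asmp_like_layer(H, S, A_obs, alpha=0.5, beta=0.5, tau=0.05):
    """One alternating step: (i) smooth node features H over the current
    learned structure S, (ii) refine S toward feature similarity while
    staying close to the observed adjacency A_obs and remaining sparse."""
    # (i) feature step: gradient-descent-like smoothing over S
    d = S.sum(axis=1) + 1e-8
    S_rw = S / d[:, None]                        # row-normalized structure
    H_new = (1 - alpha) * H + alpha * S_rw @ H   # propagate / denoise features
    # (ii) structure step: pull S toward A_obs and feature similarity, then sparsify
    sim = H_new @ H_new.T
    S_new = soft_threshold((1 - beta) * A_obs + beta * sim, tau)
    S_new = np.clip(S_new, 0.0, 1.0)             # keep edge weights in [0, 1]
    return H_new, S_new

# Stacking several such layers yields message passing over a structure
# that can change from layer to layer.
```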
2 PRELIMINARIES AND BACKGROUND
An unweighted graph with self-loops is denoted as $G = (V, E)$, where $V$ and $E$ denote the node set and the edge set, respectively. The graph adjacency matrix is given by $A \in \mathbb{R}^{N \times N}$. We denote by $\mathbf{1}$ and $I$ the all-one column vector and the identity matrix, respectively. Given $D = \mathrm{Diag}(A\mathbf{1}) \in \mathbb{R}^{N \times N}$ as the diagonal degree matrix, the Laplacian matrix is defined as $L = D - A$. We denote by $A_{\mathrm{rw}} = D^{-1}A$ the random walk (or row-wise) normalized adjacency matrix and by $A_{\mathrm{sym}} = D^{-1/2} A D^{-1/2}$ the symmetrically normalized adjacency matrix.
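For concreteness, these standard matrices can be assembled from the adjacency matrix as in the short NumPy snippet below (a toy three-node graph with self-loops; variable names are illustrative).

```python
import numpy as np

# Toy unweighted graph with self-loops on 3 nodes (adjacency A)
A = np.array([[1., 1., 0.],
              [1., 1., 1.],
              [0., 1., 1.]])

deg = A @ np.ones(A.shape[0])          # degree vector A1
D = np.diag(deg)                       # diagonal degree matrix
L = D - A                              # graph Laplacian
A_rw = np.linalg.inv(D) @ A            # random-walk normalization D^{-1} A
D_inv_sqrt = np.diag(deg ** -0.5)
A_sym = D_inv_sqrt @ A @ D_inv_sqrt    # symmetric normalization D^{-1/2} A D^{-1/2}
```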