Generalized energy and gradient flow
via graph framelets
Andi Han, Dai Shi, Zhiqi Shao, Junbin Gao
Abstract
In this work, we provide a theoretical understanding of framelet-based graph neural networks through the perspective of energy gradient flow. By viewing framelet-based models as discretized gradient flows of some energy, we show that they can induce both low-frequency- and high-frequency-dominated dynamics via the separate weight matrices for different frequency components. This substantiates their good empirical performance on both homophilic and heterophilic graphs. We then propose a generalized energy via framelet decomposition and show that its gradient flow leads to a novel graph neural network, which includes many existing models as special cases. We then explain how the proposed model generally leads to more flexible dynamics, thus potentially enhancing the representation power of graph neural networks.
1 Introduction
Graph neural networks (GNNs) [20, 22, 28, 42, 46] have become the primary tool for representation learning over graph-structured data, such as social networks [9], citation networks [28], molecules [18] and traffic networks [14], among others. There generally exist two types of GNN models, i.e., spatial- and spectral-based models. Spatial GNNs, including MPNN [21], GAT [42] and GIN [49], usually propagate information in the neighbourhood and update node representations via a weighted average of the neighbours. Spectral GNNs, including ChebyNet [15], GCN [28] and BernNet [26], perform filtering on the spectral domain provided by the graph Fourier transform (where the orthonormal system is given by the eigenvectors of the graph Laplacian). Wavelet-based graph representation learning [17, 23, 31, 43, 48, 51, 52, 53, 58], a class of spectral methods, provides a multi-resolution analysis of graph signals and thus often leads to better signal representations by capturing information at different scales. In particular, graph framelets [17, 53], a type of tight wavelet frame, further allow separate modelling of low-pass and high-pass signal information, and have been used to define graph convolution as in [8, 51, 53]. It has been shown that graph framelet convolution allows more flexible control of approximate (via low-pass filters) and detailed (via high-pass filters) information with great robustness and efficiency, achieving state-of-the-art results on multiple graph learning tasks [8, 50, 51, 55, 56, 58].
University of Sydney (andi.han@sydney.edu.au, zsha2911@uni.sydney.edu.au, junbin.gao@sydney.edu.au)
Western Sydney University (dai.shi@sydney.edu.au)
arXiv:2210.04124v1 [cs.LG] 8 Oct 2022

Along with the development of more advanced models, the theoretical understanding of both the power and pitfalls of graph neural networks has attracted great attention. Currently, there exist many known limitations of GNNs, including over-smoothing [5, 30], over-squashing [1, 41], limited expressive power [33, 49] and poor performance on heterophilic graphs [37, 57]. In an attempt
to better resolve the above issues, many studies have tried to understand GNNs through various frameworks, such as dynamical systems and differential equations [6, 16, 36, 40, 45], Riemannian geometry [34, 41] and algebraic topology [4, 24]. Nevertheless, existing analyses are limited to either spatial GNNs or spectral GNNs based on the Fourier transform. In fact, it has been empirically observed that wavelet/framelet-based models tend to alleviate the previously known issues, such as mitigating over-smoothing [43] and achieving good performance on heterophilic graphs [8]. Despite the success of wavelet-based models, comparatively little is known about how the multi-scale and frequency-separation properties provided by graph wavelets/framelets potentially avoid the aforementioned pitfalls and enhance the learning capacity of GNNs.
In this paper, we particularly focus on graph framelet-based models and analyze their behaviors from the perspective of energy gradient flows. In physics, the evolution of particles is often modelled by differential equations that minimize an energy, known as gradient flows. The energy functional and its gradient flow provide a characterization of particles' states and movements, which is essential for understanding the dynamics. Recently, this idea has been adapted to graph neural networks [16]. By modelling GNNs as (discretized) gradient flows, one can study the limiting behaviors of GNNs, which are closely related to the notions of over-smoothing, over-separating and heterophilic graphs. In particular, [16] characterizes the dynamics of GNNs in terms of low-frequency or high-frequency dominance, or both. It is known that low-frequency-dominant (LFD) models tend to perform well on homophilic graphs (where neighbouring nodes are likely to come from the same cluster) [16, 29, 35, 46], while in contrast, high-frequency-dominant (HFD) models tend to perform well on heterophilic graphs [3, 16]. In this regard, models that can be both LFD and HFD are usually preferred.
To the best of our knowledge, this is the first work that provides a theoretical understanding of multi-scale graph neural networks and a justification for their good empirical performance. Specifically, we highlight our main contributions as follows.

• We show that the framelet convolutions, either spatial-based [8] or spectral-based [51], can be viewed as discretized gradient flows of some energy. From this point of view, we prove that framelet-based GNNs can induce both LFD and HFD dynamics, thus theoretically explaining their effectiveness on heterophilic graphs and their capability to potentially avoid over-smoothing, as often empirically observed.

• We then define a generalized energy via framelet decomposition. We show that this energy includes the energy proposed in [16], which is itself a generalization of the graph Dirichlet energy. We then propose a GNN model, namely gradient-flow-based framelet graph convolution (GradF-UFG), as a discretization of the gradient flow of the proposed generalized energy.

• We show that the proposed GradF-UFG includes the framelet-based GNNs as special cases. We also connect the proposed model with the recently introduced energy-enhanced framelet convolution and analyze its behaviors. We explain how the proposed model provides more flexibility compared to existing works.
Organization. The rest of the paper is organized as follows. Section 2 first summarizes related works on continuous GNNs and their connections to dynamical systems and differential equations. We also provide an overview of framelet-based graph representation learning. In Section 3, we review preliminary knowledge on graphs, graph framelets and framelet convolutions. We also introduce the notions of graph Dirichlet energy and gradient flow, along with the definitions of LFD and HFD dynamics. Section 4 starts by identifying a connection between the spatial framelet convolution and the Dirichlet energy, and also characterizes its asymptotic behavior through the lens of gradient flow. In Section 5, we define the framelet generalized energy and propose a GNN model based on its gradient flow, where we also connect to existing works. Finally, in Section 6, we perform a similar analysis for the spectral framelet convolution.
2 Related works
Dynamical systems, differential equations and GNNs. Neural ODE [12] was introduced as a continuous version of ResNet [25]. Since then, many works have studied its counterparts on graphs and proposed continuous GNNs, such as [2, 38, 47]. Another parallel research direction aims to understand GNNs through continuous dynamical systems and differential equations. GCN [28], one of the most popular GNN models, can be viewed as a discrete Markov process [36], and its linear version has been verified as a discretized (isotropic) graph heat diffusion [45], which minimizes the graph Dirichlet energy. Similarly, GAT [42] is related to an anisotropic heat diffusion on graphs, as shown in [6]. Furthermore, many recent studies introduce GNNs inspired by physical systems and diffusion PDEs, including heat diffusion with a source term [40], non-Euclidean Beltrami flow [7], the wave equation [19], networks of coupled oscillators [39] and nonlinear anisotropic diffusion [11]. The recent work [16] provides a framework for analyzing continuous formulations of GNN models as gradient flows of some energy. In particular, the work proposes a general energy whose gradient flow leads to many existing GNN models. Under this framework, [16] verifies that many models can only lead to LFD dynamics, including GCN [28], GRAND [6], CGNN [47] and PDE-GCN [19].
Framelet-based graph learning. The work [17] provides an efficient way to compute graph framelet transforms via Chebyshev polynomial approximation, which enables practical applications such as semi-supervised clustering. Graph spectral framelet convolution has been proposed in [51, 53] and is shown to empirically enhance graph neural networks with its robustness and multi-scale properties. Later, graph framelets have been applied to graph signal denoising [55], robust graph embedding [54], dynamic graphs [56] and directed graph learning [58], exhibiting great improvements in model performance. More recently, [8] proposes a spatial graph framelet convolution and shows its close connection to the Dirichlet energy. The paper also introduces a perturbed Dirichlet energy in order to mitigate the over-smoothing issue.
3 Preliminaries
Graphs and graph convolution. A graph $\mathcal{G} = (\mathcal{V}_{\mathcal{G}}, \mathcal{E}_{\mathcal{G}})$ of $n$ nodes (with self-loops) can be represented by a graph adjacency matrix $A \in \mathbb{R}^{n \times n}$. In this work, we assume the graph is undirected and unweighted, i.e., $A$ is symmetric with $a_{ij} = 1$ if $(i, j) \in \mathcal{E}_{\mathcal{G}}$ and $0$ otherwise. In addition, we consider the symmetric normalized adjacency matrix $\widehat{A} = D^{-1/2} A D^{-1/2}$, where $D$ is the diagonal degree matrix with the $i$-th diagonal entry given by $d_i = \sum_j a_{ij}$, the degree of node $i$. The normalized graph Laplacian is given by $\widehat{L} = I_n - \widehat{A}$. We use $\rho_L$ to denote the largest eigenvalue (also called the highest frequency) of $\widehat{L}$. From spectral graph theory [13], $\rho_L \le 2$, and the equality holds if and only if there exists a connected component of the graph $\mathcal{G}$ that is bipartite.
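As a quick illustration of these definitions, the sketch below (a hypothetical 4-node toy graph, not taken from the paper) builds $\widehat{A}$ and $\widehat{L}$ with NumPy and checks that the spectrum of $\widehat{L}$ lies in $[0, 2]$:

```python
import numpy as np

# Toy undirected graph on n = 4 nodes (with self-loops, as assumed in the text):
# a 4-cycle 0-1, 1-2, 2-3, 3-0, plus self-loops on the diagonal.
A = np.array([[1, 1, 0, 1],
              [1, 1, 1, 0],
              [0, 1, 1, 1],
              [1, 0, 1, 1]], dtype=float)

# Symmetric normalization: A_hat = D^{-1/2} A D^{-1/2}.
d = A.sum(axis=1)                       # node degrees d_i = sum_j a_ij
D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
A_hat = D_inv_sqrt @ A @ D_inv_sqrt

# Normalized Laplacian L_hat = I_n - A_hat and its spectrum.
L_hat = np.eye(A.shape[0]) - A_hat
eigvals = np.linalg.eigvalsh(L_hat)

# All frequencies lie in [0, rho_L] with rho_L <= 2.
print(eigvals.min() >= -1e-10, eigvals.max() <= 2 + 1e-10)  # True True
```

For this toy graph the spectrum of $\widehat{L}$ is $\{0, 2/3, 2/3, 4/3\}$, strictly below $2$, consistent with the graph (a cycle with self-loops) having no bipartite component.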
Graph convolutional network (GCN) [28] defines the layer-wise propagation rule via the normalized adjacency matrix as
$$H(\ell+1) = \sigma\big(\widehat{A}\, H(\ell)\, W_\ell\big), \qquad (1)$$
where $H(\ell)$ denotes the feature matrix at layer $\ell$ with $H(0) = X \in \mathbb{R}^{n \times c}$, the input features (also called input signals), and $W_\ell$ is the learnable feature transformation. It can be shown that GCN corresponds to a localized filter via the graph Fourier transform, i.e., $h(\ell+1) = U^\top (I_n - \Lambda)\, U h(\ell)$, where $U, \Lambda$ are from the eigendecomposition $\widehat{L} = U^\top \Lambda U$ and $Uh$ is known as the Fourier transform of a graph signal $h \in \mathbb{R}^n$. In this paper, we let $\{(\lambda_i, u_i)\}_{i=1}^n$ be the set of eigen-pairs of $\widehat{L}$, where $u_i$ are the row vectors of $U$.
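The equivalence between the spatial update and the spectral filter can be checked numerically; the following sketch (the same hypothetical toy graph as above, with random features) verifies that $\widehat{A} H W$ coincides with $U^\top (I_n - \Lambda)\, U H W$:

```python
import numpy as np

rng = np.random.default_rng(0)
n, c = 4, 3

# Toy 4-cycle with self-loops (hypothetical example, not from the paper).
A = np.array([[1, 1, 0, 1],
              [1, 1, 1, 0],
              [0, 1, 1, 1],
              [1, 0, 1, 1]], dtype=float)
d = A.sum(axis=1)
A_hat = A / np.sqrt(np.outer(d, d))     # D^{-1/2} A D^{-1/2}
L_hat = np.eye(n) - A_hat

# Eigendecomposition L_hat = U^T Lambda U, rows of U being eigenvectors u_i.
lam, V = np.linalg.eigh(L_hat)          # columns of V are eigenvectors
U = V.T

H = rng.standard_normal((n, c))         # input signals
W = rng.standard_normal((c, c))         # feature transformation W_l

# Spatial propagation (Eq. (1), pre-activation) ...
spatial = A_hat @ H @ W
# ... equals the spectral filter U^T (I - Lambda) U applied to H W.
spectral = U.T @ np.diag(1.0 - lam) @ U @ H @ W

print(np.allclose(spatial, spectral))   # True
```

The identity holds because $\widehat{A} = I_n - \widehat{L} = U^\top (I_n - \Lambda)\, U$.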
Graph framelets and framelet convolution. Graph (undecimated) framelets [53] are defined via a filter bank $\eta = \{a; b^{(1)}, \ldots, b^{(L)}\}$ and its induced (complex-valued) scaling functions $\Psi = \{\alpha; \beta^{(1)}, \ldots, \beta^{(L)}\}$, where $L$ is the number of high-pass filters. In particular, they satisfy $\widehat{\alpha}(2\xi) = \widehat{a}(\xi)\, \widehat{\alpha}(\xi)$ and $\widehat{\beta^{(r)}}(2\xi) = \widehat{b^{(r)}}(\xi)\, \widehat{\alpha}(\xi)$ for all $\xi \in \mathbb{R}$, $r = 1, \ldots, L$, where $\widehat{\alpha}, \widehat{\beta^{(r)}}$ denote the Fourier transforms of $\alpha, \beta^{(r)}$ and $\widehat{a}, \widehat{b^{(r)}}$ denote the Fourier series of $a, b^{(r)}$, respectively. The graph framelets are defined by $\varphi_{j,p}(v) = \sum_{i=1}^n \widehat{\alpha}(\lambda_i/2^j)\, u_i(p)\, u_i(v)$ and $\psi^r_{j,p}(v) = \sum_{i=1}^n \widehat{\beta^{(r)}}(\lambda_i/2^j)\, u_i(p)\, u_i(v)$ for $r = 1, \ldots, L$ and for scale levels $j = 1, \ldots, J$. We use $u_i(v)$ to represent the eigenvector $u_i$ at node $v$. $\varphi_{j,p}$ and $\psi^r_{j,p}$ are known as the low-pass and high-pass framelets at node $p$.
The framelet coefficients of a graph signal $h$ are given by $v_0 = \{\langle \varphi_{0,p}, h \rangle\}_{p \in \mathcal{V}_{\mathcal{G}}}$ and $w^r_j = \{\langle \psi^r_{j,p}, h \rangle\}_{p \in \mathcal{V}_{\mathcal{G}}}$. For a multi-channel signal $H \in \mathbb{R}^{n \times c}$, we can compactly write its framelet coefficients as
$$V_0 = U^\top \widehat{\alpha}\Big(\frac{\Lambda}{2}\Big)\, U H, \qquad W^r_j = U^\top \widehat{\beta^{(r)}}\Big(\frac{\Lambda}{2^{j+1}}\Big)\, U H, \qquad j = 0, \ldots, J,\ r = 1, \ldots, L,$$
where $\widehat{\alpha}, \widehat{\beta^{(r)}}$ apply elementwise to the diagonal of $\Lambda$, and $V_0, W^r_j$ are respectively the low-pass and high-pass coefficients. Define the framelet transform matrices $\mathcal{W}_{0,J}, \mathcal{W}_{r,j}$ such that $V_0 = \mathcal{W}_{0,J} H = U^\top \Lambda_{0,J}\, U H$ and $W^r_j = \mathcal{W}_{r,j} H = U^\top \Lambda_{r,j}\, U H$ for $r = 1, \ldots, L$, $j = 1, \ldots, J$, where $\Lambda_{r,j}$ is a diagonal matrix with entries $(\Lambda_{0,J})_{ii} = \widehat{\alpha}(\lambda_i/2)$ and $(\Lambda_{r,j})_{ii} = \widehat{\beta^{(r)}}(\lambda_i/2^{j+1})$. By the tightness of the framelet transform, we have $\Lambda_{0,J}^2 + \sum_{r,j} \Lambda_{r,j}^2 = I_n$, and the framelet decomposition and reconstruction are invertible, i.e., $\mathcal{W}_{0,J}^\top \mathcal{W}_{0,J} H + \sum_{r,j} \mathcal{W}_{r,j}^\top \mathcal{W}_{r,j} H = H$. This property allows a unique decomposition of any graph signal onto the spectral framelet domain. Refer to [51, 53] for more detailed discussions.
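The tightness and perfect-reconstruction identities can be verified directly; the sketch below uses one scale level, one high-pass filter and an illustrative $\cos/\sin$ filter pair (chosen only so that the squared filters sum to one — not the paper's construction):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6

# A symmetric matrix with spectrum rescaled into [0, 2] stands in for L_hat;
# any such surrogate suffices for checking tightness and reconstruction.
M = rng.standard_normal((n, n))
M = M + M.T
lam_raw, V = np.linalg.eigh(M)
lam = 2 * (lam_raw - lam_raw.min()) / (lam_raw.max() - lam_raw.min())
U = V.T                                 # L_hat = U.T @ diag(lam) @ U

# One-level (J = 1), one-high-pass (L = 1) filter pair with
# alpha_hat^2 + beta_hat^2 = 1 by construction.
Lam_low = np.diag(np.cos(np.pi * lam / 4))
Lam_high = np.diag(np.sin(np.pi * lam / 4))

W_low = U.T @ Lam_low @ U               # transform matrix W_{0,J}
W_high = U.T @ Lam_high @ U             # transform matrix W_{1,1}

# Tightness: Lam_low^2 + Lam_high^2 = I_n, hence perfect reconstruction
# W_low^T W_low H + W_high^T W_high H = H.
H = rng.standard_normal((n, 3))
recon = W_low.T @ (W_low @ H) + W_high.T @ (W_high @ H)
print(np.allclose(Lam_low @ Lam_low + Lam_high @ Lam_high, np.eye(n)),
      np.allclose(recon, H))            # True True
```

Reconstruction follows because $\mathcal{W}^\top \mathcal{W} = U^\top \Lambda^2 U$ for each band, and the diagonal parts sum to $I_n$.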
The spectral graph framelet convolution is proposed in [51]; similarly to graph (Fourier) convolution, it applies a filter in the spectral domain before reconstructing the signal. The layer-wise propagation rule is given by
$$H(\ell+1) = \sigma\Big(\mathcal{W}_{0,J}^\top\, \mathrm{diag}(\theta_{0,J})\, \mathcal{W}_{0,J}\, H(\ell)\, W_\ell + \sum_{r,j} \mathcal{W}_{r,j}^\top\, \mathrm{diag}(\theta_{r,j})\, \mathcal{W}_{r,j}\, H(\ell)\, W_\ell\Big),$$
where $\theta_{r,j} \in \mathbb{R}^n$ is a learnable filter coefficient and $W_\ell$ is a weight matrix shared across all $r, j$ for layer $\ell$. Rather than performing spectral filtering as in [51], the spatial graph framelet convolution performs spatial message passing over the spectral framelet domain [8] as
$$H(\ell+1) = \mathcal{W}_{0,J}^\top\, \sigma\big(\widehat{A}\, \mathcal{W}_{0,J}\, H(\ell)\, W^\ell_{0,J}\big) + \sum_{r,j} \mathcal{W}_{r,j}^\top\, \sigma\big(\widehat{A}\, \mathcal{W}_{r,j}\, H(\ell)\, W^\ell_{r,j}\big).$$
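To make the spatial update rule concrete, here is a sketch of one spatial framelet layer. The function name, the one-level tight filter pair and the $\tanh$ activation are all illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def spatial_framelet_layer(H, A_hat, W_transforms, W_weights, sigma=np.tanh):
    """One spatial framelet convolution layer: message passing with A_hat is
    applied separately in each framelet band with its own weight matrix,
    then the bands are recombined via the transposed framelet transforms."""
    return sum(Wt.T @ sigma(A_hat @ (Wt @ H) @ Ww)
               for Wt, Ww in zip(W_transforms, W_weights))

# --- hypothetical toy setup: 4-cycle with self-loops, one low/high-pass pair ---
rng = np.random.default_rng(2)
A = np.array([[1, 1, 0, 1], [1, 1, 1, 0], [0, 1, 1, 1], [1, 0, 1, 1]], float)
d = A.sum(axis=1)
A_hat = A / np.sqrt(np.outer(d, d))
lam, V = np.linalg.eigh(np.eye(4) - A_hat)
U = V.T

# One-level tight cos/sin filter pair, chosen only so the squares sum to one.
W_low = U.T @ np.diag(np.cos(np.pi * lam / 4)) @ U
W_high = U.T @ np.diag(np.sin(np.pi * lam / 4)) @ U

H = rng.standard_normal((4, 3))
weights = [rng.standard_normal((3, 3)) for _ in range(2)]  # W^l_{0,J}, W^l_{r,j}
H_next = spatial_framelet_layer(H, A_hat, [W_low, W_high], weights)
print(H_next.shape)  # (4, 3)
```

Note how, unlike the spectral rule, the nonlinearity and the weight matrices act inside each band before reconstruction, which is what later connects this model to the Dirichlet energy.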
In this paper, our subsequent analysis focuses on the spatial framelet convolution (or simply framelet convolution) due to its connection to the Dirichlet energy, as shown in Section 4. Nevertheless, we also obtain similar conclusions for the spectral framelet convolution in Section 6.

With a slight abuse of notation, we use $\top$ to also represent the conjugate transpose if the filters are complex-valued.