modal brain networks, since heterogeneous structures and representations exist across the modalities. However, most existing studies largely ignore these issues and achieve sub-optimal results.
To address these issues, we propose a Multi-modal Dynamic Graph Convolution Network (MDGCN) that models multi-modal complementary associations with dynamic graphs. Our network allows tighter coupling of context between multiple modalities by representing the functional and structural connectomes dynamically and providing a compositional space for reasoning. Specifically, we first parse both the functional and structural connectomes into dynamic graphs with embedded representations as nodes. A correspondence factor matrix is introduced to capture the correspondence between each pair of nodes across modalities and serves as the adjacency matrix. Multi-modal representations are then aggregated by a Bilateral Graph Convolution (BGC) layer for complementary message passing. Extensive experiments on three datasets demonstrate that our proposed method outperforms other baselines in the prediction of Mild Cognitive Impairment (MCI), Parkinson's Disease (PD), and Schizophrenia (SCHZ), with accuracies of 90.4%, 85.9%, and 98.3% respectively.
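To make the cross-modal aggregation concrete, the following is a minimal NumPy sketch of one plausible form of such an update; the affinity-based correspondence matrix and the tanh update rule are our illustrative assumptions, not the BGC definition given in Section III.

```python
import numpy as np

def bilateral_graph_convolution(H_f, H_s, W_f, W_s):
    """Illustrative cross-modal aggregation step (a sketch, not the
    paper's exact BGC layer).

    H_f, H_s : (M, d) functional / structural node embeddings.
    W_f, W_s : (d, d_out) weight matrices (learnable in practice).
    """
    # Correspondence factor matrix: pairwise affinity between the two
    # modalities' node embeddings, used as a dynamic adjacency matrix.
    S = H_f @ H_s.T                                        # (M, M)
    S = S - S.max(axis=-1, keepdims=True)                  # numerical stability
    A = np.exp(S) / np.exp(S).sum(axis=-1, keepdims=True)  # row-wise softmax

    # Each modality aggregates messages from the other, weighted by the
    # correspondence matrix (complementary message passing).
    H_f_new = np.tanh(A @ H_s @ W_s)    # functional nodes read structural
    H_s_new = np.tanh(A.T @ H_f @ W_f)  # structural nodes read functional
    return H_f_new, H_s_new
```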
The rest of the paper is structured as follows. Section II reviews related work on connectome study and multi-modal models. Section III introduces the details of the proposed model. Section IV describes the disease classification experiments of the proposed model on three datasets and reports the experimental results. Section V draws the conclusions of the work.
II. RELATED WORKS
A. Brain connectome network study
Given the flexibility of rs-fMRI and DTI in uncovering complex biological mechanisms, deep learning methods have been widely adopted to examine and analyze connectome patterns. Convolutional neural networks (CNNs) and graph neural networks (GNNs) have become useful tools for brain connectome embedding, where high-dimensional neuroimaging features are embedded into a low-dimensional space that preserves their context and captures topological attributes. BrainNetCNN [11] treats brain connectome networks as grid-like data and measures topological locality in the connectome, and has achieved promising performance in disease diagnosis and phenotype prediction.
Apart from convolutional neural networks, graph neural networks retain a state that represents information about the neighbors, providing a powerful way to explore dependencies between nodes. However, applying a graph network directly to the brain connectome is problematic. On one hand, brain networks have sophisticated, non-linear structures. For example, most existing methods use the derived functional connectivity as the adjacency matrix, a measure that is linear between two brain regions; such derived linear connectivities fail to model the complex associations between brain regions. On the other hand, graph convolution networks explicitly require a known graph structure, which is not available in the brain connectome. Several strategies have been proposed to tackle the unknown-structure issue [12]–[14]. In particular, dynamic graph convolution methods model graph structures adaptively to characterize intrinsic brain connectome representations and achieve promising predictive performance [9], [15]. Nevertheless, there is still a lack of studies tackling multi-modal connectome graphs.
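As a minimal illustration of the contrast drawn above, the sketch below compares a fixed adjacency taken from linear (Pearson) functional connectivity with an adjacency re-derived adaptively from learned node embeddings; the embedding-similarity rule is a generic stand-in for the dynamic-graph idea, not the specific construction of [9], [15].

```python
import numpy as np

# Static adjacency: linear functional connectivity, i.e. the Pearson
# correlation between every pair of regional BOLD time series.
def static_adjacency(ts):              # ts: (M, T) regions x time points
    return np.corrcoef(ts)             # (M, M), fixed once computed

# Dynamic adjacency: re-estimated from the current node embeddings, so
# the graph structure can adapt as the embeddings are learned.
def dynamic_adjacency(H):              # H: (M, d) node embeddings
    S = H @ H.T                        # pairwise embedding similarity
    S = S - S.max(axis=-1, keepdims=True)
    return np.exp(S) / np.exp(S).sum(axis=-1, keepdims=True)
```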
B. Multi-modal connectome learning
Existing multi-modal connectome learning methods can be categorized into two classes: feature learning methods and deep learning methods. Compared with feature learning methods [16]–[18], which leverage feature selection to identify disease-related features, deep learning methods can capture intrinsically meaningful representations and achieve better performance. [19] devised a calibration mechanism to fuse fMRI and DTI information into edges. [20] proposed to perform a two-layer convolution on the fMRI and DTI data simultaneously. [8] regularizes convolution on functional connectivity with a structural graph Laplacian. However, most of these studies cannot sufficiently model the complementary associations between modalities, since they lack joint compositional reasoning over both the functional and structural connectome networks.
III. METHOD
The proposed Multi-modal Dynamic Graph Convolution Network (MDGCN) aims at parsing multi-modal representations into dynamic graphs and performing graph aggregation for message passing. In this section, we first introduce the brain graph definition and then detail the proposed method.
A. Preliminaries
Brain network graph: The brain networks derived from neuroimages are usually symmetric positive definite (SPD) matrices $X \in \mathbb{R}^{M \times M}$, where $M$ denotes the number of brain regions. Each element $x_{i,j}$ denotes a co-variance or connectivity strength between regions $i$ and $j$. The brain network is usually formulated as an undirected graph $G = (V, E, H)$, where $V$ is a finite set of vertices with $|V| = M$ and $E \in \mathbb{R}^{M \times M}$ denotes the edges of the graph. Both nodes and edges are represented by the derived SPD matrix $X$. For each vertex $v_i$, the node feature vector $h_i$ is constructed from the $i$-th row (equivalently, column) of the SPD matrix, $h_i = \{x_{i,k} \mid k = 1, 2, \ldots, M\}$. The edges are taken from the matrix directly, with each element assigned as $e_{i,j} = x_{i,j}$.
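Under these definitions, a brain graph can be assembled from the SPD matrix $X$ directly; the following is a minimal sketch (function and variable names are ours):

```python
import numpy as np

def build_brain_graph(X):
    """Assemble G = (V, E, H) from an SPD connectivity matrix X (M x M)."""
    M = X.shape[0]
    assert np.allclose(X, X.T), "X must be symmetric"
    V = np.arange(M)      # one vertex per brain region
    E = X.copy()          # edge weights: e_ij = x_ij
    H = X.copy()          # node features: h_i is the i-th row of X
    return V, E, H
```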
Multi-modal brain graph: The multi-modal brain graphs are constructed from the functional and structural brain networks derived from fMRI and DTI respectively. An input $\hat{G}$ is expressed as a tuple of graphs $\hat{G} = \{G_s, G_f\}$, where $G_s$ and $G_f$ denote the structural and functional brain network graphs respectively. Formally, given a set of graphs $\{\hat{G}_1, \hat{G}_2, \ldots, \hat{G}_N\}$ with a few labeled graph instances, the aim of the study is to decide the state of the unlabeled graphs as a graph classification task.
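A minimal sketch of the resulting multi-modal data structure, reusing `build_brain_graph` from the sketch above and a hypothetical `subjects` loader:

```python
# Pair the two modalities into one multi-modal sample; X_f and X_s are a
# subject's functional (fMRI) and structural (DTI) M x M connectivity
# matrices, and build_brain_graph is the sketch defined earlier.
def build_multimodal_graph(X_f, X_s):
    G_s = build_brain_graph(X_s)   # structural brain graph G_s
    G_f = build_brain_graph(X_f)   # functional brain graph G_f
    return (G_s, G_f)              # the tuple G_hat = {G_s, G_f}

# Graph classification setup: each tuple is one instance, labeled by
# disease state; `subjects` is a hypothetical iterable of (X_f, X_s, y).
dataset = [(build_multimodal_graph(X_f, X_s), y)
           for X_f, X_s, y in subjects]
```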