
task contains a small number of labeled nodes as references (i.e., support nodes) and several unlabeled
nodes for classification (i.e., query nodes). To extract transferable knowledge from classes with
abundant labeled nodes, the model is trained on a series of meta-training tasks that are sampled from
these disjoint classes but share similar structures with meta-test tasks. We refer to meta-training
and meta-test tasks as meta-tasks. Note that few-shot node classification can be conducted on a
single graph (e.g., a citation network for author classification) or across multiple graphs (e.g., a set
of protein-protein interaction networks for protein property predictions). Here each meta-task is
sampled from one single graph in both single-graph and multiple-graph settings, since each meta-test
task is conducted on one graph. Despite the success of recent studies on few-shot node classification,
they mainly learn node representations from the original graph (i.e., the graph that the meta-task
is sampled from). However, the original graph can be redundant and uninformative for a specific
meta-task as each meta-task only contains a small number of nodes. As a result, the learned node
representations are not tailored for the meta-task (i.e., they are not task-specific), which increases
the difficulty of few-shot learning. Thus, instead of leveraging the same original graph for all meta-tasks, it is
crucial to learn a task-specific structure for each meta-task.
Intuitively, the task-specific structure should contain nodes in the meta-task along with other relevant
nodes from the original graph. Moreover, the edge weights among these nodes should also be
learned in a task-specific manner. Nevertheless, it remains a daunting problem to learn a task-specific
structure for each meta-task due to two challenges: (1) It is non-trivial to select relevant nodes for the
task-specific structure. Particularly, this structure should contain nodes that are maximally relevant to
the support nodes in the meta-task. However, since each meta-task consists of multiple support
nodes, it is difficult to select nodes that are relevant to the entire support node set. (2) It is challenging
to learn edge weights for the task-specific structure. The task-specific structure should maintain
strong correlations for nodes in the same class, so that the learned node representations will be similar.
Nonetheless, the support nodes in the same class could be distributed across the original graph, which
increases the difficulty of enhancing such correlations for the task-specific structure learning.
To address these challenges, we propose a novel Graph few-shot Learning framework wITh Task-spEcific stRuctures (GLITTER), which aims at effectively learning a task-specific structure for each
meta-task in graph few-shot learning. Specifically, to reduce irrelevant information from the
original graph, we propose to select nodes via two strategies according to their overall node influence
on support nodes in each meta-task. Moreover, we learn edge weights in the task-specific structure
based on node influence within classes and mutual information between query nodes and labels. With
the learned task-specific structures, our framework can effectively learn node representations that are
tailored for each meta-task. In summary, the main contributions of our framework are as follows:
(1) We selectively extract relevant nodes from the original graph and learn a task-specific structure
for each meta-task based on node influence and mutual information. (2) The proposed framework
can handle graph few-shot learning under both single-graph and multiple-graph settings, whereas
most existing works focus only on the single-graph setting. (3) We conduct extensive experiments on
five real-world datasets under single-graph and multiple-graph settings. The superior performance
over the state-of-the-art methods further validates the effectiveness of our framework.
2 Problem Formulation
Denote the set of input graphs as G = {G1, . . . , GM} (for the single-graph setting, |G| = 1), where
M is the number of graphs. Here each graph can be represented as G = (V, E, X), where V is the
set of nodes, E is the set of edges, and X ∈ R^{|V|×d} is a feature matrix whose i-th row (d-dimensional)
represents the attributes of the i-th node. Under the prevalent meta-learning framework,
the training process is conducted on a series of meta-training tasks {T1, . . . , TT}, where T is the
number of meta-training tasks. More specifically, Ti = {Si, Qi}, where Si is the support set of Ti and
consists of K labeled nodes for each of N classes (i.e., |Si| = NK). The corresponding label set of
Ti is Yi, where |Yi| = N; Yi is sampled from the whole training label set Ytrain. With Si as references,
the model is required to classify nodes in the query set Qi, which contains Q unlabeled samples.
Note that the actual labels of query nodes are from Yi. After training, the model will be evaluated on
a series of meta-test tasks, which follow a similar setting to meta-training tasks, except that the label
set in each meta-test task is sampled from a distinct label set Ytest (i.e., Ytest ∩ Ytrain = ∅). It is
noteworthy that under the multiple-graph setting, meta-training and meta-test tasks can be sampled
from different graphs, while each meta-task is sampled from one single graph.
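The episode construction described above can be sketched as follows. The helper samples one N-way K-shot meta-task Ti = {Si, Qi} from a single graph's labels; the function name `sample_meta_task` and its dictionary-based inputs are illustrative assumptions, not the paper's notation.

```python
import random

def sample_meta_task(labels, train_classes, n_way, k_shot, n_query, seed=0):
    """Sample one N-way K-shot meta-training task Ti = {Si, Qi}.

    labels: dict mapping node id -> class label (for one graph)
    train_classes: the training label set Ytrain
    Returns (support, query, task_classes), where support contains N*K
    labeled reference nodes, query contains N*n_query nodes to classify,
    and task_classes is the sampled label set Yi with |Yi| = N.
    """
    rng = random.Random(seed)
    # Group labeled nodes by class, keeping only classes in Ytrain.
    by_class = {}
    for node, y in labels.items():
        if y in train_classes:
            by_class.setdefault(y, []).append(node)
    # A class is eligible only if it can supply K support and Q query nodes.
    eligible = [c for c, nodes in by_class.items()
                if len(nodes) >= k_shot + n_query]
    task_classes = rng.sample(eligible, n_way)  # Yi, sampled from Ytrain
    support, query = [], []
    for c in task_classes:
        nodes = rng.sample(by_class[c], k_shot + n_query)
        support.extend(nodes[:k_shot])   # K labeled references per class
        query.extend(nodes[k_shot:])     # query nodes, labels hidden at test time
    return support, query, task_classes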