
methods assume that training labels (node / graph level) for the corresponding tasks (e.g., classification / regression problems at the node or whole-graph level) are readily available at the clients and that a global model is trained end-to-end in a federated fashion. However, in many cross-silo applications, clients might have very little or no labeled data. It is well known that annotating node / graph data takes considerable time and resources [18, 62], e.g., the difficulty of obtaining explicit user feedback in social network applications and the cost of in vitro experiments for biological networks. Moreover, certain clients may be unwilling to share labels due to competition or other regulatory reasons. (2) Downstream task heterogeneity: it is reasonable to assume that while clients may share the same graph data domain, the downstream tasks may be client-dependent and vary significantly across clients. It is also reasonable to expect that some clients may add new downstream tasks at a later point, where a model supervised by previous tasks may be ineffective.
With these observations, we propose a realistic and unexplored problem setting for FedGRL: Participating clients have a shared space of graph-structured data, though the distributions may differ across clients; clients have access to vast amounts of unlabeled data; and they may have very different local downstream tasks with very few private labeled data points. Fundamentally, our problem setting asks if one can leverage unlabeled data across clients to learn a shared graph representation (akin to “knowledge transfer”) which can then be further personalized to perform well on the local downstream tasks at each client. In a centralized training regime, a number of works that utilize GNN pre-training [18, 38] and self-supervision [44, 45, 51, 53] have shown the benefits of such approaches in dealing with label deficiency and transfer learning scenarios, which motivates us to explore and utilize them for the proposed FedGRL problem setting.
In this paper, we propose a novel FedGRL formulation based on model interpolation, where we aim to learn a shared global model that is optimized collaboratively using a self-supervised objective and receives downstream task supervision through local client models. We provide a specific instantiation of our general formulation using BGRL [45], a SoTA self-supervised graph representation learning method, and we empirically verify its effectiveness on realistic cross-silo datasets: (1) we adapt the Twitch Gamer Network, which naturally simulates a cross-geo scenario, and show that our formulation provides consistent gains, on avg. 6.1% over traditional supervised federated learning objectives and on avg. 1.7% over individual client-specific self-supervised training; and (2) we construct and introduce a new cross-silo dataset called Amazon Co-purchase Networks that has both characteristics of the motivating problem setting. We first show how standard supervised federated objectives can result in negative gains (on avg. -4.16%) compared to individual client-specific supervised training, due to the increased data heterogeneity and limited label availability. We then experimentally verify the effectiveness of our method and observe on avg. 11.5% gains over traditional supervised federated learning and on avg. 1.9% gains over individually trained self-supervised models. Both experimental results point to the effectiveness of our proposed formulation.
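To make the formulation concrete, below is a minimal PyTorch-style sketch of one client round under this split. It is one possible reading of the model-interpolation idea, not the exact algorithm of Sec. 4: Encoder, ssl_loss, and alpha are illustrative placeholders, the plain MLP merely stands in for a GNN encoder, and in our instantiation the self-supervised term would be a BGRL-style bootstrap loss.

```python
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

class Encoder(nn.Module):
    # Stand-in for a GNN encoder: in practice this would operate on
    # (node features, adjacency), e.g., GCN or GraphSage.
    def __init__(self, d_in: int, d_hid: int):
        super().__init__()
        self.f = nn.Sequential(nn.Linear(d_in, d_hid), nn.ReLU(),
                               nn.Linear(d_hid, d_hid))

    def forward(self, x):
        return self.f(x)

def client_round(global_enc, local_enc, head, x_unlab, x_lab, y_lab,
                 ssl_loss, alpha=0.5, lr=1e-3, steps=10):
    """One client round: (i) refine a copy of the shared encoder with a
    self-supervised objective on local unlabeled data, (ii) interpolate
    the refined weights into the private encoder, and (iii) fit the
    private encoder/head on the few local labels."""
    enc = copy.deepcopy(global_enc)
    opt = torch.optim.Adam(enc.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        ssl_loss(enc, x_unlab).backward()  # e.g., a BGRL-style loss
        opt.step()

    # Model interpolation: mix shared weights into the private encoder.
    with torch.no_grad():
        for p_loc, p_shr in zip(local_enc.parameters(), enc.parameters()):
            p_loc.mul_(1.0 - alpha).add_(alpha * p_shr)

    # Downstream task supervision stays local to the client.
    params = list(local_enc.parameters()) + list(head.parameters())
    opt = torch.optim.Adam(params, lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        F.cross_entropy(head(local_enc(x_lab)), y_lab).backward()
        opt.step()

    return enc.state_dict()  # sent to the server for aggregation
```

Under this reading, only the self-supervised encoder state is returned for server-side aggregation; the interpolated encoder and the task head remain private to each client.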
The remainder of this paper is organized as follows: in Sec. 2 we review relevant work on FL for graph-structured data, self-supervised techniques for GNNs, and finally some recent work on tackling label deficiency with FL. In Sec. 3 we introduce notation and some preliminaries. In Sec. 4 we provide a detailed problem setup and introduce our formulation and its instantiations. In Sec. 5 we describe the detailed experimental setup, and we finally present experimental results in Sec. 6.
2 RELATED WORK
The broad field of designing GNNs for graph representation learning (GRL) receives detailed coverage in recent surveys [3, 14, 49]. We refer the reader to [20, 27, 28] for an overview of FL methods.
2.1 Federated Learning for Graphs
FedGRL is a new research topic and current works have considered
the following two main problem formulations.
First, for node-level tasks (predicting node labels), there are three sub-categories based on the degree of overlap in graph nodes across clients: (1) No node overlap between client graphs. Here, each client maintains a GNN model which is trained on the local node labels, and the server aggregates the parameters of the client GNN models and communicates them back in every federation round [5, 55, 56] (see the FedAvg-style sketch after this list). ASFGNN [56] additionally tackles the non-IID data issue using split-based learning, and FedGraph [5] focuses on efficiency and utilizes a privacy-preserving cross-client GNN convolution operation. FedSage [55] considers a slightly different formulation, wherein each client has access to disjoint subgraphs of some global graph. They utilize GraphSage [13], train it with label information, and further propose to train a missing-neighbor generator to deal with missing links across local subgraphs. (2) Partial node overlap across clients. Here, each participating client holds subgraphs which may have overlapping nodes with other clients’ graphs. GraphFL [48] considers this scenario and utilizes a meta-learning based federated learning algorithm to personalize client models to downstream tasks. [36] considers overlapping nodes in local client knowledge graphs and utilizes them to translate knowledge embeddings across clients. (3) Complete node overlap across clients. Here all clients hold the same set of nodes; they upload node embeddings instead of model parameters to the server for FL aggregation. Existing works focus on vertically partitioned citation network data [35, 57]. Note that all the above problem settings differ from ours in motivation, as we focus on label deficiency and downstream task heterogeneity.
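The parameter aggregation step shared by the methods in sub-category (1) is essentially FedAvg applied to GNN weights. The sketch below is a generic weighted-average implementation for illustration, not the code of any particular cited system:

```python
import torch

def fedavg(client_states, weights=None):
    """Average client model state_dicts (identical keys and shapes),
    optionally weighted, e.g., by local training-set size; weights
    are assumed to sum to 1."""
    if weights is None:
        weights = [1.0 / len(client_states)] * len(client_states)
    avg = {}
    for key in client_states[0]:
        avg[key] = sum(w * s[key].float()
                       for w, s in zip(weights, client_states))
    return avg

# Each federation round the server broadcasts the result back, e.g.:
#   new_global = fedavg([m.state_dict() for m in client_models])
```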
Second, for graph-level tasks (predicting graph labels), each client has a local set of labeled graphs, and the goal is to learn one global model or personalized local models using federation. This problem setting is fundamentally similar to other federated learning settings widely considered in the vision and language domains: one only needs to replace the usual linear/DNN encoder with a graph kernel/GNN encoder to handle the graph data modality (a sketch follows below). [16] creates a benchmark towards this end. The issue of client data non-IID-ness carries over to the graph domain as well, and [50] utilizes client clustering to aggregate model parameters.
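As an illustration of this encoder swap, here is a minimal graph-level classifier sketch using PyTorch Geometric; depth and dimensions are arbitrary choices, not taken from [16] or [50]. Dropping such a model into an otherwise standard federated training loop is all that the graph modality requires:

```python
import torch
import torch.nn as nn
from torch_geometric.nn import GCNConv, global_mean_pool

class GraphClassifier(nn.Module):
    """A GNN encoder with mean-pooling readout, playing the role the
    linear/DNN encoder plays in vision/language FL pipelines."""
    def __init__(self, d_in: int = 32, d_hid: int = 64, n_classes: int = 2):
        super().__init__()
        self.conv1 = GCNConv(d_in, d_hid)
        self.conv2 = GCNConv(d_hid, d_hid)
        self.head = nn.Linear(d_hid, n_classes)

    def forward(self, x, edge_index, batch):
        h = torch.relu(self.conv1(x, edge_index))
        h = torch.relu(self.conv2(h, edge_index))
        g = global_mean_pool(h, batch)  # one embedding per graph
        return self.head(g)
```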