
In the context of labeling criteria varying across centers, how can we effectively incorporate the
commonly used FL pipeline (e.g., FedAvg) to jointly learn an FL model in the desired label space?
Figure 1: Illustration of the problem setting and our proposed FedMT method. (a) We consider different label spaces (i.e., the desired label space $Y$ with $K$ classes and the other space $\widetilde{Y}$ with $J$ classes) whose classes may overlap, such as $\widetilde{Y}_1$ and $Y_2$. Annotation within the desired label space is typically more challenging and resource-intensive, resulting in a scarcity of labeled samples. (b) We employ a fixed label space correspondence matrix $M$ to establish associations between label spaces, effectively linking $\widetilde{Y}$ and $Y$. Our method, denoted FedMT (T), locally corrects class scores $f$ using $M$ within the FedAvg framework. In instances where the correspondence matrix $M$ is unknown, we propose a pseudo-label based method to estimate $\widehat{M}$. Subsequently, FedMT (E) incorporates $\widehat{M}$ into the loss function to correct class scores.
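The pseudo-label based estimation of $\widehat{M}$ can be sketched as counting co-occurrences between pseudo labels in the desired space and true labels in the other space, then row-normalizing. The function name, argument order, and the $J \times K$ shape convention below are our illustrative assumptions, not necessarily the paper's exact formulation:

```python
import numpy as np

def estimate_correspondence(pseudo_labels, coarse_labels, K, J):
    """Estimate a J x K correspondence matrix M-hat: count how often each
    pseudo label in the desired space (K classes) co-occurs with each true
    label in the other space (J classes), then normalize each row."""
    counts = np.zeros((J, K))
    for y_hat, y_tilde in zip(pseudo_labels, coarse_labels):
        counts[y_tilde, y_hat] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    # Avoid division by zero for coarse classes with no samples.
    return np.divide(counts, row_sums,
                     out=np.zeros_like(counts), where=row_sums > 0)
```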
Problem setting: To address this question, we consider a simplified but generalizable setting, illustrated in Fig. 1, in which two types of labeling criteria coexist in FL. The label spaces are not necessarily nested: a class in one label space may overlap with multiple classes in the other (e.g., disease diagnoses often exhibit imperfect agreement). Additionally, drawing inspiration from a healthcare scenario, we assume limited availability of labeled data ($<5\%$) within the desired label space. One supercenter adheres to the complex labeling criterion of the desired label space, while the others use a distinct, simpler labeling criterion. The supercenter serves as the coordinating server for FL but also performs local model updates like the other clients. All centers jointly train an FL model following the standard FL training protocol, as shown in Fig. 1 (b).
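The standard FL training protocol referenced above (e.g., FedAvg) aggregates client models on the server by a sample-size-weighted average of their parameters. A minimal sketch, with illustrative names:

```python
import numpy as np

def fedavg_aggregate(client_weights, client_sizes):
    """FedAvg server step: average each parameter tensor across clients,
    weighting each client by its local sample count."""
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(w[i] * (n / total) for w, n in zip(client_weights, client_sizes))
        for i in range(n_params)
    ]
```

In the setting above, the supercenter would both run this aggregation step and contribute its own locally updated model to the average.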
Under the problem setting described above, one alternative approach to handling different label spaces is personalized FL (Collins et al., 2021). However, such methods often fail to exploit the inherent correspondence between the label spaces. Transfer learning (Yang et al., 2019), which pre-trains a model in one space and fine-tunes it in another, is another possible solution in FL, but suboptimal pre-training may lead to negative transfer (Chen et al., 2019). Other centralized strategies that pool all data features for similarity comparison through complex training procedures (Hu et al., 2022) increase privacy risk and are impractical for widely used FL methods such as FedAvg. In light of these limitations, we aim to address two key challenges: a) simultaneously leveraging different types of labels and their correspondences without additional feature exchange, and b) learning the FL model end-to-end.
In this work, we introduce a plug-and-play method called FedMT. This versatile strategy integrates seamlessly with various FL pipelines, such as FedAvg. Specifically, all centers use models with an identical architecture whose output dimension matches the number of classes in the desired label space. To exploit client data from the other label space for supervision, we employ a probability projection that aligns the two spaces by mapping class scores.
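The probability projection can be sketched as follows: softmax scores over the $K$ desired classes are mapped through the correspondence matrix to $J$ coarse-class probabilities, which are then supervised by the coarse label. The function name and $J \times K$ shape convention are our illustrative assumptions:

```python
import numpy as np

def projected_cross_entropy(logits, y_tilde, M):
    """Cross-entropy in the other (coarse) label space: K-dimensional
    softmax probabilities are projected through M (J x K) to J coarse
    probabilities, then scored against the coarse label y_tilde."""
    p = np.exp(logits - logits.max())   # numerically stable softmax over K classes
    p = p / p.sum()
    q = M @ p                           # projected probabilities over J coarse classes
    return -np.log(q[y_tilde] + 1e-12)  # epsilon guards against log(0)
```

Because the projection only transforms model outputs, each client's update remains a standard local gradient step, which is what lets FedMT plug into FedAvg unchanged.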
Contributions: Our contributions are multifaceted. First, we explore a novel and underexplored problem setting: FL under mixed-type labels. This setting is particularly significant in real-world applications, notably medical care. Second, we propose FedMT, a novel, versatile, and computationally efficient FL method. Third, we provide a theoretical analysis of the generalization error when learning from mixed-type labels with a projection matrix. Lastly, our approach outperforms competing methods at prediction in the desired label space, as demonstrated by extensive experiments on benchmark datasets in different settings and on real medical data. Additionally, as a byproduct, we can also predict in the other label space, where we observe improved classification compared with other