Shielding Federated Learning: Mitigating Byzantine Attacks with Less Constraints
Minghui Li∗, Wei Wan∗, Jianrong Lu∗, Shengshan Hu∗, Junyu Shi∗,
Leo Yu Zhang†, Man Zhou∗, and Yifeng Zheng‡
∗Huazhong University of Science and Technology, Wuhan, China
†Deakin University, Melbourne, Australia
‡Harbin Institute of Technology, Shenzhen, China
Abstract—Federated learning is a newly emerging distributed
learning framework that facilitates the collaborative training
of a shared global model among distributed participants with
their privacy preserved. However, federated learning systems
are vulnerable to Byzantine attacks from malicious partici-
pants, who can upload carefully crafted local model updates
to degrade the quality of the global model and even leave a
backdoor. While this problem has received significant attention
recently, current defensive schemes heavily rely on various assumptions, such as a fixed Byzantine model, availability of participants' local data, a minority of attackers, an IID data distribution, etc.
To relax these constraints, this paper presents Robust-FL, the first prediction-based Byzantine-robust federated learning scheme that requires none of these assumptions. The core idea of Robust-FL is to exploit historical global models to construct an estimator, based on which local models are filtered through similarity detection. We then cluster local
models to adaptively adjust the acceptable differences between
the local models and the estimator such that Byzantine users
can be identified. Extensive experiments over different datasets
show that our approach achieves the following advantages
simultaneously: (i) independence of participants' local data, (ii) tolerance of a majority of attackers, and (iii) generalization to variable Byzantine models.
Index Terms—Federated Learning, Byzantine Attacks, Byzan-
tine Robustness, Privacy Protection
1. Introduction
Recently emerged federated learning (FL) [29] is a new
computing paradigm that trains a global machine learning
model over distributed data while protecting participants’
privacy. By distributing the model learning process to par-
ticipants, FL constructs a global model from user-specific
local models, such that participants’ data never leaves their
own devices. In this way, the bandwidth cost is significantly
reduced and user privacy is well protected.
Due to its decentralized nature, FL is vulnerable to
Byzantine attacks [12], [22], where malicious participants
can falsify real models or gradients to damage the learn-
ing process, or directly poison the training data to make
the global model learn wrong information or even leave a
backdoor. In the literature, various attack methods have been
proposed to demonstrate the vulnerabilities of FL. For ex-
ample, pixel-pattern backdoor attack [9] adds a pre-defined
pixel pattern to a fraction of training data and modifies
the corresponding labels. The label flipping attack [7] trains the local model on correct samples combined with flipped
labels. These two attacks aim at reducing the recognition
rate of the local models by tampering with the training data.
Another kind of attack method focuses on manipulating the
local models, such as bit-flip attack [27] which modifies a
part of the local model parameters by flipping specified bits,
and sign-flipping attack [14] which flips the signs of local
model parameters and enlarges their magnitudes. Recently, a distributed backdoor attack [24] was proposed to show the
possibility of uniting multiple participants to conduct an
attack, where a backdoor trigger can be decomposed and
embedded into different adversarial parties.
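To make the flavor of these model-manipulation attacks concrete, the following minimal sketch (in Python/NumPy) shows how a sign-flipping adversary could transform an honest local update before uploading it; the scaling factor gamma and the flattened-update representation are illustrative assumptions, not details taken from [14].

import numpy as np

def sign_flipping_attack(honest_update, gamma=3.0):
    # Flip the sign of every parameter and enlarge the magnitude by gamma
    # (gamma is an illustrative scaling factor, not a value from [14]).
    return -gamma * honest_update

# A stand-in flattened local model update from an honest participant.
honest_update = np.random.randn(10)
malicious_update = sign_flipping_attack(honest_update)
print(np.sign(honest_update[:3]), np.sign(malicious_update[:3]))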
To mitigate Byzantine attacks, a growing number of defense schemes have been proposed [2], [3], [18], [23], [26]. They mainly focus on comparing participants' local models to remove anomalous ones before aggregation. These solutions, however, suffer from various limitations that make them unsuitable for practical use. For ex-
ample, the famous defense scheme Multi-Krum [2] assumes
that data is independently and identically distributed (IID)
and cannot deal with Non-IID datasets. FABA [23] assumes
a fixed Byzantine model and needs to know the number of malicious participants in advance. DiverseFL [18] requires a portion of each participant's local dataset to help detect anomalous models, which clearly violates
the privacy principle of FL. The most recently proposed
defense FLTrust [4] is not able to identify the Byzantine
participants. A comprehensive comparison among existing
defensive schemes is shown in Table 1.
To get rid of these limitations, we propose Robust-
FL, the first prediction-based Byzantine-robust FL scheme.
Different from existing works that focus on making use of
local models in the current iteration, Robust-FL aims to
construct an estimator based on the historical global models
from previous rounds. Local models that significantly differ from the current estimator are considered more likely to be malicious and are discarded. In detail, we first make use of exponential smoothing to
construct the estimator, which makes detection highly efficient, especially when federated learning involves a large number of clients. We then propose using a small public dataset (i.e., fewer than 10 samples) to train an initial global
model, which is crucial for improving the detection accu-
racy.
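As a rough illustration of this prediction step, the following sketch (Python/NumPy) forms the estimator by exponential smoothing over flattened historical global models and discards local models that lie far from it; the smoothing factor alpha and the fixed threshold are placeholders, since Robust-FL derives the acceptable difference adaptively through clustering.

import numpy as np

def update_estimator(prev_estimator, latest_global_model, alpha=0.5):
    # Simple exponential smoothing: blend the newest global model with the
    # previous estimate (alpha is an illustrative smoothing factor).
    return alpha * latest_global_model + (1 - alpha) * prev_estimator

def filter_local_models(local_models, estimator, threshold):
    # Keep only the local models whose Euclidean distance to the estimator
    # stays below `threshold`; Robust-FL determines this boundary adaptively
    # via clustering rather than using a fixed value.
    distances = [np.linalg.norm(m - estimator) for m in local_models]
    return [m for m, d in zip(local_models, distances) if d < threshold]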
In summary, we make the following contributions:
• We propose a new Byzantine-robust federated learning scheme called Robust-FL. To the best of our knowledge, Robust-FL is the first prediction-based defense scheme that can mitigate Byzantine attacks effectively and efficiently without relying on any impractical assumptions.
• We propose incorporating clustering algorithms to adaptively adjust the differences between the estimator and the local models, such that a boundary between benign and malicious models can be effectively established to identify Byzantine participants (see the sketch after this list).
• We conduct extensive experiments to evaluate Robust-FL. The results show that Robust-FL remains effective even when more than 50% of participants are compromised, the Byzantine models are variable, and the participants' data are unavailable, whereas all existing defenses fail under this severe scenario.
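The clustering idea in the second contribution can be illustrated by the following sketch, which splits the per-client distances to the estimator into two groups and keeps the group closer to the estimator as benign. This is a minimal two-means illustration (using scikit-learn) of the boundary-finding step, not the exact adaptive procedure used by Robust-FL.

import numpy as np
from sklearn.cluster import KMeans

def split_by_distance(distances):
    # Cluster the per-client distances to the estimator into two groups and
    # treat the group with the smaller mean distance as benign; the other
    # group is flagged as Byzantine.
    d = np.asarray(distances, dtype=float).reshape(-1, 1)
    labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(d)
    benign = min((0, 1), key=lambda c: d[labels == c].mean())
    return [i for i, lab in enumerate(labels) if lab == benign]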
2. Related Work
In order to resist Byzantine attacks, researchers have
proposed many defensive schemes in recent years. We divide
them into three categories according to the principles the server relies on to detect or evade anomalous local models.
Distance-based defenses: The first category focuses on
comparing the distances between the local models to find
out anomalous ones. Krum [2] aims to choose one local model that is closest to its K − f − 2 neighbors, where K is the number of participants and f is the number of malicious users. Since Krum converges slowly, the authors introduced its variant Multi-Krum, which chooses K − f local models for aggregation rather than just one. Similar
to Multi-Krum, FABA [23] iteratively removes the local
model that is farthest from the average model until the
number of eliminated models is f. FoolsGold [7] uses cosine
similarity to identify malicious models and then assigns
them smaller weights to reduce their impact on the averaged
global model. Sniper [3] selects local models for aggregation
based on a graph which is constructed according to the
Euclidean distances between the local models. The PCA
scheme [20] projects local updates into two-dimensional
space and uses a clustering algorithm to find malicious
updates. MAB-RFL [21] is also equipped with PCA and a clustering algorithm to identify malicious updates; in addition, a momentum-based approach is applied to tackle the data heterogeneity (i.e., Non-IID) challenge. All these
solutions (except MAB-RFL), however, only work well over
independently and identically distributed (IID) data, and
they cannot tolerate more than 50% attackers. Besides, most
of them need to know the number of attackers in advance.
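As a reference point for this distance-based family, the core of Krum [2] can be sketched as follows: each update is scored by the sum of squared distances to its K − f − 2 nearest neighbors, and the update with the smallest score is selected. The sketch assumes flattened NumPy updates and is a simplified rendering, not the authors' implementation.

import numpy as np

def krum_select(updates, f):
    # Score each update by the sum of squared distances to its K - f - 2
    # nearest neighbors and return the index of the lowest-scoring update.
    K = len(updates)
    n_neighbors = K - f - 2
    scores = []
    for i, u in enumerate(updates):
        dists = sorted(float(np.sum((u - v) ** 2))
                       for j, v in enumerate(updates) if j != i)
        scores.append(sum(dists[:n_neighbors]))
    return int(np.argmin(scores))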
Statistics-based defenses: The second category exploits
the statistical characteristics to remove statistical outliers.
Instead of performing detection-then-aggregation, Trimmed
Mean [30] directly uses all the local updates to obtain a
new global model, by computing the median or the trimmed
mean of all local models in each dimension. Geometric
Median [25] intends to find a new update that minimizes
the summation of the distances between the update and each
local model. The RFA scheme [17] computes the geometric
median of the local models with an alternating minimization
approach to reduce the computational overhead. Bulyan [16]
first uses Multi-Krum to remove malicious models and
then aggregates the remaining models based on Trimmed Mean.
SLSGD [26] also adopts Trimmed Mean as the aggrega-
tion rule, and then uses a newly proposed moving-average
method, which considers the global models of both the current and the previous round. Nevertheless, the above schemes cannot identify
Byzantine users, and they perform poorly when there are
more than 50% Byzantine users.
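The coordinate-wise idea behind Trimmed Mean [30] can be sketched as below: in every dimension, the b largest and b smallest values across clients are dropped before averaging. The trim level b and the flattened-update representation are illustrative assumptions.

import numpy as np

def coordinate_trimmed_mean(local_models, b):
    # In each dimension, drop the b largest and b smallest values across
    # clients and average the remaining ones.
    stacked = np.stack(local_models)          # shape: (num_clients, dim)
    sorted_vals = np.sort(stacked, axis=0)    # sort each coordinate independently
    return sorted_vals[b: len(local_models) - b].mean(axis=0)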
Performance-based defenses: The last category de-
pends on the validation dataset to evaluate the performance
of the uploaded parameters. Li et al. [14] proposed us-
ing a pre-trained autoencoder to detect malicious models.
Zeno [27] computes the stochastic descendant score for each
gradient based on a validation dataset and then removes
the gradients with low stochastic descendant scores. Cao
et al. [5] proposed a Byzantine-robust distributed gradient
algorithm, which computes a noisy gradient based on a clean
dataset, and a gradient is accepted only when its distance to the noisy gradient satisfies a pre-defined condition.
Prakash et al. [18] proposed DiverseFL, which first com-
putes a guiding gradient for each user based on the data
the user shares, and then two similarity metrics (Direction
Similarity and Length Similarity) between the local gradient
and the corresponding guiding gradient are considered; only when both metrics are satisfied will the gradient be accepted.
FLTrust [4] bootstraps trust with a clean training dataset
collected by the server. More specifically, the ReLU-clipped cosine similarity between each local update and the server update (calculated on the clean dataset) is employed to
reweight the local update, such that malicious updates have
a limited impact on the global model. However, the algorithm in [14] requires a large amount of data to obtain benign models on which the autoencoder is trained, but in reality it is hard to collect so much data. Although the other four schemes
require little data, they have other limitations. For instance, Zeno needs to know the number of attackers in advance;
scheme [5] relies on an appropriate hyper-parameter to dis-
tinguish benign gradients from malicious ones; DiverseFL
compels users to share their private data, which violates the
original intention of FL; FLTrust cannot identify Byzantine
users, which means that malicious updates can also partici-
pate in aggregation and degrade the accuracy of the global
model.
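The FLTrust-style reweighting described above can be sketched as follows: each local update receives a trust score equal to the ReLU-clipped cosine similarity with the server update, is rescaled to the server update's norm, and the scores serve as aggregation weights. This is a simplified rendering of FLTrust [4] over flattened NumPy updates, not the reference implementation.

import numpy as np

def fltrust_aggregate(local_updates, server_update, eps=1e-12):
    # Trust score: ReLU-clipped cosine similarity with the server update
    # (the server update is computed on the clean root dataset).
    server_norm = np.linalg.norm(server_update)
    scores, rescaled = [], []
    for u in local_updates:
        cos = float(np.dot(u, server_update)) / (np.linalg.norm(u) * server_norm + eps)
        scores.append(max(cos, 0.0))                      # ReLU clipping
        # Rescale each local update to the norm of the server update.
        rescaled.append(u * server_norm / (np.linalg.norm(u) + eps))
    total = sum(scores) + eps
    return sum(s * u for s, u in zip(scores, rescaled)) / total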