construct the estimator, which makes detection highly efficient, especially when federated learning involves a large number of clients. We then propose using a small public dataset (i.e., fewer than 10 samples) to train an initial global model, which is crucial for improving the detection accuracy.
In summary, we make the following contributions:
• We propose a new Byzantine-robust federated learning scheme called Robust-FL. To the best of our knowledge, Robust-FL is the first prediction-based defense scheme that mitigates Byzantine attacks effectively and efficiently without relying on any unrealistic assumptions.
• We propose incorporating clustering algorithms to adaptively analyze the differences between the estimator and the local models, so that a boundary between benign and malicious models can be effectively determined to identify Byzantine participants (see the illustrative sketch after this list).
• We conduct extensive experiments to evaluate Robust-FL. The results show that Robust-FL remains effective even when more than 50% of participants are compromised, the Byzantine models vary, and the participants' data are unavailable, whereas all existing defenses fail under this severe scenario.
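As a rough illustration of the clustering idea in the second contribution, the sketch below splits clients into two groups according to their distance to an estimated global model and treats the closer group as benign. This is only an illustration under our own simplifying assumptions (flattened model vectors, Euclidean distance, 2-means clustering), not the actual Robust-FL procedure.

```python
# Illustrative sketch only: cluster clients by their distance to an estimator
# of the global model; the estimator, metric, and 2-means choice are
# hypothetical simplifications, not the Robust-FL design.
import numpy as np
from sklearn.cluster import KMeans

def split_by_distance(local_models, estimator):
    """local_models: (K, d) array of flattened client models.
    estimator: (d,) array, an estimated/predicted global model."""
    dists = np.linalg.norm(local_models - estimator, axis=1)        # per-client distance
    labels = KMeans(n_clusters=2, n_init=10).fit_predict(dists.reshape(-1, 1))
    benign = min((0, 1), key=lambda c: dists[labels == c].mean())   # closer cluster = benign
    return np.where(labels == benign)[0]                            # indices of benign clients
```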
2. Related Work
To resist Byzantine attacks, researchers have proposed many defensive schemes in recent years. We divide them into three categories according to the principle on which the server relies to detect or evade anomalous local models.
Distance-based defenses: The first category compares the distances between local models to identify anomalous ones. Krum [2] chooses the one local model that is closest to its K − f − 2 neighbors, where K is the number of participants and f is the number of malicious users. Since Krum converges slowly, the authors introduced a variant, Multi-Krum, which chooses K − f local models for aggregation rather than just one. Similar
to Multi-Krum, FABA [23] iteratively removes the local
model that is farthest from the average model until the
number of eliminated models is f. FoolsGold [7] uses cosine
similarity to identify malicious models and then assigns
them smaller weights to reduce their impact on the averaged
global model. Sniper [3] selects local models for aggregation
based on a graph which is constructed according to the
Euclidean distances between the local models. The PCA
scheme [20] projects local updates into two-dimensional
space and uses a clustering algorithm to find malicious
updates. MAB-RFL [21] also applies PCA and a clustering algorithm to identify malicious updates; in addition, a momentum-based approach is used to tackle the data heterogeneity (i.e., non-IID) challenge. All of these solutions except MAB-RFL, however, only work well over independently and identically distributed (IID) data, and none of them can tolerate more than 50% of participants being attackers. Besides, most of them need to know the number of attackers in advance.
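For concreteness, the following is a minimal sketch of the Krum/Multi-Krum selection rule as summarized above, with K and f as defined in the text. It is our simplified reading, not the reference implementation, and it assumes flattened model updates.

```python
# Simplified Krum/Multi-Krum sketch: score each update by the sum of squared
# distances to its K - f - 2 nearest neighbors; Krum keeps the single
# lowest-scoring update, Multi-Krum keeps the K - f lowest-scoring ones.
import numpy as np

def krum_select(updates, f, multi=False):
    """updates: (K, d) array of flattened local updates; f: assumed number of attackers."""
    K = len(updates)
    # Pairwise squared Euclidean distances between all local updates.
    d2 = np.linalg.norm(updates[:, None, :] - updates[None, :, :], axis=2) ** 2
    scores = np.empty(K)
    for i in range(K):
        nearest = np.sort(np.delete(d2[i], i))[: K - f - 2]   # K - f - 2 closest neighbors
        scores[i] = nearest.sum()
    order = np.argsort(scores)
    return order[: K - f] if multi else order[:1]              # indices of selected updates
```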
Statistics-based defenses: The second category exploits statistical characteristics to remove outliers.
Instead of performing detection-then-aggregation, Trimmed
Mean [30] directly uses all the local updates to obtain a
new global model, by computing the median or the trimmed
mean of all local models in each dimension. Geometric
Median [25] finds a new update that minimizes the sum of the distances between the update and each local model. The RFA scheme [17] computes the geometric median of the local models with an alternating minimization approach to reduce the computational overhead. Bulyan [16] first uses Multi-Krum to remove malicious models and then aggregates the remaining models with Trimmed Mean.
SLSGD [26] also adopts Trimmed Mean as the aggregation rule, followed by a newly proposed moving-average method that combines the global models of the current and previous rounds. Nevertheless, the above schemes cannot identify Byzantine users, and they perform poorly when more than 50% of the users are Byzantine.
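To make the statistics-based rules concrete, the following is a minimal sketch of coordinate-wise trimmed-mean aggregation and a Weiszfeld-style geometric-median iteration; the trimming parameter beta and the iteration count are illustrative assumptions, not values from the cited papers.

```python
# Simplified sketches of two statistics-based aggregation rules described above.
import numpy as np

def trimmed_mean(updates, beta):
    """Coordinate-wise trimmed mean: drop the beta largest and beta smallest
    values in each dimension across the K local models, then average."""
    K = len(updates)
    s = np.sort(updates, axis=0)
    return s[beta:K - beta].mean(axis=0)

def geometric_median(updates, iters=10, eps=1e-8):
    """Weiszfeld-style iteration for the point minimizing the sum of distances
    to all local models (the alternating-minimization idea behind RFA)."""
    z = updates.mean(axis=0)
    for _ in range(iters):
        w = 1.0 / (np.linalg.norm(updates - z, axis=1) + eps)   # inverse-distance weights
        z = (w[:, None] * updates).sum(axis=0) / w.sum()
    return z
```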
Performance-based defenses: The last category relies on a validation dataset to evaluate the performance of the uploaded parameters. Li et al. [14] proposed using a pre-trained autoencoder to detect malicious models.
Zeno [27] computes the stochastic descendant score for each
gradient based on a validation dataset and then removes
the gradients with low stochastic descendant scores. Cao
et al. [5] proposed a Byzantine-robust distributed gradient algorithm, which computes a noisy gradient based on a clean dataset; a local gradient is accepted only when its distance to the noisy gradient satisfies a pre-defined condition. Prakash et al. [18] proposed DiverseFL, which first computes a guiding gradient for each user based on the data the user shares, and then checks two similarity metrics (Direction Similarity and Length Similarity) between the local gradient and the corresponding guiding gradient; a local gradient is accepted only when both metrics are satisfied.
FLTrust [4] bootstraps trust with a clean training dataset collected by the server. More specifically, the ReLU-clipped cosine similarity between each local update and the server update (calculated on the clean dataset) is used to reweight the local update, so that malicious updates have a limited impact on the global model.
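For concreteness, the following is a condensed sketch of ReLU-clipped cosine-similarity reweighting in the spirit of FLTrust; the normalization details and variable names are our own simplified reading rather than the paper's implementation.

```python
# Simplified FLTrust-style reweighting: ReLU-clipped cosine similarity to the
# server update acts as a trust score; local updates are rescaled and averaged.
import numpy as np

def fltrust_aggregate(local_updates, server_update):
    """local_updates: (K, d) array; server_update: (d,) update computed on the
    server's clean root dataset. Returns the reweighted aggregate update."""
    g0_norm = np.linalg.norm(server_update)
    acc, total = np.zeros_like(server_update), 0.0
    for g in local_updates:
        cos = g @ server_update / (np.linalg.norm(g) * g0_norm + 1e-12)
        score = max(cos, 0.0)                                       # ReLU-clipped trust score
        acc += score * g * (g0_norm / (np.linalg.norm(g) + 1e-12))  # rescale to server norm
        total += score
    return acc / (total + 1e-12)
```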
However, the scheme in [14] requires a large amount of data to obtain benign models for training the autoencoder, which is hard to collect in practice. Although the other four schemes require little data, they have other limitations. For instance, Zeno needs to know the number of attackers in advance; the scheme in [5] relies on an appropriate hyper-parameter to distinguish benign gradients from malicious ones; DiverseFL compels users to share their private data, which violates the original intention of FL; FLTrust cannot identify Byzantine users, which means that malicious updates can still participate in aggregation and degrade the accuracy of the global model.