Over-the-Air Federated Learning with Privacy
Protection via Correlated Additive Perturbations
Jialing Liao, Zheng Chen, and Erik G. Larsson
Department of Electrical Engineering (ISY), Linköping University, Linköping, Sweden
Email: {jialing.liao, zheng.chen, erik.g.larsson}@liu.se

This work is supported by Security Link, ELLIIT, and the KAW foundation.
Abstract—In this paper, we consider privacy aspects of wireless federated learning (FL) with Over-the-Air (OtA) transmission of gradient updates from multiple users/agents to an edge server. OtA FL enables the users to transmit their updates simultaneously with linear processing techniques, which improves resource efficiency. However, this setting is vulnerable to privacy leakage since an adversary node can directly overhear the uncoded messages. Traditional perturbation-based methods provide privacy protection at the cost of training accuracy, due to the reduced signal-to-noise ratio. In this work, we aim to simultaneously minimize the privacy leakage to the adversary and the degradation of model accuracy at the edge server. More explicitly, spatially correlated perturbations are added to the gradient vectors at the users before transmission. Owing to the zero-sum property of the correlated perturbations, the side effect of the added perturbations on the aggregated gradients at the edge server can be minimized. Meanwhile, the added perturbations are not canceled out at the adversary, which prevents privacy leakage. Theoretical analysis of the perturbation covariance matrix, differential privacy, and model convergence is provided, based on which an optimization problem is formulated to jointly design the covariance matrix and the power scaling factor to balance privacy protection against convergence performance. Simulation results validate that the correlated perturbation approach provides strong defense ability while guaranteeing high learning accuracy.
I. INTRODUCTION
As one instance of distributed machine learning, federated learning (FL) was developed by Google in 2016; it allows clients to train a model collaboratively by exchanging local gradients or parameters instead of raw data [1]. Research on FL over wireless networks has attracted wide attention from various perspectives, such as communication and energy efficiency, and privacy and security issues [2], [3].
Communication efficiency is an important design aspect of wireless FL schemes due to the need for data aggregation over a large set of distributed nodes with limited communication resources. Recently, Over-the-Air (OtA) computation has been applied for model aggregation in wireless FL by exploiting the waveform superposition property of multiple-access channels [4], [5]. Under OtA FL, edge devices can transmit local gradients or parameters simultaneously, which is more resource-efficient than traditional orthogonal multiple access schemes.
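As a minimal numerical sketch of this principle (assuming perfect channel-inversion precoding and ignoring receiver noise, which are simplifications rather than the setting analyzed in this paper), the sum of the local gradients can be read off directly from the superposed signal:

```python
import numpy as np

rng = np.random.default_rng(0)
K, d = 10, 4                                      # number of users, gradient dimension
grads = rng.normal(size=(K, d))                   # local gradient vectors (real-valued)
h = rng.normal(size=K) + 1j * rng.normal(size=K)  # complex fading channels

# Each user inverts its own channel; all users transmit in the same resource block.
tx = grads / h[:, None]

# The multiple-access channel superposes the waveforms at the receiver.
rx = np.sum(h[:, None] * tx, axis=0)

# The receiver obtains the desired sum of gradients in a single channel use.
assert np.allclose(rx.real, grads.sum(axis=0))
```

In an orthogonal scheme, the same aggregation would require $K$ separate channel uses, which is the resource-efficiency gap that OtA computation closes.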
Despite the extensive research on wireless FL, recent works have shown that traditional FL schemes are still vulnerable to inference attacks that use the local updates to recover local training data [6], [7]. One solution is to reduce information disclosure, which motivates the use of compression methods such as dropout, selective gradient sharing, and dimensionality reduction [8]–[10], with the drawbacks of limited defense ability and no accuracy guarantee. Cryptographic techniques, such as secure multi-party computation and homomorphic encryption [11], [12], can provide strong privacy guarantees, but incur higher computation and communication costs and are hard to implement in practice. Due to their easy implementation and high efficiency, perturbation methods such as differential privacy (DP) [13] or the CountSketch matrix [14] have been developed. The DP technique can effectively quantify the difference in output caused by a change in individual data and reduce information disclosure by adding noise that follows some distribution (e.g., Gaussian, Laplacian, Binomial) [13], [15]. In the context of FL, one can use two DP variants by transmitting perturbed local updates or perturbed global updates, i.e., local DP and central DP [16]. However, DP-based methods fail to achieve high learning accuracy and strong defense ability at the same time, due to the reduction of the signal-to-noise ratio (SNR), which ultimately limits their application.
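For concreteness, recall the standard Gaussian mechanism [13]: a query $f$ with $\ell_2$-sensitivity $\Delta$ is released with additive Gaussian noise, and for $\epsilon \in (0,1)$ a well-known sufficient condition for $(\epsilon, \delta)$-DP is

```latex
\mathcal{M}(x) = f(x) + \mathcal{N}(\mathbf{0}, \sigma^2 \mathbf{I}), \qquad
\sigma \ge \frac{\Delta \sqrt{2 \ln(1.25/\delta)}}{\epsilon}, \qquad
\Delta = \max_{x,\, x' \text{ adjacent}} \| f(x) - f(x') \|_2 .
```

This makes the accuracy–privacy tension explicit: a smaller $\epsilon$ (stronger privacy) forces a larger noise variance and hence a lower effective SNR.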
To address this issue, in this paper we design an efficient perturbation method for OtA FL with strong defense ability that does not significantly compromise the learning accuracy. Unlike traditional DP methods that add uncorrelated noise, we add spatially correlated perturbations to the local updates at the different users/agents. We let the perturbations from different users sum to zero at the edge server, such that the learning accuracy is not compromised (with only a slightly decreased SNR, since some power is diverted from the actual data transmission). On the other hand, the perturbations do not cancel at the adversary, due to the misalignment between the intended channel and the eavesdropping channel, which prevents privacy leakage.
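A minimal sketch of this idea, using mean-subtraction as one simple way to generate zero-sum correlated perturbations (the covariance design is optimized later in the paper; the channel ratios below are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(1)
K, d = 10, 6                                   # users, gradient dimension

# Draw independent Gaussian noise, then subtract the across-user mean.
# The resulting perturbations are spatially correlated across users, with
# covariance proportional to I_K - (1/K) * 1 * 1^T, and sum exactly to zero.
raw = rng.normal(size=(K, d))
perturb = raw - raw.mean(axis=0, keepdims=True)
assert np.allclose(perturb.sum(axis=0), 0.0)   # cancels in the aggregate

# At the adversary, each user's perturbation arrives through a different
# (mismatched) channel ratio, so the cancellation no longer happens.
mismatch = rng.normal(size=(K, 1))             # hypothetical channel ratios
residual = (mismatch * perturb).sum(axis=0)
print(np.linalg.norm(residual))                # nonzero in general
```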
A. Related Work
The authors in [17] developed a hybrid privacy-preserving FL scheme that adds perturbations to both local gradients and model updates to defend against inference attacks. In [18], the client anonymity in OtA FL was exploited by randomly sampling the participating devices and distributing the perturbation generation across clients, to ensure privacy resilience against client failures. Without adversaries but with a curious server, the trade-offs between learning accuracy, privacy, and wireless resources were discussed in [19]. Later on, the authors of [20] developed privacy-preserving FL schemes under orthogonal multiple access (OMA) and OtA transmission, respectively, proving
that the inherent anonymity of OtA channels can hide local updates and thereby ensure high privacy. This framework was extended to a reconfigurable intelligent surface (RIS)-enabled OtA FL system by exploiting the channel reconfigurability offered by the RIS [21]. However, the aforementioned approaches reduce privacy leakage at the cost of degraded learning accuracy.
To this end, the authors in [22] developed a server-aware perturbation method where the server can eliminate the perturbations before aggregation, which requires extra processing and coordination. A more efficient way to balance accuracy and privacy is to guarantee that the inserted perturbations add up to zero. To the best of our knowledge, this strategy has not been explored in wireless FL, although similar ideas exist in the literature on consensus and secure sharing. For instance, pair-wise secure keys were exploited in [23], where each user masked its local update via random keys assigned in pairs with opposite signs, such that the keys add up to zero. In [24], the perturbation was generated with temporal correlation and a geometrically decreasing variance over iterations, such that the perturbations add up to zero after multiple iterations. Compared with these methods, we provide a fundamental analysis of general spatially correlated perturbations based on the covariance matrix, rather than the special case considered in [23]. Though the privacy analysis is carried out in the context of the Gaussian mechanism, extensions to other distributions are possible.
II. SYSTEM MODEL
Fig. 1. A federated edge learning system with an adversary that can eavesdrop on the local gradients transmitted from the devices.

As shown in Fig. 1, we consider a wireless FL system where $K$ single-antenna devices intend to transmit gradient updates to an edge server with OtA computation. An adversary is located near one of the users and intends to overhear the transmissions and infer knowledge about the training data. Each user $k \in \mathcal{K} = \{1, 2, \ldots, K\}$ has a local dataset $\mathcal{D}_k = \{(\mathbf{u}_i^k, v_i^k)\}_{i=1}^{D_k}$ composed of $D_k$ data points, where $\mathbf{u}_i^k$ is the $i$-th data point and $v_i^k$ is the corresponding label. The global dataset is then denoted by $\mathcal{D} = \bigcup_{k=1}^{K} \mathcal{D}_k$, with total size $D_{\mathrm{tot}} = \sum_{k=1}^{K} D_k$. For brevity, we assume that the users have equal-sized datasets, i.e., $D_k = D$, $\forall k \in \mathcal{K}$.¹ The size of the global dataset is thereby $D_{\mathrm{tot}} = KD$.

¹The results can be extended to the case of datasets with distinct sizes, as that does not affect the main structure of the privacy analysis.
Suppose the users jointly train a learning model $\mathbf{w} \in \mathbb{R}^d$ by minimizing the global loss function $F(\mathbf{w})$, i.e., $\mathbf{w}^{*} = \arg\min_{\mathbf{w}} F(\mathbf{w})$. FL is an iterative process where, in every round, each user $k$ obtains its local gradient vector $\nabla F_k(\mathbf{w})$ using its local dataset. Then, the edge server estimates the global gradient vector by aggregating the received gradient vectors from the users, and broadcasts the updated model parameter vector $\mathbf{w}$ to all users. In total, $T$ rounds of iteration are considered, indexed by $t \in \{1, 2, \ldots, T\}$.
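As an illustration, the following sketch runs this iteration on a toy quadratic loss (the learning rate `lr` and the loss function are placeholders, not the setup analyzed later):

```python
import numpy as np

rng = np.random.default_rng(2)
K, d, T, lr = 5, 3, 100, 0.1
w = np.zeros(d)                                 # global model w
targets = rng.normal(size=(K, d))               # toy local data

def local_gradient(k, w):
    # Gradient of the toy local loss F_k(w) = 0.5 * ||w - targets[k]||^2.
    return w - targets[k]

for t in range(T):                              # T rounds of iteration
    grads = np.stack([local_gradient(k, w) for k in range(K)])
    global_grad = grads.mean(axis=0)            # aggregation at the edge server
    w = w - lr * global_grad                    # model update, sent back to users

# The iterates converge to the minimizer of the global loss.
assert np.allclose(w, targets.mean(axis=0), atol=1e-3)
```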
We assume that the edge server and the users are all honest. The external adversary, however, is honest-but-curious, meaning that it does not attempt to perturb the aggregated gradients but only eavesdrops on the gradient information in order to infer knowledge about the local datasets. Note that in this paper we focus on the uplink transmission of the local gradient updates from the users to the edge server, which belongs to the setting of local DP. The privacy leakage in the downlink transmission of the global model updates to the users is left for future work.
III. PROBLEM FORMULATION
A. Communication Protocol with Correlated Perturbations
Let $\mathbf{x}_k^{(t)}$ represent the transmitted signal from the $k$-th user to the edge server during the uplink transmission of local gradient updates in the $t$-th round/iteration. The received signal at the edge server is
$$\mathbf{y}^{(t)} = \sum_{k=1}^{K} h_k^{(t)} \mathbf{x}_k^{(t)} + \mathbf{z}^{(t)}, \qquad (1)$$
where $h_k^{(t)} \in \mathbb{C}$ is the channel gain from user $k$. The channel noise $\mathbf{z}^{(t)}$ is independently and identically distributed (i.i.d.) over the iterations, and follows $\mathcal{CN}(\mathbf{0}, N_0 \mathbf{I}_d)$. To reduce the information leakage to the adversary, we add perturbations to introduce randomness in the transmitted gradient data. This means that instead of transmitting the true gradient $\nabla F_k(\mathbf{w}^{(t)})$, the $k$-th user transmits the following noisy update²
$$\mathbf{x}_k^{(t)} = \alpha_k^{(t)} \left( \nabla F_k(\mathbf{w}^{(t)}) + \mathbf{n}_k^{(t)} \right), \qquad (2)$$
where $\alpha_k^{(t)} \in \mathbb{C}$ denotes the transmit scaling factor, given by
$$\alpha_k^{(t)} = \sqrt{\eta^{(t)}} \big/ h_k^{(t)}, \qquad (3)$$
and $\eta^{(t)} \in \mathbb{R}^{+}$ is the common power scaling factor. The transmitted signal consists of two components: the local gradient $\nabla F_k(\mathbf{w}^{(t)})$, and the $d \times 1$ artificial noise vector $\mathbf{n}_k^{(t)} \in \mathbb{C}^d$. It is assumed that each user has a limited power budget $P$, i.e.,
$$\mathbb{E}\big[\|\mathbf{x}_k^{(t)}\|^2\big] \le P. \qquad (4)$$
Substituting the transmitted signal $\mathbf{x}_k^{(t)}$ into (1), the received signal at the edge server becomes
$$\mathbf{y}^{(t)} = \sum_{k=1}^{K} \sqrt{\eta^{(t)}} \left( \nabla F_k(\mathbf{w}^{(t)}) + \mathbf{n}_k^{(t)} \right) + \mathbf{z}^{(t)}. \qquad (5)$$
²To utilize both the real part and the imaginary part, we split $\nabla F_k(\mathbf{w}^{(t)})$ to construct a complex vector with components $[\nabla F_k(\mathbf{w}^{(t)})]_i + j [\nabla F_k(\mathbf{w}^{(t)})]_{i+d/2}$, $i = 1, \ldots, d/2$. For simplicity, we keep the notations $\nabla F_k(\mathbf{w}^{(t)})$ and $d$. A de-splitting process is performed at the receiver nodes.
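To make the uplink model concrete, the following end-to-end sketch simulates (1)–(5) for one round, including the real/imaginary splitting of footnote 2. The Rayleigh channels, the mean-subtraction construction of the zero-sum perturbations, and the adversary's channels are illustrative assumptions, not the designs derived later in the paper:

```python
import numpy as np

rng = np.random.default_rng(3)
K, d, eta, N0 = 8, 4, 1.0, 1e-3       # users, gradient dim (even), power scale, noise

grads = rng.normal(size=(K, d))       # local gradients, one row per user

def split(v):                         # footnote 2: real d-vector -> complex d/2-vector
    return v[: d // 2] + 1j * v[d // 2 :]

def desplit(c):                       # inverse operation at the receiver
    return np.concatenate([c.real, c.imag])

h = (rng.normal(size=K) + 1j * rng.normal(size=K)) / np.sqrt(2)  # channels to server
g = (rng.normal(size=K) + 1j * rng.normal(size=K)) / np.sqrt(2)  # channels to adversary

# Zero-sum spatially correlated perturbations (illustrative construction).
n = rng.normal(size=(K, d // 2)) + 1j * rng.normal(size=(K, d // 2))
n -= n.mean(axis=0, keepdims=True)

alpha = np.sqrt(eta) / h              # Eq. (3): channel-inversion scaling
sg = np.stack([split(grads[k]) for k in range(K)])
x = alpha[:, None] * (sg + n)         # Eq. (2): noisy update

def awgn():                           # circularly symmetric noise, CN(0, N0 I)
    return np.sqrt(N0 / 2) * (rng.normal(size=d // 2) + 1j * rng.normal(size=d // 2))

y_server = (h[:, None] * x).sum(axis=0) + awgn()   # Eqs. (1), (5)
y_adv = (g[:, None] * x).sum(axis=0) + awgn()      # eavesdropped signal

est = desplit(y_server) / np.sqrt(eta)             # aggregated-gradient estimate
print(np.linalg.norm(est - grads.sum(axis=0)))     # small: perturbations cancel
print(np.linalg.norm(desplit(y_adv) / np.sqrt(eta) - grads.sum(axis=0)))  # large
```

At the server, the channel inversions align all contributions, so the perturbations cancel and only the channel noise remains; at the adversary, the mismatched ratios $g_k/h_k$ leave both the gradients and the perturbations scrambled.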