Over-the-Air Split Machine Learning in Wireless
MIMO Networks
Yuzhi Yang, Zhaoyang Zhang, Yuqing Tian,
Zhaohui Yang, Chongwen Huang, Caijun Zhong, and Kai-Kit Wong
Abstract—In split machine learning (ML), different partitions
of a neural network (NN) are executed by different computing
nodes, which incurs a large communication cost. Over-the-air
computation (OAC) can efficiently implement all or part of the
computation at the same time as the communication, so
substituting the wireless transmission in the traditional split ML
framework with OAC can ease the communication load. In
this paper, we propose to deploy split ML in a wireless multiple-
input multiple-output (MIMO) communication network utilizing
the intricate interplay between MIMO-based OAC and NNs. The
basic procedure of the OAC split ML system is first provided,
and we show that the inter-layer connection in a NN of any size
can be mathematically decomposed into a set of linear precoding
and combining transformations over a MIMO channel carrying
out multi-stream analog communication. The precoding and
combining matrices, which are regarded as trainable parameters,
and the MIMO channel matrix, which is regarded as an unknown
(implicit) parameter, jointly serve as a fully connected layer
of the NN. Most interestingly, the channel estimation procedure
can be eliminated by exploiting the MIMO channel reciprocity
of the forward and backward propagation, thus greatly saving
system costs and/or further improving overall efficiency.
The generalization of the proposed scheme to conventional
NNs, i.e., the widely used convolutional neural networks, is
also introduced. We demonstrate its effectiveness under both
static and quasi-static memory channel conditions with
comprehensive simulations.
Index Terms—Over-the-air computing (OAC), multiple-input
multiple-output (MIMO), split machine learning, neural network
This work was supported in part by the National Key R&D Program of China
under Grants 2020YFB1807101 and 2018YFB1801104, and by the National Natural
Science Foundation of China under Grants U20A20158 and 61725104.
Y. Yang, Z. Zhang (Corresponding Author), Y. Tian, Z. Yang, C. Huang,
and C. Zhong are with the College of Information Science and Electronic
Engineering, Zhejiang University, Hangzhou 310007, China, and with the
International Joint Innovation Center, Zhejiang University, Haining 314400,
China, and also with Zhejiang Provincial Key Lab of Information Processing,
Communication and Networking (IPCAN), Hangzhou 310007, China.
Z. Yang is also with Zhejiang Lab, Hangzhou, 311121, China. (e-mails:
{yuzhi_yang, ning_ming, tianyq, yang_zhaohui, chongwenhuang,
caijunzhong}@zju.edu.cn)
K.-K. Wong is with the Department of Electronic and Electrical Engineering,
University College London, WC1E 6BT London, UK, and also with the
School of Integrated Technology, Yonsei University, Seoul, 03722, Korea.
(email: kai-kit.wong@ucl.ac.uk)
I. INTRODUCTION
A. Motivation
In future sixth-generation (6G) wireless communication
systems, human-like intelligence will be brought everywhere
[1]. The rapid development of artificial intelligence leads
to booming mobile machine learning (ML) applications and
requires vast data interaction. The integration of ML and
wireless communications leads to an emerging area called
split ML, which distributes a neural network (NN) to several
edge devices, thus reducing the system computation burden.
In a split ML system, each device first performs forward
computation on its allocated part of the NN and then transmits
the calculated intermediate results to the next device for further
computation. Backward propagation is conducted in a similar
way but in the reverse order. Through cooperative computation,
split ML can be applied to deep learning-based joint
source-channel coding (JSCC) problems [2] and other ML problems
in cloud networks and the Internet of Things. However, split ML
requires frequent information exchange among devices, imposing a
heavy communication burden on wireless networks. Thus,
deploying split ML over wireless networks calls for the design
of new wireless techniques based on a communication-and-
computation integration approach.
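The forward and backward exchange described above can be sketched for two devices. This is a minimal illustration with an ideal link and a tiny real-valued network; the names `W1`, `W2`, `h`, and `g_h`, the layer sizes, the tanh activation, and the squared-error loss are all assumptions made for the example and are not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical two-device split of a tiny real-valued network:
# device A holds W1 (input -> hidden), device B holds W2 (hidden -> output).
W1 = rng.standard_normal((4, 3))
W2 = rng.standard_normal((2, 4))
x = rng.standard_normal(3)
target = rng.standard_normal(2)

# Forward pass on device A; the intermediate result h is what gets transmitted.
a = W1 @ x
h = np.tanh(a)

# Forward pass on device B, followed by a squared-error loss.
out = W2 @ h
loss = 0.5 * np.sum((out - target) ** 2)

# Backward pass on device B; the gradient w.r.t. h is transmitted back to A.
g_out = out - target
g_W2 = np.outer(g_out, h)
g_h = W2.T @ g_out

# Backward pass on device A, using only the received g_h and its local state.
g_a = g_h * (1.0 - h ** 2)
g_W1 = np.outer(g_a, x)
```

Only `h` (forward) and `g_h` (backward) cross the link in this sketch, and these two transfers are the communication load that the paper proposes to carry with OAC.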
Recently, over-the-air computation (OAC) has emerged as
a meaningful approach to innovate the traditional wireless
digital communication framework [3]. By using the intrinsic
linear superposition property of wireless channels, OAC
enables wireless communication systems to carry out some
large-scale but simple computation tasks [4]. Previous work
[5] shows that if a group of devices simultaneously transmits
analog modulated signals, the receiver obtains an aggregated
signal, which can be directly applied to a typical federated
learning (FL) network. However, traditional OAC work only
considers the weighted sum of the edge users' messages within
a time-synchronized multiple-access framework and is not
well suited to the split ML system. In a typical wireless
system with multiple-input multiple-output (MIMO) antenna
arrays, a group of antennas at the transmitter sends different
signals simultaneously, while another group of antennas at
the receiver collects the transmitted signals. On the receiver
side, each antenna individually receives the aggregated
signal from the multiple antennas of the transmitter. Intuitively,
the MIMO channel can be viewed as a multiplication-and-
addition procedure on the transmitted analog signals, an
operation that is widely used in NNs, and thus has the potential
to be applied to split ML in general wireless systems.
arXiv:2210.04742v2 [cs.LG] 11 Dec 2022
There is an intricate interplay between MIMO-based OAC
and NNs. A MIMO channel provides a full connection
between the inputs and the outputs and can thus be viewed
as a weighted sum calculator. Because the channel gains differ,
each antenna on the receiver side conducts calculations with
different equivalent weights. Unlike fully connected layers in
NNs, which can be viewed as controlled weighted sums, the
equivalent weights in MIMO systems are determined by the
channel matrices, which are determined by the environment
and hence uncontrollable. To control the equivalent weights in
such systems, the typical precoding and combining operations
in a MIMO system can be properly exploited. Both procedures
are controllable linear transformations on the signals and are
inherent in MIMO systems. As a result, we can control the
parameters in the equivalent weighted sum of the overall
system by controlling the precoding and combining matrices.
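As a minimal sketch of this idea, the snippet below checks numerically that precoding, a noise-free MIMO channel, and combining together act as one equivalent weight matrix. The sizes, the random matrices, and the helper `crandn` are illustrative assumptions; `H`, `P`, and `C` follow the notation of Table I.

```python
import numpy as np

rng = np.random.default_rng(0)
N_i, N_t, N_r, N_o = 3, 4, 4, 2  # illustrative sizes only


def crandn(*shape):
    """Draw a complex Gaussian array with unit-variance entries."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)


H = crandn(N_r, N_t)  # channel matrix, set by the environment
P = crandn(N_t, N_i)  # precoding matrix, controllable at the transmitter
C = crandn(N_o, N_r)  # combining matrix, controllable at the receiver
x = crandn(N_i)       # layer input

x_tx = P @ x          # signal after precoding
y_rx = H @ x_tx       # each receive antenna sees a weighted sum (noise omitted)
y = C @ y_rx          # layer output after combining

# The cascade behaves like one fully connected layer with weight C H P.
W_eq = C @ H @ P
```

Here `y` equals `W_eq @ x`, so changing `P` and `C` changes the equivalent weights even though `H` itself cannot be controlled.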
However, implementing MIMO OAC in split ML still
faces two fundamental issues: the forward and backward
channels are different and may not be accurately known, and
the analog transmission in OAC results in unavoidable noise.
To deal with the uncertainty of the MIMO channel, we
find that the forward-backward propagation of a NN and the
channel reciprocity of a wireless channel are mathematically
related (see Section II-B for details). This can be exploited to
deploy a NN through MIMO-based OAC, which still leads
to correct gradients even without any prior knowledge of
the MIMO channel, as long as channel reciprocity and
quasi-stability are assumed. Moreover, most NNs work
well after proper training, even when the intermediate results
cannot be fully and accurately interpreted. Such unexplainability
of NNs suggests that applying a deterministic linear
transformation, i.e., multiplying the intermediate results of a NN
by an implicit matrix determined by the MIMO channel
throughout the whole training and test process, may not severely
deteriorate the performance.
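The link between reciprocity and backpropagation can be checked numerically. The sketch below is noise-free and assumes a forward channel `H`, a backward channel `H.T`, the toy loss L = sum |y|^2, and the Wirtinger convention dL/dx = 0.5 (dL/dRe x - j dL/dIm x); these choices are made for illustration and are not the derivation of Section II-B, which lies outside this excerpt.

```python
import numpy as np

rng = np.random.default_rng(1)


def crandn(*shape):
    """Draw a complex Gaussian array with unit-variance entries."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)


N_i, N_t, N_r, N_o = 3, 4, 4, 2  # illustrative sizes only
H, P, C, x = crandn(N_r, N_t), crandn(N_t, N_i), crandn(N_o, N_r), crandn(N_i)


def loss(x_in):
    """Toy real-valued loss on the layer output (an assumption of this sketch)."""
    y_out = C @ (H @ (P @ x_in))
    return float(np.sum(np.abs(y_out) ** 2))


# Forward pass: precoding, channel, combining.
y_rx = H @ (P @ x)
y = C @ y_rx
g_y = np.conj(y)  # Wirtinger gradient of sum |y|^2 with respect to y

# Backward pass sent over the reciprocal channel H^T; neither side uses H.
back_tx = C.T @ g_y       # receiver applies the transposed combiner
back_rx = H.T @ back_tx   # the physical backward channel does this step
g_x = P.T @ back_rx       # transmitter applies the transposed precoder

# Parameter gradients need only locally observed signals.
g_P = np.outer(back_rx, x)
g_C = np.outer(g_y, y_rx)


def numeric_grad(f, x0, eps=1e-6):
    """Central-difference Wirtinger gradient, used only as a reference."""
    g = np.zeros_like(x0)
    for i in range(x0.size):
        d = np.zeros_like(x0)
        d[i] = eps
        d_re = (f(x0 + d) - f(x0 - d)) / (2 * eps)
        d_im = (f(x0 + 1j * d) - f(x0 - 1j * d)) / (2 * eps)
        g[i] = 0.5 * (d_re - 1j * d_im)
    return g


g_ref = numeric_grad(loss, x)
```

Under these assumptions `g_x` matches the finite-difference reference `g_ref`, even though no step of the backward pass reads `H` explicitly.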
The other issue, noise in wireless communication,
is not fatal in NNs. From the perspective of information theory,
both digital and analog communication can be optimal in
wireless communications such as sensor networks [6]. However,
a digital communication system can reduce transmission
errors by employing error-detecting and error-correcting codes,
whereas transmission errors can only be restricted but never
eliminated in analog communication. Moreover, it is found that
noise is tolerable and sometimes even serves as a training trick
in NNs [7], which can also be viewed as implicit dropout [8].
Since the devices transmit unexplainable intermediate results
of the NN in split ML systems, the advantage of error-free
transmission is insignificant in such applications. For instance,
Jankowski et al. [2] show that analog communication performs
better than digital communication in the deep learning-based
JSCC problem, which is a particular case of split ML.
B. Related Works
Split ML is a method where multiple computation nodes
cooperatively execute an ML application. In such a system,
the ML model is split into multiple parts allocated to different
computation nodes. Each node executes a part of the ML
model in order and transmits the intermediate results to the
next one. When training the NN, the nodes also execute
backward propagation in backward order. Most of the existing
works [9]–[11] on split ML focus on how to distribute the
model on the nodes in a way to minimize the total communi-
cation and computation delay. There are also some other works
bringing up specialized NN structures for split ML [12]–[14].
Recent work also tries to deploy a proper NN architecture
on a given communication network [15]. Its framework
utilizes the neural architecture search method to meet latency
and accuracy requirements. However, the above works [9]–[15]
all assume ideal communication among the computing nodes
and thus ignore the communication scheme design.
On the other hand, in communication systems, OAC is
usually deployed in multiple access systems to compute the
weighted sum or some easy mathematical operations such
as geometric mean, polynomial, and Euclidean norm [4].
Hence most OAC works are mainly used in FL, where the
weighted sum is widely deployed. For example, the authors
in [5] optimize the number of simultaneous accesses, which
may improve the efficiency of FL. Besides, Zhu et al. apply
broadband analog aggregation to improve OAC in FL with
multiple bands [16], while Shao et al. consider FL with
misaligned OAC [17].
OAC for multiple access systems still has significant draw-
backs. Firstly, OAC requires strict synchronization among all
transmitters, which is hard to realize. Moreover, OAC does not
support backward communication, which is rarely considered
in previous works as far as we know. Furthermore, the above
works [4], [5], [16], [17] do not consider the MIMO system,
which is widely used in practical scenarios. The multiple
antennas of the transmitter in a MIMO system can be viewed
as a group of transmitters in the multiple access scheme, which
overcomes the drawback of synchronization. The authors of
[18] apply MIMO OAC to multimodal sensing. However, in
[18], the scenario is still a multiple access system where
all channels are MIMO channels, and the task of OAC is
still the weighted sum, which can be regarded as a direct
expansion of previous designs in [4], [5], [16], [17]. Besides,
the implementation of [18] is also strictly limited to FL
applications, which is not suitable for split ML.
In other fields, OAC has also provided an alternative to
traditional NNs by realizing parts of NNs with acoustic [19],
optical [20], and radio frequency [21] signals. The authors in
[19]–[21] exploit the characteristics of the target systems that
are similar to NNs and employ the target systems as part of a NN. OAC
systems calculate aggregated results of multiple inputs from
different transmitters or time slots by adjusting the
environment or some parts of the system. Recently, in wireless
communications, Sanchez et al. also realize NNs with the
help of multiple paths and reconfigurable intelligent surfaces
[22]. Their system transmits the intermediate output of NNs
via time-sequential signals and uses the delay of multiple
paths to realize one-dimensional convolution. In contrast, in this
paper, we use MIMO channels to realize fully connected layers
through multiplexed signals.
C. Contributions
The main contributions of this paper are summarized as
follows:
• A split ML framework is proposed for wireless MIMO
networks by exploiting the MIMO's OAC capability,
which not only enables high-throughput and efficient
wireless transmission but also reduces the overall
computation load by synergistically incorporating the split ML
process with the wireless transmission procedure rather
than just taking it as a bit pipe.
• We show that the inter-layer connection in a NN of any
size can be mathematically decomposed into a set of
linear precoding and combining transformations over the
MIMO channels. Therefore, the precoding matrix at the
transmitter and the combining matrix at the receiver of
each MIMO link, as well as the channel matrix itself, can
jointly serve as a fully connected layer of a NN.
• By exploiting the reciprocity of the MIMO channel in
the forward and backward propagation procedures in the
proposed framework, we find it unnecessary to conduct
explicit channel estimation as otherwise indispensable
in conventional communication systems, thus further
improving the overall communication and computation
efficiency.
• We also provide some design rules for the proposed
system so as to apply it to a fully connected layer of any
size in a fully connected NN or a convolutional layer of
any size in a convolutional NN. Simulation results show
that the proposed scheme is efficient under both static
and slowly-varying memory channel conditions.
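The second contribution can be illustrated with a small existence check: a weight matrix of arbitrary size is rebuilt as a sum of several precoded and combined passes over one rank-r channel. The sum-over-transmissions form, the sizes, and the SVD-based construction are assumptions of this sketch. In particular, the construction reads the channel `H` directly, which the proposed system avoids by training the precoding and combining matrices instead.

```python
import numpy as np

rng = np.random.default_rng(2)


def crandn(*shape):
    """Draw a complex Gaussian array with unit-variance entries."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)


N_t, N_r, r = 4, 4, 2   # antennas and channel rank (illustrative)
N_i, N_o = 6, 5         # layer size, larger than the channel can carry at once

H = crandn(N_r, r) @ crandn(r, N_t)  # a rank-r channel
W = crandn(N_o, N_i)                 # target fully connected weight

# Dominant r singular modes of the channel: H = Uh diag(sh) Vh_h.
Uh, sh, Vh_h = np.linalg.svd(H)
Uh, sh, Vh_h = Uh[:, :r], sh[:r], Vh_h[:r, :]

# Singular modes of the target weight: W = U diag(s) V_h.
U, s, V_h = np.linalg.svd(W, full_matrices=False)
K = int(np.ceil(len(s) / r))  # number of transmissions used in this sketch

P_list, C_list = [], []
for k in range(K):
    idx = slice(k * r, min((k + 1) * r, len(s)))
    m = idx.stop - idx.start  # modes carried by transmission k (at most r)
    # Precoder: route m modes of W into the first m channel modes and
    # pre-compensate the channel singular values.
    P_k = Vh_h[:m, :].conj().T @ np.diag(s[idx] / sh[:m]) @ V_h[idx, :]
    # Combiner: read the m channel modes out into the output space.
    C_k = U[:, idx] @ Uh[:, :m].conj().T
    P_list.append(P_k)
    C_list.append(C_k)

W_rebuilt = sum(C_k @ H @ P_k for C_k, P_k in zip(C_list, P_list))
```

With these sizes, `K` is 3 and `W_rebuilt` agrees with `W` up to floating-point error, which suggests that the channel rank limits the number of streams per transmission rather than the size of the layer.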
The remainder of the paper is organized as follows. We
first introduce the proposed system in Section II. We then
mathematically provide some principles and propose a training
algorithm in Section III. We extend the proposed system to
convolutional NNs and compare different implementations of
the system in Section IV. Numerical results are provided in
Section V. Section VI concludes the paper and provides future
directions.
D. Notations
In this paper, we use bold italic lower-case letters for vectors
and bold letters for matrices. All the vectors and matrices
are assumed to be complex. The meanings of frequently used
notations are summarized in detail in Table I.
TABLE I: Notations used in this paper
Notation                                  Meaning
$N_t$, $N_r$                              The numbers of antennas at the transmitter and the receiver.
$N_i$, $N_o$                              The input and output sizes of a NN layer.
$\boldsymbol{x}$, $\boldsymbol{y}$        The signals before precoding and after combining; also the input and output of a NN layer.
$\boldsymbol{x}_k$, $\boldsymbol{y}_k$    The transmitted signals after precoding and the received signals before combining.
$\mathbf{H}$                              The channel matrix.
$\mathbf{P}_k$, $\mathbf{C}_k$            The precoding and combining matrices of transmission $k$.
$\boldsymbol{n}$                          The noise vector.
$r$                                       The roughly estimated rank of $\mathbf{H}$.
$\mathbf{W}$                              A trainable matrix in a NN.
$\mathbf{K}$                              A trainable convolutional kernel in a NN.
$\boldsymbol{g}_a$                        The gradient corresponding to the subscript.
II. BASIC OAC UNIT
In this section, we first briefly introduce the system model
and then propose a MIMO OAC-based approach to accelerate
communication in split ML.
A. System Settings
Consider a MIMO OAC-based split ML system with mul-
tiple base stations, as shown in Fig. 1. Each base station is
equipped with multiple antennas, and the MIMO channels
among base stations are quasi-stable. A deep NN is split
into several segmented NN parts, each of which is deployed
on one specific base station. The base stations perform the
forward computation and backward propagation of the
assigned NN fragment. For simplicity, unless otherwise stated,
we consider the split ML system with only one splitting point,
since it can be easily generalized to multiple-split conditions
by cascading a set of single-split systems. To simplify the
description, we use “transmitter” and “receiver” to refer to
the transmitter and receiver in the forward transmission,
which have $N_t$ and $N_r$ antennas, respectively.
We assume that the channel between the transmitter and the
receiver is a quasi-stable MIMO channel with reciprocity,
i.e., the forward channel is $\mathbf{H} \in \mathbb{C}^{N_r \times N_t}$ and the backward
channel is $\mathbf{H}^T$. We do not make any further a priori assumptions
about the channel, as the split ML applications may be deployed
in different scenarios. Under this system setting, we assume
that the rank of the channel $\mathbf{H}$ is known to be $r$, which determines
the maximum number of data streams transmitted simultaneously
over $\mathbf{H}$. We note that the rank $r$ is mainly determined by the
number of paths, which can be approximately known from
the channel model and the number of scatterers. Therefore,
although obtaining the accurate value of $r$ is impossible without
channel estimation, we can get $r$ roughly. Some small singular
values below a certain threshold can be regarded as zeros from
an engineering perspective. Hence, the rank $r$ can be underestimated
in some cases. When designing the system, we assume that
$r$ is known exactly, and we later show numerically in Section V
the cases where a mismatch exists. Besides, to better
realize OAC, we may apply other techniques, such as
the orthogonal frequency division multiplexing technique, to
realize multiple transmissions simultaneously. However, we
do not consider such techniques since they do not affect the
principles we use in the paper and can be easily combined
with the proposed system.
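The remark about treating small singular values as zeros can be illustrated as follows. The function `effective_rank`, the relative threshold of 1e-2, and the synthetic channel are assumptions of this sketch. The sketch also reads `H` directly, whereas in the proposed system the rank is only known roughly from the channel model.

```python
import numpy as np


def effective_rank(H, rel_threshold=1e-2):
    """Count singular values above rel_threshold times the largest one."""
    s = np.linalg.svd(H, compute_uv=False)
    return int(np.sum(s > rel_threshold * s[0]))


rng = np.random.default_rng(3)


def random_unitary(n):
    """Unitary matrix from the QR factorization of a complex Gaussian matrix."""
    A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    Q, _ = np.linalg.qr(A)
    return Q


# Synthetic 4x4 channel with three significant modes and one negligible mode.
singular_values = np.array([1.0, 0.5, 0.2, 1e-4])
H = random_unitary(4) @ np.diag(singular_values) @ random_unitary(4).conj().T

r_numeric = int(np.linalg.matrix_rank(H))  # counts the tiny mode as well
r_eff = effective_rank(H)                  # drops modes below the threshold
```

Here `r_numeric` is 4 while `r_eff` is 3, which is the sense in which the engineering rank can be smaller than the exact one.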
B. OAC for Split ML
We consider a two-node split ML system, where the original
NN is split into two parts, corresponding to the first several
layers and the other layers, which are respectively deployed
on the transmitter and the receiver. A fully connected layer
between the transmitter and the receiver is realized through
the OAC technique. We consider complex NNs [23], which
have shown effectiveness in graph classification tasks, in the
proposed system. Since most data in wireless communication
is complex, complex NNs also have potential in native wireless
communication learning tasks. In complex NNs, the
parameters, gradients, and intermediate results are complex
numbers, and the backpropagation is almost the same as that
of traditional real NNs. The transmitter can be regarded as a
federation of several synchronized single-antenna transmitters
in MIMO systems. Hence, OAC still works similarly to how it
works in multiple access systems. We will later discuss how to extend the