Over-the-Air Split Machine Learning in Wireless MIMO Networks

Yuzhi Yang, Zhaoyang Zhang, Yuqing Tian, Zhaohui Yang, Chongwen Huang, Caijun Zhong, and Kai-Kit Wong

Abstract—In split machine learning (ML), different partitions of a neural network (NN) are executed by different computing nodes, which incurs a large communication cost. Since over-the-air computation (OAC) can efficiently carry out all or part of the computation while the signal is being communicated, substituting OAC for the wireless transmission in the traditional split ML framework can ease the communication load. In this paper, we propose to deploy split ML in a wireless multiple-input multiple-output (MIMO) communication network by utilizing the intricate interplay between MIMO-based OAC and NNs. The basic procedure of the OAC split ML system is first provided, and we show that the inter-layer connection in a NN of any size can be mathematically decomposed into a set of linear precoding and combining transformations over a MIMO channel carrying out multi-stream analog communication. The precoding and combining matrices, which are regarded as trainable parameters, and the MIMO channel matrix, which is regarded as an unknown (implicit) parameter, jointly serve as a fully connected layer of the NN. Most interestingly, the channel estimation procedure can be eliminated by exploiting the MIMO channel reciprocity of the forward and backward propagation, thus greatly saving system costs and/or further improving overall efficiency. The proposed scheme is also generalized to conventional NNs, e.g., the widely used convolutional neural networks. We demonstrate its effectiveness under both static and quasi-static memory channel conditions with comprehensive simulations.

Index Terms—Over-the-air computing (OAC), multiple-input multiple-output (MIMO), split machine learning, neural network

I. INTRODUCTION

A. Motivation

In future sixth-generation (6G) wireless communication systems, human-like intelligence will be brought everywhere [1]. The rapid development of artificial intelligence leads to booming mobile machine learning (ML) applications and requires vast data interaction. The integration of ML and wireless communications leads to an emerging area called split ML, which distributes a neural network (NN) to several edge devices, thus reducing the system computation burden. In a split ML system, each device first performs the forward computation on its allocated part of the NN and then transmits the calculated intermediate results to the next device for further computation. The backward propagation process is conducted in a similar way but in reverse order. Through cooperative computation, split ML can be applied to solve the deep learning-based joint source-channel coding (JSCC) problem [2] and other ML problems in cloud networks and the Internet of Things. However, split ML requires frequent information exchange among devices, imposing a heavy communication burden on wireless networks. Thus, deploying split ML over wireless networks calls for the design of new wireless techniques based on a communication-and-computation integration approach.

This work was supported in part by the National Key R&D Program of China under Grants 2020YFB1807101 and 2018YFB1801104, and by the National Natural Science Foundation of China under Grants U20A20158 and 61725104.

Y. Yang, Z. Zhang (Corresponding Author), Y. Tian, Z. Yang, C. Huang, and C. Zhong are with the College of Information Science and Electronic Engineering, Zhejiang University, Hangzhou 310007, China, and with the International Joint Innovation Center, Zhejiang University, Haining 314400, China, and also with Zhejiang Provincial Key Lab of Information Processing, Communication and Networking (IPCAN), Hangzhou 310007, China. Z. Yang is also with Zhejiang Lab, Hangzhou 311121, China. (e-mails: {yuzhi_yang, ning_ming, tianyq, yang_zhaohui, chongwenhuang, caijunzhong}@zju.edu.cn)

K.-K. Wong is with the Department of Electronic and Electrical Engineering, University College London, WC1E 6BT London, UK, and also with the School of Integrated Technology, Yonsei University, Seoul 03722, Korea. (email: kai-kit.wong@ucl.ac.uk)

Recently, over-the-air computation (OAC) has emerged as a promising approach to rethinking the traditional wireless digital communication framework [3]. By using the intrinsic linear superposition property of wireless channels, OAC enables wireless communication systems to carry out large-scale but simple computation tasks [4]. Previous work [5] shows that if a group of devices simultaneously transmit analog modulated signals, the receiver obtains an aggregated signal, which can be directly applied to a typical federated learning (FL) network. However, traditional OAC work only considers the weighted sum of the edge users’ messages within a time-synchronized multiple-access framework and is not well suited to split ML systems. In a typical wireless system with multiple-input multiple-output (MIMO) antenna arrays, a group of antennas at the transmitter sends different signals simultaneously, while another group of antennas at the receiver collects the transmitted signals. On the receiver side, each antenna individually receives the aggregated signal from the multiple antennas of the transmitter. Intuitively, the MIMO channel can be viewed as a multiplication-and-addition procedure on the transmitted analog signals. This is the same operation that is widely used in NNs, so the MIMO channel has the potential to be applied to split ML in general wireless systems.

arXiv:2210.04742v2 [cs.LG] 11 Dec 2022

There is an intricate interplay between MIMO-based OAC and NNs. A MIMO channel provides a full connection between its inputs and outputs and can thus be viewed as a weighted-sum calculator. Because the channel gains differ, each antenna on the receiver side computes a weighted sum with different equivalent weights. Unlike fully connected layers in NNs, which can be viewed as controlled weighted sums, the equivalent weights in MIMO systems are set by the channel matrices, which are determined by the environment and hence uncontrollable. To control the equivalent weights in such systems, the typical precoding and combining operations in a MIMO system can be properly exploited. Both procedures are controllable linear transformations on the signals and are inherent in MIMO systems. As a result, we can control the parameters in the equivalent weighted sum of the overall system by controlling the precoding and combining matrices.
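The idea above can be sketched numerically. The following is a minimal, noise-free illustration with a single transmission, not the paper's full scheme: the antenna counts, layer sizes, random matrices, and the helper `crandn` are all assumptions chosen for the example. It shows that precoding with P, passing through the channel H, and combining with C act on the input exactly like one fully connected layer whose weight matrix is the product C H P.

```python
import numpy as np

rng = np.random.default_rng(0)

N_t, N_r = 4, 4   # antennas at transmitter / receiver (illustrative values)
N_i, N_o = 3, 2   # input / output sizes of the NN layer (illustrative values)


def crandn(*shape):
    """Hypothetical helper: complex standard normal samples."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)


H = crandn(N_r, N_t)   # channel matrix: fixed by the environment, not controllable
P = crandn(N_t, N_i)   # precoding matrix: controllable (trainable)
C = crandn(N_o, N_r)   # combining matrix: controllable (trainable)
x = crandn(N_i)        # layer input held by the transmitter

# Over-the-air pass: precode, propagate through the channel, combine.
y_air = C @ (H @ (P @ x))

# The same output from a single equivalent fully connected weight matrix.
W_eq = C @ H @ P
y_fc = W_eq @ x
```

Only P and C are adjustable here; H enters the equivalent weight implicitly, which is why it can be treated as an unknown parameter of the layer.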

However, implementing MIMO OAC in split ML still faces two fundamental issues: the forward and backward channels are different and may not be accurately known, and the analog transmission in OAC results in unavoidable noise. To deal with the uncertainty of the MIMO channel, we find that the forward-backward propagation of a NN and the channel reciprocity of a wireless channel are mathematically related (see Section II-B for details). This can be exploited to deploy a NN through MIMO-based OAC, which still yields correct gradients without any prior knowledge of the MIMO channel, as long as channel reciprocity and quasi-stability are assumed. Moreover, most NNs work well after proper training, even when the intermediate results cannot be fully and accurately interpreted. This lack of interpretability in NNs suggests that applying a deterministic linear transformation, i.e., multiplying the intermediate results of a NN by an implicit matrix determined by the MIMO channel throughout the whole training and test process, may not severely degrade the performance.
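A small numerical sketch can make the link between reciprocity and backpropagation concrete. It is a simplified illustration under stated assumptions, not the algorithm of Section II-B: signals are real-valued, the link is noise-free, the loss is a squared error against a made-up `target`, and all names and sizes are hypothetical. The channel H appears only inside the two functions that stand in for the physical medium; neither node reads it when forming gradients.

```python
import numpy as np

rng = np.random.default_rng(1)
N_t, N_r = 4, 4   # antennas (illustrative)
N_i, N_o = 3, 2   # layer input / output sizes (illustrative)

# H is used ONLY inside the two channel functions below, which simulate the air.
H = rng.standard_normal((N_r, N_t))


def forward_channel(s):
    """Transmitter -> receiver over the forward channel H."""
    return H @ s


def backward_channel(s):
    """Receiver -> transmitter over the reciprocal channel H^T."""
    return H.T @ s


P = rng.standard_normal((N_t, N_i))   # precoder, held by the transmitter
C = rng.standard_normal((N_o, N_r))   # combiner, held by the receiver
x = rng.standard_normal(N_i)          # layer input at the transmitter
target = rng.standard_normal(N_o)     # made-up regression target


def loss_fn(P_, C_, x_):
    """Squared-error loss of the over-the-air layer (used for checking only)."""
    y_ = C_ @ forward_channel(P_ @ x_)
    return 0.5 * np.sum((y_ - target) ** 2)


# Forward pass: the receiver observes r_fwd = H P x without knowing H.
r_fwd = forward_channel(P @ x)
y = C @ r_fwd
g_y = y - target                      # gradient of the loss w.r.t. y

# Receiver-side gradient of C uses only locally observed quantities.
g_C = np.outer(g_y, r_fwd)

# Backward pass: the receiver sends C^T g_y back; the transmitter observes
# r_bwd = H^T C^T g_y, again without knowing H.
r_bwd = backward_channel(C.T @ g_y)
g_P = np.outer(r_bwd, x)              # gradient of the precoder
g_x = P.T @ r_bwd                     # gradient passed to earlier layers
```

Under these assumptions, the quantities each node observes over the air already contain the factors of H and H^T that the chain rule requires, so the gradients match those of an ordinary fully connected layer with weight C H P.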

The other issue, the noise in wireless communication, is not fatal in NNs. From the perspective of information theory, both digital and analog communication can be optimal in wireless communications such as sensor networks [6]. However, a digital communication system can reduce transmission errors by employing error-detecting and error-correcting codes, whereas transmission errors can only be restricted but never eliminated in analog communication. Moreover, it is found that noise is tolerable and sometimes even becomes a training trick in NNs [7], which can also be viewed as implicit dropout [8]. Since the devices in split ML systems transmit intermediate results of the NN that are not interpretable anyway, the error-free advantage of digital communication is insignificant in such applications. For instance, Jankowski et al. [2] show that analog communication performs better than digital communication in the deep learning-based JSCC problem, which is a particular case of split ML.

B. Related Works

Split ML is a method where multiple computation nodes cooperatively execute an ML application. In such a system, the ML model is split into multiple parts allocated to different computation nodes. Each node executes its part of the ML model in order and transmits the intermediate results to the next one. When training the NN, the nodes also execute backward propagation in reverse order. Most existing works [9]–[11] on split ML focus on how to distribute the model over the nodes so as to minimize the total communication and computation delay. Some other works propose specialized NN structures for split ML [12]–[14]. Recent work also tries to deploy a proper NN architecture on a given communication network [15], using neural architecture search to meet latency and accuracy requirements. However, the above works [9]–[15] all assume ideal communication among the computing nodes and thus ignore the communication scheme design.

On the other hand, in communication systems, OAC is usually deployed in multiple access systems to compute the weighted sum or other simple mathematical operations such as the geometric mean, polynomials, and the Euclidean norm [4]. Hence, most OAC works target FL, where the weighted sum is widely used. For example, the authors in [5] optimize the number of simultaneous accesses, which may improve the efficiency of FL. Besides, Zhu et al. apply broadband analog aggregation to improve OAC in FL with multiple bands [16], while Shao et al. consider FL with misaligned OAC [17].

OAC for multiple access systems still has significant drawbacks. Firstly, OAC requires strict synchronization among all transmitters, which is hard to realize. Moreover, OAC does not support backward communication, which, as far as we know, is rarely considered in previous works. Furthermore, the above works [4], [5], [16], [17] do not consider the MIMO system, which is widely used in practical scenarios. The multiple antennas of the transmitter in a MIMO system can be viewed as a group of transmitters in the multiple access scheme, which overcomes the synchronization drawback. The authors of [18] apply MIMO OAC to multimodal sensing. However, the scenario in [18] is still a multiple access system where all channels are MIMO channels, and the task of OAC is still the weighted sum, which can be regarded as a direct extension of the previous designs in [4], [5], [16], [17]. Besides, the implementation of [18] is strictly limited to FL applications and is not suitable for split ML.

In other fields, OAC has also provided an alternative to traditional NNs by realizing parts of NNs with acoustic [19], optical [20], and radio frequency [21] signals. The authors in [19]–[21] exploit characteristics of the target systems that resemble NNs and employ the target systems as part of a NN. Such OAC systems calculate aggregated results of multiple inputs from different transmitters or time slots by manipulating the environment or some parts of the system. Recently, in wireless communications, Sanchez et al. also realize NNs with the help of multiple paths and reconfigurable intelligent surfaces [22]. Their system transmits the intermediate output of NNs via time-sequential signals and uses the delay of multiple paths to realize one-dimensional convolution. In contrast, in this paper, we use MIMO channels to realize fully connected layers through multiplexed signals.

C. Contributions

The main contributions of this paper are summarized as follows:

• A split ML framework is proposed for wireless MIMO networks by exploiting the MIMO’s OAC capability, which not only enables high-throughput and efficient wireless transmission but also reduces the overall computation load by synergistically incorporating the split ML process with the wireless transmission procedure rather than just taking it as a bit pipe.

• We show that the inter-layer connection in a NN of any size can be mathematically decomposed into a set of linear precoding and combining transformations over the MIMO channels. Therefore, the precoding matrix at the transmitter and the combining matrix at the receiver of each MIMO link, as well as the channel matrix itself, can jointly serve as a fully connected layer of a NN.

• By exploiting the reciprocity of the MIMO channel in the forward and backward propagation procedures in the proposed framework, we find it unnecessary to conduct explicit channel estimation, which is otherwise indispensable in conventional communication systems, thus further improving the overall communication and computation efficiency.

• We also provide some design rules for the proposed system so as to apply it to a fully connected layer of any size in a fully connected NN or a convolutional layer of any size in a convolutional NN. Simulation results show that the proposed scheme is efficient under both static and slowly-varying memory channel conditions.

The remainder of the paper is organized as follows. We first introduce the proposed system in Section II. We then derive some principles mathematically and propose a training algorithm in Section III. We extend the proposed system to convolutional NNs and compare different implementations of the system in Section IV. Numerical results are provided in Section V. Section VI concludes the paper and provides future directions.

D. Notations

In this paper, we use bold italic lower-case letters for vectors and bold letters for matrices. All the vectors and matrices are assumed to be complex. The meanings of frequently used notations are summarized in detail in Table I.

TABLE I: Notations used in this paper

Notation                                   Meaning
$N_t$, $N_r$                               The numbers of antennas at the transmitter and the receiver.
$N_i$, $N_o$                               The input and output sizes of a NN layer.
$\boldsymbol{x}$, $\boldsymbol{y}$         The signals before precoding and after combining; also the input and output of a NN layer.
$\boldsymbol{x}_k$, $\boldsymbol{y}_k$     The transmitted signals after precoding and the received signals before combining.
$\mathbf{H}$                               The channel matrix.
$\mathbf{P}_k$, $\mathbf{C}_k$             The precoding and combining matrices of transmission $k$.
$\boldsymbol{n}$                           The noise vector.
$r$                                        The roughly estimated rank of $\mathbf{H}$.
$\mathbf{W}$                               A trainable matrix in a NN.
$\mathbf{K}$                               A trainable convolutional kernel in a NN.
$\boldsymbol{g}_a$                         The gradient corresponding to the subscript $a$.

II. BASIC OAC UNIT

In this section, we first briefly introduce the system model and then propose a MIMO OAC-based approach to accelerate communication in split ML.

A. System Settings

Consider a MIMO OAC-based split ML system with multiple base stations, as shown in Fig. 1. Each base station is equipped with multiple antennas, and the MIMO channels among base stations are quasi-stable. A deep NN is split into several segments, each of which is deployed on one specific base station. Each base station performs the forward computation and backward propagation of its assigned NN fragment. For simplicity, unless otherwise stated, we consider a split ML system with only one splitting point, since it can be easily generalized to the multiple-split case by chaining a set of single-split systems. To simplify the description, we use “transmitter” and “receiver” to refer to the nodes that transmit and receive in the forward transmission, which are equipped with $N_t$ and $N_r$ antennas, respectively.

We assume that the channel between the transmitter and the receiver is a quasi-stable MIMO channel with reciprocity, i.e., the forward channel is $\mathbf{H} \in \mathbb{C}^{N_r \times N_t}$ and the backward channel is $\mathbf{H}^T$. We make no further a priori assumptions about the channel, as split ML applications may be deployed in different scenarios. Under this system setting, we assume that the rank of the channel $\mathbf{H}$ is known to be $r$, which determines the maximum number of data streams that can be transmitted simultaneously over $\mathbf{H}$. We note that the rank $r$ is mainly determined by the number of paths, which can be approximately known from the channel model and the number of scatterers. Therefore, although the exact value of $r$ cannot be obtained without channel estimation, a rough estimate is available. From an engineering perspective, singular values below a certain threshold can be regarded as zero. Hence, the rank $r$ may be underestimated in some cases. When designing the system, we assume that $r$ is known exactly, and Section V numerically examines the cases where a mismatch exists. Besides, to better realize OAC, we may apply other techniques, such as orthogonal frequency division multiplexing, to carry out multiple transmissions simultaneously. However, we do not consider such techniques since they do not affect the principles we use in the paper and can be easily combined with the proposed system.
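The notion of an effective rank obtained by discarding small singular values can be illustrated as follows. This is only a simulation-side sketch under assumptions that are not from the paper: the channel is built from a hypothetical number of paths `n_paths` with random rank-one contributions plus a weak diffuse component, and the relative threshold `rel_tol` is an arbitrary choice. In the proposed system itself, $r$ is meant to be guessed from the channel model rather than computed from $\mathbf{H}$, since $\mathbf{H}$ is never estimated.

```python
import numpy as np

rng = np.random.default_rng(2)
N_t, N_r = 8, 8     # antennas (illustrative)
n_paths = 3         # hypothetical number of dominant propagation paths


def crandn(*shape):
    """Hypothetical helper: complex standard normal samples."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)


# Each path contributes a rank-one term with a unit-modulus complex gain.
A_r = crandn(N_r, n_paths)                               # receive-side path signatures
A_t = crandn(N_t, n_paths)                               # transmit-side path signatures
gains = np.exp(1j * rng.uniform(0.0, 2.0 * np.pi, n_paths))
H = A_r @ np.diag(gains) @ A_t.T

# A weak diffuse component makes H full rank in the strict numerical sense.
H = H + 1e-6 * crandn(N_r, N_t)

# Singular values in descending order; those far below the largest are treated as zero.
s = np.linalg.svd(H, compute_uv=False)
rel_tol = 1e-3                                           # assumed engineering threshold
effective_rank = int(np.sum(s > rel_tol * s[0]))
```

With a stricter threshold more singular values would be discarded and the effective rank would drop, which corresponds to the underestimation case mentioned above.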

B. OAC for Split ML

We consider a two-node split ML system, where the original NN is split into two parts, corresponding to the first several layers and the remaining layers, which are deployed on the transmitter and the receiver, respectively. A fully connected layer between the transmitter and the receiver is realized through the OAC technique. We consider complex NNs [23], which have shown effectiveness in graph classification tasks in the proposed system. Since most data in wireless communication is complex, complex NNs also have potential in native wireless communication learning tasks. In complex NNs, the parameters, gradients, and intermediate results are complex numbers, and the backpropagation is almost the same as that of traditional real NNs. The transmitter can be regarded as a federation of several synchronized single-antenna transmitters in MIMO systems. Hence, OAC still works much as it does in multiple access systems. We will later discuss how to extend the
