Over-the-Air Gaussian Process Regression
Based on Product of Experts
Koya Sato
Artificial Intelligence eXploration Research Center,
The University of Electro-Communications, 1-5-1, Chofugaoka, Chofu-shi, Tokyo, Japan
E-mail: k sato@ieee.org
Abstract—This paper proposes a distributed Gaussian process regression (GPR) scheme with over-the-air computation, termed AirComp GPR, for communication- and computation-efficient data analysis over wireless networks. GPR is a non-parametric regression method that can model the target flexibly. However, its computational complexity and communication cost tend to become significant as the amount of data increases. AirComp GPR exploits the fact that product-of-experts-based GPR approximates the exact GPR by a sum of values reported from distributed nodes. We introduce AirComp into the training and prediction steps to allow the nodes to transmit their local computation results simultaneously; communication strategies are presented for distributed training under both perfect and statistical channel state information. Applying AirComp GPR to a radio map construction task, we demonstrate that it speeds up the computation while keeping the communication cost in training constant regardless of the numbers of data points and nodes.
Index Terms—Over-the-air computation, distributed machine
learning, Gaussian processes, radio map construction
I. INTRODUCTION
Gaussian process regression (GPR) is a non-parametric approach to regression tasks that realizes flexible modeling of a dataset without specifying low-level assumptions [1], [2]. Assuming a GP for the target data, we can obtain both the mean and variance of the regression results. GPR has found a wide range of applications, such as environmental monitoring based on spatial statistics [3], [4], experimental design [5], and motion trajectory analysis [6]; in wireless communication systems, recent results have shown its advantages in coverage analysis and communication design under the term radio map [7]–[9]. GPR will play an important role in the coming Internet of Things (IoT) era.
However, GPR has some critical drawbacks regarding communication and computational costs in such applications. Let us consider a situation where multiple nodes are distributed over a network to monitor an environmental state and are connected to a server wirelessly, as envisioned in [10]. When the server performs GPR to analyze the sensing results, the nodes need to upload their sensing data to the server. The exact GPR requires matrix inversions in the training and prediction steps, which leads to a complexity of $O(N^3)$ for $N$ training data; further, for $n_{\mathrm{in}}$-dimensional input data, the nodes must upload $(n_{\mathrm{in}} + 1)N$ variables to the server. The first problem can be mitigated by distributed GPR based on the product of experts [2], [11]. This method approximates GPR by a sum of computation results at the nodes, reducing the computational complexity from $O(N^3)$ at the server to $O((N/M)^3)$ at each of the $M$ distributed nodes; however, the number of communication slots still depends on $M$.

(This work was supported in part by JST ACT-X, JPMJAX21AA, and JST SICORP, JPMJSC20C1.)
In this paper, toward a communication- and computation-efficient IoT monitoring system, we propose a distributed GPR scheme with over-the-air computation, termed AirComp GPR. Over-the-air computation is a technique for communication-efficient distributed computation over shared channels based on nomographic functions [12], [13]. Each node transmits its message with an analog modulation function, and the receiver obtains the target computation result from the superimposed signal via a decoding function. Since multiple nodes transmit their analog-modulated signals simultaneously, low-latency computation over the network can be realized. We exploit the fact that both the training and regression results in the distributed GPR are based on the sum of computation results reported from the nodes. The proposed method aggregates the local computation results via over-the-air computation; as a result, the communication cost does not depend on the data size or the number of nodes.
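To make this sum structure concrete, the following sketch shows the prediction-side aggregation of a standard product-of-experts GPR [2], [11]: each node reports only its local predictive mean and variance, and the server needs just two sums over the nodes (the sum of precisions and the sum of precision-weighted means). This is a minimal illustration under the standard PoE combination; the exact aggregation rule used by AirComp GPR is derived later in the paper, and the function and variable names below are illustrative.

```python
import numpy as np

def poe_aggregate(mu_local, var_local):
    """Fuse local GPR predictions by a product of experts.

    mu_local, var_local: arrays of shape (M, n_test) with each node's
    predictive mean and variance at the test inputs.  Both quantities
    required at the server are plain sums over the M nodes.
    """
    prec_local = 1.0 / var_local               # local precisions
    prec = prec_local.sum(axis=0)              # sum of precisions
    mu = (prec_local * mu_local).sum(axis=0) / prec
    return mu, 1.0 / prec                      # fused mean and variance
```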
The major contributions of this paper are as follows.
• We propose AirComp-aided distributed GPR for communication- and computation-efficient regression over wireless networks. It is shown that the computational complexity can be reduced from $O(N^3)$ at the BS to $O((N/M)^3)$ at each of the $M$ distributed nodes, and that the communication cost of the training step remains constant regardless of $M$ and $N$.
• Two schemes are introduced for the training step: a perfect channel state information (CSI)-based scheme and a statistical CSI-based scheme. The former performs the distributed GPR with limited accuracy degradation from the full GPR, while the latter requires no instantaneous uplink channel estimation.
• The performance of AirComp GPR is analyzed in a radio map construction task. We demonstrate that an accurate radio map can be constructed efficiently.
Notations: Throughout this paper, the transpose, determinant, and inverse operators are denoted by $(\cdot)^{\mathrm{T}}$, $\det(\cdot)$, and $(\cdot)^{-1}$, while the expectation and the variance are expressed by $\mathbb{E}[\cdot]$ and $\mathrm{Var}[\cdot]$, respectively. Further, $|\cdot|$ and $\|\cdot\|$ denote the absolute value and the Euclidean norm, respectively.

Fig. 1. Signal transmission model: each node simultaneously reports the result of its local training or regression to the BS over a shared channel.
II. SYSTEM MODEL
A. Task Definition
We consider a situation where $M$ sensing nodes are connected to a base station (BS) over a wireless network. The $i$-th node has a dataset
$$\mathcal{D}_i = \{(\mathbf{x}_{i,k}, y_{i,k}) \mid k = 1, 2, \cdots, N_i\}, \qquad (1)$$
where $N_i$ is the number of data, $\mathbf{x}_{i,k}$ is the input vector (e.g., sensing location), and $y_{i,k} = f(\mathbf{x}_{i,k}) + \epsilon$ is its output value generated from $\mathcal{N}(f(\mathbf{x}_{i,k}), \sigma_\epsilon^2)$ ($\epsilon \sim \mathcal{N}(0, \sigma_\epsilon^2)$ is the independently and identically distributed (i.i.d.) noise). When the local datasets do not overlap with each other, the full dataset over the network can be expressed as
$$\mathcal{D} = \bigcup_{i=1}^{M} \mathcal{D}_i, \qquad (2)$$
where the total number of data is $N = \sum_{i=1}^{M} N_i$. Further, it is assumed that all data in $\mathcal{D}$ follow a Gaussian process, i.e., $f \sim \mathcal{GP}(\mu(\mathbf{x}), k(\mathbf{x}, \mathbf{x}'))$, where $\mu(\mathbf{x})$ is the expectation value at $\mathbf{x}$ and $k(\mathbf{x}, \mathbf{x}')$ is the covariance between $f(\mathbf{x})$ and $f(\mathbf{x}')$. The task in this context is to estimate $f$ for test inputs $\mathbf{X}_* = [\mathbf{x}_{*,1}, \mathbf{x}_{*,2}, \cdots, \mathbf{x}_{*,n_{\mathrm{test}}}]$ from the $\mathcal{D}_i$ in a distributed manner.

Possible applications of the above task include environmental monitoring [14] and radio map construction [7].
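As a concrete instance of (1) and (2), the snippet below generates a toy one-dimensional monitoring scenario in which $M$ nodes each collect $N_i$ noisy observations $y = f(x) + \epsilon$ of a common latent function, and the full dataset is the union of the non-overlapping local datasets. The latent function and all parameter values are illustrative choices, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
M, N_i, sigma_eps = 8, 25, 0.1               # nodes, data per node, noise std
f = lambda x: np.sin(3.0 * x)                # latent function (illustrative)

# Local datasets D_i = {(x_{i,k}, y_{i,k})}, eq. (1)
local_datasets = []
for i in range(M):
    x = rng.uniform(0.0, 2.0, size=N_i)      # sensing locations of node i
    y = f(x) + sigma_eps * rng.standard_normal(N_i)   # y = f(x) + eps
    local_datasets.append((x, y))

# Full dataset D = union of the D_i, with N = sum_i N_i, eq. (2)
X_full = np.concatenate([x for x, _ in local_datasets])
y_full = np.concatenate([y for _, y in local_datasets])
```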
B. Signal Model
AirComp GPR can be divided into training and regression steps; we herein define the signal model for these steps. Fig. 1 summarizes the signal transmission model, where all nodes simultaneously transmit their messages to the BS through a shared wireless channel. The $i$-th node first encodes its message $s_i$ so that the BS can extract the sum of the $s_i$; we denote this process as $\mathbf{x}_i = \mathrm{Enc}(s_i)$. When all nodes are time synchronized, the received signal at the BS is given by
$$\mathbf{y} = \sum_{i=1}^{M} \sqrt{\gamma_i}\, h_i \mathbf{x}_i + \mathbf{z}, \qquad (3)$$
where $\gamma_i \in \mathbb{R}$ is the average channel gain and $h_i \sim \mathcal{CN}(0, 1)$ is the i.i.d. instantaneous channel gain, assumed flat over one transmission. Further, $\mathbf{x}_i$ is the transmitted vector constrained by the maximum transmission power $P_{\max}$ as $\|\mathbf{x}_i\|^2 \le P_{\max}$, and $\mathbf{z}$ is the additive white Gaussian noise (AWGN) vector whose elements follow $\mathcal{CN}(0, \sigma_z^2)$, where $\sigma_z^2$ is the noise floor. The BS then extracts the sum of the $s_i$ using a decoding operation, denoted by $\mathrm{Dec}(\mathbf{y})$.
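For reference, the received signal in (3) can be simulated directly. The sketch below superimposes the transmissions of all nodes over i.i.d. Rayleigh flat fading and adds complex AWGN; the encoding and decoding functions Enc(·) and Dec(·) are left out here, since their concrete forms are scheme-dependent.

```python
import numpy as np

def aircomp_receive(x_list, gamma, sigma_z, rng):
    """Simulate the AirComp uplink of eq. (3): y = sum_i sqrt(gamma_i) h_i x_i + z.

    x_list : list of M encoded complex vectors x_i (all of the same length)
    gamma  : length-M array of average channel gains gamma_i
    sigma_z: noise standard deviation (noise floor sigma_z^2)
    """
    L = len(x_list[0])
    y = np.zeros(L, dtype=complex)
    for x_i, g_i in zip(x_list, gamma):
        # Instantaneous channel h_i ~ CN(0, 1), flat over one transmission
        h_i = (rng.standard_normal() + 1j * rng.standard_normal()) / np.sqrt(2)
        y += np.sqrt(g_i) * h_i * x_i
    z = sigma_z * (rng.standard_normal(L) + 1j * rng.standard_normal(L)) / np.sqrt(2)
    return y + z
```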
Note that, in the training step, the BS has to share a message containing a few hyper-parameters with the nodes. To enable channel estimation at the nodes, the BS broadcasts this message with digital encoding and sufficient transmission power; we assume that the nodes can decode it correctly. Further, assuming channel reciprocity, the nodes can estimate the instantaneous channel state information (CSI) $\sqrt{\gamma_i}\, h_i$ from the broadcast downlink signals. In contrast, we consider two situations for the BS: (a) global CSI and (b) statistical CSI (i.e., only $\gamma_i$ is available). This condition affects the AirComp design in the training step (see Sect. IV-A).
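The concrete node-side processing for the two cases is specified in Sect. IV-A. Purely as an illustration of case (a), one common AirComp strategy (not necessarily the one adopted in this paper) is channel-inversion precoding, in which each node scales its signal by the inverse of its own effective channel so that the superimposed signal at the BS becomes an unscaled sum of the messages; the power-control factor eta below is a hypothetical common scaling chosen to respect the per-node power constraint.

```python
import numpy as np

def channel_inversion_encode(s_i, g_i, h_i, eta):
    """Precode node i's message so its channel contribution cancels.

    With x_i = sqrt(eta) * s_i / (sqrt(g_i) * h_i), the received term
    sqrt(g_i) * h_i * x_i in eq. (3) reduces to sqrt(eta) * s_i, so the BS
    observes sqrt(eta) * sum_i s_i + z.  eta must be chosen small enough
    that every node satisfies ||x_i||^2 <= P_max.
    """
    return np.sqrt(eta) * s_i / (np.sqrt(g_i) * h_i)
```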
III. GAUSSIAN PROCESS REGRESSION
Before explaining the proposed method, this section introduces the full GPR and its distributed version based on the product of experts [2]. Note that, for simplicity, this section assumes $n_{\mathrm{test}} = 1$ and denotes the test input as $\mathbf{x}_*$.
A. Full GPR
Consider a situation where the BS has the full dataset $\mathcal{D}$ and performs the exact GPR. From the full dataset $\mathcal{D}$, we define $\mathbf{y} = \{y_{i,k} \mid \forall(i,k)\}$ and $\mathbf{X} = \{\mathbf{x}_{i,k} \mid \forall(i,k)\}$. GPR first needs to tune the hyper-parameters $\boldsymbol{\theta} = \{\boldsymbol{\psi}, \sigma\}$, where $\boldsymbol{\psi}$ is the hyper-parameter vector of a kernel function $k$. Finding $\boldsymbol{\theta}$ can be realized by maximizing the log-marginal likelihood,
$$\log p(\mathbf{y} \mid \mathbf{X}, \boldsymbol{\theta}) = -\frac{1}{2}(\mathbf{y}-\mathbf{m})^{\mathrm{T}}\left(\mathbf{K}+\sigma^2\mathbf{I}\right)^{-1}(\mathbf{y}-\mathbf{m}) - \frac{1}{2}\log\det\left(\mathbf{K}+\sigma^2\mathbf{I}\right) - \frac{N}{2}\log 2\pi, \qquad (4)$$
where $\mathbf{I}$ is the $N \times N$ identity matrix and $\mathbf{K} \in \mathbb{R}^{N \times N}$ is the kernel matrix whose elements are $K_{ij} = k(\mathbf{x}_i, \mathbf{x}_j)$ ($\mathbf{x}_i$ is the $i$-th element in $\mathbf{X}$). Further, $\mathbf{m}$ is a vector with $N$ elements, whose $i$-th element $m(\mathbf{x}_i)$ is the prior mean at $\mathbf{x}_i$$^{1}$.
Based on the vector $\boldsymbol{\theta}$, the full GPR predicts the distribution of the output at the test input $\mathbf{x}_*$ as a Gaussian distribution with mean ($\mathbb{E}[f(\mathbf{x}_*)] = \mu(\mathbf{x}_*)$) and variance ($\mathrm{Var}[f(\mathbf{x}_*)] = \sigma^2(\mathbf{x}_*)$) given by the following equations, respectively:
$$\mu(\mathbf{x}_*) = m(\mathbf{x}_*) + \mathbf{k}_*^{\mathrm{T}}\left(\mathbf{K}+\sigma^2\mathbf{I}\right)^{-1}(\mathbf{y}-\mathbf{m}), \qquad (5)$$
$$\sigma^2(\mathbf{x}_*) = k_{**} - \mathbf{k}_*^{\mathrm{T}}\left(\mathbf{K}+\sigma^2\mathbf{I}\right)^{-1}\mathbf{k}_*. \qquad (6)$$
$^{1}$For example, the vector $\mathbf{m}$ is given by $m(\mathbf{x}_1) = m(\mathbf{x}_2) = \cdots = m(\mathbf{x}_N) = \frac{1}{N}\sum_{i=1}^{N} y_i$, where $y_i$ is the $i$-th element in $\mathbf{y}$.
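Putting (4)–(6) together, the sketch below evaluates the log-marginal likelihood for hyper-parameter tuning and the predictive mean and variance, here for a batch of test inputs rather than the single test input assumed above. The RBF kernel, the empirical-mean prior of footnote 1, and the Cholesky factorization (used instead of an explicit matrix inverse) are implementation choices of this illustration, not specifications from the paper.

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

def rbf_kernel(X1, X2, amp, ls):
    """k(x, x') = amp^2 exp(-||x - x'||^2 / (2 ls^2)); X1, X2 are (n, d) arrays."""
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return amp**2 * np.exp(-0.5 * d2 / ls**2)

def full_gpr(X, y, X_test, amp, ls, sigma):
    """Exact GPR: log-marginal likelihood (4) and prediction (5)-(6)."""
    N = len(y)
    m = np.full(N, y.mean())                       # prior mean vector (footnote 1)
    K = rbf_kernel(X, X, amp, ls) + sigma**2 * np.eye(N)
    c, low = cho_factor(K)                         # O(N^3) factorization of K + sigma^2 I
    alpha = cho_solve((c, low), y - m)             # (K + sigma^2 I)^{-1} (y - m)

    # Log-marginal likelihood, eq. (4)
    log_ml = (-0.5 * (y - m) @ alpha
              - np.log(np.diag(c)).sum()           # = -(1/2) log det(K + sigma^2 I)
              - 0.5 * N * np.log(2.0 * np.pi))

    # Predictive mean (5) and variance (6) at the test inputs
    k_star = rbf_kernel(X, X_test, amp, ls)        # N x n_test
    k_ss = rbf_kernel(X_test, X_test, amp, ls)
    mu = y.mean() + k_star.T @ alpha
    var = np.diag(k_ss - k_star.T @ cho_solve((c, low), k_star))
    return log_ml, mu, var
```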