Graph Reinforcement Learning-based CNN
Inference Offloading in Dynamic Edge Computing
Nan Li, Alexandros Iosifidis and Qi Zhang
DIGIT, Department of Electrical and Computer Engineering, Aarhus University.
Email: {linan, ai, qz}@ece.au.dk
Abstract—This paper studies the computational offloading of CNN inference in dynamic multi-access edge computing (MEC) networks. To address the uncertainties in communication time and the edge servers' available capacity, we use an early-exit mechanism to terminate the computation earlier so as to meet the deadline of inference tasks. We design a reward function to trade off communication, computation and inference accuracy, and formulate the offloading of CNN inference as a maximization problem whose goal is to maximize the long-term average inference accuracy and throughput. To solve this problem, we propose a graph reinforcement learning-based early-exit mechanism (GRLE), which outperforms the state-of-the-art work, deep reinforcement learning-based online offloading (DROO), and its enhanced method, DROO with early-exit mechanism (DROOE), under different dynamic scenarios. The experimental results show that GRLE achieves average accuracy up to 3.41× that of graph reinforcement learning (GRL) and 1.45× that of DROOE, which shows the advantages of GRLE for offloading decision-making in dynamic MEC.
Index Terms—Dynamic computation offloading, CNN inference, Graph reinforcement learning, Edge computing, Service reliability
I. INTRODUCTION
Advances in convolutional neural networks (CNNs) have propelled various emerging CNN-based IoT applications, such as autonomous driving and augmented reality. For time-critical IoT applications, not only high reliability but also low latency is crucial to enable reliable services and safe intelligent control [1]. To achieve higher inference accuracy, deeper CNNs with massive multiply-accumulate operations are often required. However, it is not feasible for resource-limited IoT devices to complete computationally intensive CNN inference within a stringent deadline [2].
Edge computing is a paradigm that allows IoT devices to offload computational tasks to nearby edge servers (ESs) via wireless channels [1], [3]. However, stochastic wireless channel states may cause fluctuations in communication time, so the exact task completion time is unknown in advance and varies over time [4]. To address the uncertainties in communication time, dynamic offloading methods have been proposed to strike a balance between communication and computation [5]–[8]. However, when communication takes too much time or the available computation resource at the ESs is insufficient, running an inference task through the entire pre-trained CNN model (e.g., until the end of the main branch of a CNN) cannot meet a stringent deadline. To address this issue, dynamic inference methods [9] are promising for meeting latency requirements by modifying the CNN architecture and allowing dynamic inference time at the cost of some inference accuracy.

Fig. 1: Early-exit mechanism of CNN inference (a main branch of convolutional layers, with early-exit 1 and early-exit 2 attached via fully-connected layers)
The early-exit mechanism [10] is a dynamic inference method that terminates the inference at an early stage, as shown in Fig. 1. Unlike conventional CNN inference, an early-exit architecture equips a CNN with intermediate classifiers that provide different accuracy-latency trade-offs along its depth. The goal of an early-exit architecture is to provide adaptive accuracy-latency behavior so that each inference task terminates at an appropriate exit based on its computation time budget. In other words, in case of poor channel states, an early-exit can be used to terminate the computation earlier, thereby providing the inference result within the deadline of time-critical applications at the expense of slightly lower accuracy, instead of completely missing the deadline. However, how to design an efficient task offloading scheme that trades off communication, computation and inference accuracy in dynamic multi-access edge computing (MEC) is a challenging problem and has not yet been adequately addressed.
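The exit-selection idea above can be sketched as follows. This is a minimal illustrative sketch, not the paper's algorithm: the per-exit cumulative times and accuracies are made-up numbers, and the rule simply picks the deepest exit that fits the time budget.

```python
def select_exit(exit_times, exit_accuracies, time_budget):
    """Return (exit_index, accuracy) of the deepest early-exit whose
    cumulative computation time fits within time_budget, or None if
    even the first exit would miss the deadline."""
    best = None
    for l, (t, acc) in enumerate(zip(exit_times, exit_accuracies)):
        if t <= time_budget:
            best = (l, acc)  # deeper exits give higher accuracy
        else:
            break  # cumulative times are non-decreasing with depth
    return best

# Example: three exits with increasing cost and accuracy (illustrative values).
times = [5.0, 12.0, 30.0]   # cumulative ms to reach each exit
accs = [0.71, 0.83, 0.92]   # top-1 accuracy at each exit
print(select_exit(times, accs, time_budget=15.0))  # → (1, 0.83)
```

With a budget of 15 ms, the task terminates at the second exit (index 1) rather than missing the deadline at the main branch.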
Based on the motivation above, this paper studies the
offloading problem of CNN inference tasks in dynamic MEC
networks. Our contributions are summarized as follows.
• We use an early-exit CNN architecture to provide dynamic inference, addressing the issues caused by the stochastic available computing resource of ESs and the uncertainties in communication time, thereby ensuring that inference tasks are completed within their time constraints.
• We define a reward function to strike a balance among communication, computation and inference accuracy. We then model CNN inference offloading as a maximization problem whose goal is to maximize the long-term average inference accuracy and throughput.
• We design a graph reinforcement learning-based early-exit mechanism (GRLE) to make optimal offloading decisions. The experimental results show that GRLE achieves better performance than the state-of-the-art work, deep reinforcement learning-based online offloading (DROO) [8], and its enhanced method, DROO with early-exit (DROOE), under different dynamic scenarios. In our experiments, GRLE achieves average accuracy up to 3.41× that of graph reinforcement learning (GRL) and 1.45× that of DROOE, which demonstrates that GRLE is effective for offloading decision-making in dynamic MEC.

978-1-6654-3540-6/22 © 2022 IEEE
arXiv:2210.13464v1 [cs.LG] 24 Oct 2022
II. RELATED WORK
Due to the time-varying nature of wireless channels, it is crucial to make effective offloading decisions to ensure QoS in edge computing paradigms. Guo et al. [5] proposed a heuristic search-based energy-efficient dynamic offloading scheme to minimize the energy consumption and completion time of tasks with strict deadlines. Tran et al. [6] designed a heuristic search method that jointly minimizes task completion time and energy consumption by iteratively adjusting the offloading decision and resource allocation. However, heuristic search methods require accurate input information, which makes them unsuitable for dynamic MEC.
Reinforcement learning (RL) is a holistic learning paradigm that interacts with the dynamic MEC environment to maximize long-term rewards. Li et al. [7] proposed deep RL (DRL)-based optimization methods to address the dynamic computational offloading problem. However, applying DRL directly to this problem is inefficient in practical deployments, because offloading algorithms typically require many iterations to search for an effective strategy in unseen scenarios. Huang et al. [8] proposed DROO, which significantly improves convergence speed through efficient scaling strategies and direct learning of offloading decisions. However, the DNN used in DROO can only handle Euclidean data, which makes it ill-suited to the graph-structured data of MEC. In addition, none of the above methods provides dynamic inference, so they lack the flexibility to make good use of the available computation resources under stringent latency constraints.
III. SYSTEM MODEL
This paper studies computation offloading in an MEC network consisting of $M$ IoT devices and $N$ ESs. The sets of IoT devices and ESs are denoted as $\mathcal{M} = \{1, 2, \cdots, M\}$ and $\mathcal{N} = \{1, 2, \cdots, N\}$, respectively. At each time slot $k \in \mathcal{K}$, each IoT device generates a computational task that has to be processed within its deadline, where $\mathcal{K} = \{1, 2, \cdots, K\}$. The length of each time slot $k$ is assumed to be $\tau$. The computational task of an IoT device is assumed to use a CNN with $L$ convolutional layers (CLs), denoted as $\mathcal{L} = \{1, 2, \cdots, L\}$. Each IoT device can offload its inference task to one ES through a wireless channel, and each ES can serve multiple IoT devices in each time slot. In the case of a poor wireless channel state or insufficient available computation resource at the ESs, an ES can use the early-exit mechanism to terminate the computation earlier to meet the deadline of an inference task. After completing an inference task, the ES sends the result back to the IoT device. To perform computational offloading, two decisions should be considered: (1) to which ES an IoT device should offload its task; (2) until which early-exit the ES should perform the task, based on the time constraint.
A. Communication time
Computation offloading involves delivering the task data and its inference result between an IoT device and an ES. We assume that a task of IoT device $m$ is offloaded to ES $n$ at time slot $k$. The task information is expressed as $\theta^k_{m,n} = \left(d^k_{m,n}, \delta^k_{m,n}, r^k_{m,n}\right)$, where $d^k_{m,n}$ and $\delta^k_{m,n}$ are the task size and latency requirement, respectively, and $r^k_{m,n}$ is the uplink transmission data rate from IoT device $m$ to ES $n$. Therefore, the transmission delay of an inference task from IoT device $m$ to ES $n$ is denoted as
$$t^{\mathrm{com}}_{m,n}(k) = \alpha^k_{m,n} d^k_{m,n} / r^k_{m,n} \quad (1)$$
where $\alpha^k_{m,n} \in \{0,1\}$ is a binary variable indicating whether the task of device $m$ is offloaded to ES $n$ at time slot $k$. As each inference task can only be offloaded to one ES, we have
$$\sum_{n \in \mathcal{N}} \alpha^k_{m,n} = 1, \quad \forall m \in \mathcal{M}. \quad (2)$$
In addition, as the output of a CNN is very small, often a single number or value representing the classification or detection result, the transmission delay of the feedback is negligible.
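Eq. (1) can be illustrated with a short sketch. The task size and uplink rate below are made-up numbers; `alpha_mn` encodes the indicator variable $\alpha^k_{m,n}$, and constraint (2) simply means exactly one ES has `alpha_mn == 1` for each task.

```python
def transmission_delay(alpha_mn, task_size_bits, uplink_rate_bps):
    """Transmission delay of Eq. (1): t_com = alpha * d / r.
    alpha_mn is 1 if the task is offloaded to this ES, else 0."""
    return alpha_mn * task_size_bits / uplink_rate_bps

# Device m offloads a 1.6 Mbit task to ES n over a 20 Mbit/s uplink.
print(transmission_delay(1, 1.6e6, 20e6))  # → 0.08 (seconds)
```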
B. Computation time
During computation, an ES selects exactly one early-exit for each offloaded task. Using a binary variable $\beta^k_{m,n,l} \in \{0,1\}$ to denote whether ES $n$ performs device $m$'s task until early-exit $l$, we have
$$\sum_{l \in \mathcal{L}} \beta^k_{m,n,l} = 1, \quad \forall m \in \mathcal{M} \text{ and } n \in \mathcal{N}. \quad (3)$$
Assuming that ES $n$ performs an inference task until early-exit $l$, the computation time and inference accuracy are denoted as $t^{\mathrm{cmp}}_{n,l}$ and $\phi_l$, respectively. Therefore, the computation time of $m$'s task on ES $n$ is expressed as
$$t^{\mathrm{cmp}}_{m,n}(k) = \beta^k_{m,n,l} \, t^{\mathrm{cmp}}_{n,l}. \quad (4)$$
Correspondingly, the achieved inference accuracy of the task generated by IoT device $m$ at time slot $k$ is
$$\Phi_{m,n}(k) = \beta^k_{m,n,l} \, \phi_l. \quad (5)$$
We assume each ES processes inference tasks on a first-come-first-served basis; that is, ES $n$ can start to process a newly arrived task only after it has processed all previously arrived tasks. Assume that device $m$'s task generated at time slot $k$ is offloaded to ES $n$; device $m$ can start transmitting this task only after completing the transmission of its previous tasks. Since the propagation time is negligible, the task generated by device $m$ at time slot $k$ arrives at ES $n$ at time instant $T^a_{m,n}(k)$, which can be expressed as
$$T^a_{m,n}(k) = \begin{cases} t^{\mathrm{com}}_{m,n}(k) & k = 1, \\ \max\!\left(T^a_{m,n'}(k-1),\, (k-1)\tau\right) + t^{\mathrm{com}}_{m,n}(k) & k \neq 1, \; n' \in \mathcal{N}. \end{cases} \quad (6)$$
Correspondingly, the waiting time at ES $n$ of device $m$'s task generated at time slot $k$, $t^w_{m,n}(k)$, can be calculated as (7), where $\mathbb{1}(\cdot)$ is an indicator function showing whether device $m'$'s task arrives at ES $n$ before device $m$'s task.
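The arrival-time recursion of Eq. (6) can be sketched for a single device. This is a simplified illustration with made-up delays: it tracks only the device's previous task arrival instant (in the paper that previous task may have gone to a different ES $n'$), and a task's transmission starts at the later of its generation time $(k-1)\tau$ and the completion of the previous transmission.

```python
def arrival_times(tcom, tau):
    """tcom[k-1] is the transmission delay of the task generated in slot k.
    Returns the arrival instant of each task at its ES, following Eq. (6)."""
    T = []
    for k, t in enumerate(tcom, start=1):
        if k == 1:
            T.append(t)
        else:
            # Transmission of task k starts no earlier than its generation
            # time (k-1)*tau, and no earlier than the previous task's arrival.
            T.append(max(T[-1], (k - 1) * tau) + t)
    return T

print(arrival_times([0.4, 0.3, 0.9], tau=0.5))  # ≈ [0.4, 0.8, 1.9]
```

In the example, the second task is generated at $\tau = 0.5$, after the first transmission finishes at 0.4, so its transmission starts at its generation time; the third task waits until the second arrives at 1.0 before transmitting.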