

Graph Reinforcement Learning-based CNN Inference Offloading in Dynamic Edge Computing

Nan Li, Alexandros Iosifidis and Qi Zhang

DIGIT, Department of Electrical and Computer Engineering, Aarhus University.

Email: {linan, ai, qz}@ece.au.dk

Abstract: This paper studies the computational offloading of CNN inference in dynamic multi-access edge computing (MEC) networks. To address the uncertainties in communication time and in the available capacity of edge servers, we use an early-exit mechanism to terminate the computation early so that inference tasks meet their deadlines. We design a reward function to trade off communication, computation and inference accuracy, and formulate the offloading problem of CNN inference as a maximization problem with the goal of maximizing the average inference accuracy and throughput in the long term. To solve the maximization problem, we propose a graph reinforcement learning-based early-exit mechanism (GRLE), which outperforms the state-of-the-art work, deep reinforcement learning-based online offloading (DROO), and its enhanced method, DROO with early-exit mechanism (DROOE), under different dynamic scenarios. The experimental results show that GRLE achieves an average accuracy up to 3.41× over graph reinforcement learning (GRL) and 1.45× over DROOE, which shows the advantages of GRLE for offloading decision-making in dynamic MEC.

Index Terms: Dynamic computation offloading, CNN inference, Graph reinforcement learning, Edge computing, Service reliability

I. INTRODUCTION

Advances in convolutional neural networks (CNNs) have propelled various emerging CNN-based IoT applications, such as autonomous driving and augmented reality. For time-critical IoT applications, not only high reliability but also low latency is crucial to enable reliable services and safe intelligent control [1]. To achieve higher inference accuracy, deeper CNNs with massive numbers of multiply-accumulate operations are often required. However, resource-limited IoT devices cannot complete computationally intensive CNNs within a stringent deadline [2].

Edge computing is a paradigm that allows IoT devices to offload computational tasks to their nearby edge servers (ESs) via wireless channels [1], [3]. However, stochastic wireless channel states may cause fluctuations in communication time, so the exact task completion time is unknown in advance and varies over time [4]. To address the uncertainties in communication time, dynamic offloading methods have been proposed to strike a balance between communication and computation [5]–[8]. However, when communication takes too much time or the available computation resource at the ESs is insufficient, an inference task cannot meet a stringent deadline if it runs through the entire pre-trained CNN model (e.g., until the end of the main branch of a CNN). To address this issue, dynamic inference methods [9] are promising: they meet latency requirements by modifying the CNN architecture and allowing a dynamic inference time at the cost of inference accuracy.

Fig. 1: Early-exit mechanism of CNN inference (legend: convolutional layer, fully-connected layer; labeled branches: Early-exit 1, Early-exit 2, Main branch)

The early-exit mechanism [10] is a dynamic inference method that terminates the inference at an early stage, as shown in Fig. 1. Unlike conventional CNN inference, an early-exit architecture enables the intermediate classifiers of a CNN to provide different accuracy-latency trade-offs along its depth. The goal of an early-exit architecture is to provide adaptive accuracy-latency behavior so that each inference task terminates at an appropriate exit based on its computation time budget. In other words, when channel states are poor, an early exit can be used to terminate the computation earlier, thereby providing the inference result within the deadline of time-critical applications at the expense of slightly lower accuracy, instead of missing the deadline completely. However, how to design an efficient task offloading scheme that trades off communication, computation and inference accuracy in dynamic multi-access edge computing (MEC) is a challenging problem that has not yet been adequately addressed.
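The accuracy-latency trade-off of an early-exit architecture can be illustrated with a minimal sketch. It is not the learning-based mechanism proposed in this paper, only a greedy rule that picks the most accurate exit whose computation time still fits the remaining budget. The function name `pick_exit`, the exit profile and all numbers are hypothetical and chosen purely for illustration.

```python
from typing import Optional, Sequence


def pick_exit(exit_times: Sequence[float],
              exit_accuracies: Sequence[float],
              budget: float) -> Optional[int]:
    """Return the index of the most accurate exit that fits the budget.

    exit_times[i] is the (hypothetical) computation time needed to reach
    exit i, and exit_accuracies[i] is the accuracy obtained at that exit.
    Returns None when no exit can finish within the budget.
    """
    best = None
    for i, (t, acc) in enumerate(zip(exit_times, exit_accuracies)):
        if t <= budget and (best is None or acc > exit_accuracies[best]):
            best = i
    return best


# Hypothetical profile: two early exits followed by the main branch.
times = [2.0, 5.0, 9.0]          # ms, illustrative values only
accuracies = [0.60, 0.75, 0.90]  # illustrative values only

# The computation budget is what remains of the deadline after communication.
deadline, comm_time = 10.0, 4.0  # ms, illustrative values only
chosen = pick_exit(times, accuracies, deadline - comm_time)
print(chosen)
```

With a poor channel (a larger `comm_time`), the budget shrinks and the rule falls back to an earlier, less accurate exit rather than missing the deadline.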

Based on the motivation above, this paper studies the offloading problem of CNN inference tasks in dynamic MEC networks. Our contributions are summarized as follows.

• We use an early-exit CNN architecture to provide dynamic inference, which addresses the issues caused by the stochastic available computing resource of ESs and the uncertainties in communication time, thereby ensuring that each inference task is completed within its time constraint.

• We define a reward function to strike a balance among communication, computation and inference accuracy. We then model the CNN inference offloading problem as a maximization problem that maximizes the average inference accuracy and throughput in the long term.

• We design a graph reinforcement learning-based early-exit mechanism (GRLE) to make optimal offloading decisions. The experimental results show that GRLE outperforms the state-of-the-art work, deep reinforcement learning-based online offloading (DROO) [8], and its enhanced method, DROO with early-exit (DROOE), under different dynamic scenarios. In our experiments, GRLE achieves an average accuracy up to 3.41× over graph reinforcement learning (GRL) and 1.45× over DROOE, which demonstrates that GRLE is effective in making offloading decisions in dynamic MEC.

arXiv:2210.13464v1 [cs.LG] 24 Oct 2022

II. RELATED WORK

Due to the time-varying nature of wireless channels, it is crucial to make effective offloading decisions to ensure QoS in edge computing paradigms. Guo et al. [5] proposed a heuristic search-based energy-efficient dynamic offloading scheme to minimize the energy consumption and completion time of tasks with strict deadlines. Tran et al. [6] designed a heuristic search method that optimizes task offloading and resource allocation to jointly minimize task completion time and energy consumption by iteratively adjusting the offloading decision. However, heuristic search methods require accurate input information, which makes them inapplicable to dynamic MEC.

Reinforcement learning (RL) is a holistic learning paradigm that interacts with the dynamic MEC environment to maximize long-term rewards. Li et al. [7] proposed deep RL (DRL)-based optimization methods to address the dynamic computational offloading problem. However, applying DRL directly to the problem is inefficient in practical deployments because offloading algorithms typically require many iterations to find an effective strategy for unseen scenarios. Huang et al. [8] proposed DROO to significantly improve the convergence speed through efficient scaling strategies and direct learning of offloading decisions. However, the DNN used in DROO can only handle Euclidean data, which makes it poorly suited to the graph-structured data of MEC. In addition, none of the above methods provides dynamic inference, so they lack the flexibility to make good use of whatever computation resource is available under stringent latency requirements.

III. SYSTEM MODEL

This paper studies computation offloading in an MEC network that consists of $M$ IoT devices and $N$ ESs. The sets of IoT devices and ESs are denoted as $\mathcal{M}=\{1,2,\cdots,M\}$ and $\mathcal{N}=\{1,2,\cdots,N\}$, respectively. At each time slot $k\in\mathcal{K}$, where $\mathcal{K}=\{1,2,\cdots,K\}$, each IoT device generates a computational task that has to be processed within its deadline. The length of each time slot $k$ is assumed to be $\tau$. The computational task of an IoT device is assumed to use a CNN with $L$ convolutional layers (CLs), which are denoted as $\mathcal{L}=\{1,2,\cdots,L\}$. Each IoT device can offload its inference task to one ES through a wireless channel, and each ES can serve multiple IoT devices at each time slot. In the case of a poor wireless channel state or insufficient available computation resource at the ESs, an ES can use the early-exit mechanism to terminate the computation early to meet the deadline of an inference task. After completing an inference task, the ES sends the result back to the IoT device. To perform computational offloading, two decisions should be made: (1) to which ES an IoT device should offload its task; (2) up to which early exit the ES should run the task given the time constraint.

A. Communication time

Computation offloading involves delivering the task data and its inference result between the IoT device and the ES. We assume that a task of IoT device $m$ is offloaded to ES $n$ at time slot $k$. The task information is expressed as $\theta^k_{m,n} = \langle d^k_{m,n}, \delta^k_{m,n}, r^k_{m,n} \rangle$, where $d^k_{m,n}$ and $\delta^k_{m,n}$ are the task size and latency requirement, respectively, and $r^k_{m,n}$ is the uplink transmission data rate from IoT device $m$ to ES $n$. Therefore, the transmission delay of an inference task from IoT device $m$ to ES $n$ is denoted as

$$t^{com}_{m,n}(k) = \alpha^k_{m,n} d^k_{m,n} / r^k_{m,n} \tag{1}$$

where $\alpha^k_{m,n} \in \{0,1\}$ is a binary variable that indicates whether the task of device $m$ is offloaded to ES $n$ at time slot $k$. As each inference task can only be offloaded to one ES, we have

$$\sum_{n\in\mathcal{N}} \alpha^k_{m,n} = 1, \quad \forall m\in\mathcal{M}. \tag{2}$$

In addition, as the output of a CNN is very small, often a number or value that represents the classification or detection result, the transmission delay of the feedback is negligible.
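Equations (1) and (2) can be checked numerically with a short sketch. The function names, the 2-device, 2-ES instance and the units (Mbit, Mbit/s) below are assumptions made for illustration and do not come from the paper.

```python
from typing import List

Matrix = List[List[float]]


def transmission_delay(alpha: List[List[int]], d: Matrix, r: Matrix) -> Matrix:
    """Eq. (1): t_com[m][n] = alpha[m][n] * d[m][n] / r[m][n] for one slot k."""
    return [
        [a * size / rate for a, size, rate in zip(a_row, d_row, r_row)]
        for a_row, d_row, r_row in zip(alpha, d, r)
    ]


def is_valid_assignment(alpha: List[List[int]]) -> bool:
    """Eq. (2): every entry is binary and each device picks exactly one ES."""
    return all(
        all(a in (0, 1) for a in row) and sum(row) == 1
        for row in alpha
    )


# Hypothetical instance: rows are devices m, columns are ESs n.
alpha = [[1, 0],
         [0, 1]]
d = [[8.0, 8.0],   # task size in Mbit (illustrative)
     [4.0, 4.0]]
r = [[4.0, 2.0],   # uplink rate in Mbit/s (illustrative)
     [1.0, 2.0]]

assert is_valid_assignment(alpha)
delays = transmission_delay(alpha, d, r)
print(delays)
```

Only the entry for the selected ES in each row is nonzero, because $\alpha^k_{m,n}$ zeroes out the delay toward every ES the device does not use.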

B. Computation time

During computation, an ES selects exactly one early exit at which to perform inference for each offloaded task. We use a binary variable $\beta^k_{m,n,l} \in \{0,1\}$ to denote whether ES $n$ performs device $m$'s task until early-exit $l$, so that

$$\sum_{l\in\mathcal{L}} \beta^k_{m,n,l} = 1, \quad \forall m\in\mathcal{M} \text{ and } n\in\mathcal{N}. \tag{3}$$

Assuming that ES $n$ performs an inference task until early-exit $l$, the computation time and inference accuracy are denoted as $t^{cmp}_{n,l}$ and $\phi_l$, respectively. Therefore, the computation time of device $m$'s task on ES $n$ is expressed as

$$t^{cmp}_{m,n}(k) = \beta^k_{m,n,l}\, t^{cmp}_{n,l}. \tag{4}$$

Correspondingly, the achieved inference accuracy of the task generated by IoT device $m$ at time slot $k$ is

$$\Phi_{m,n}(k) = \beta^k_{m,n,l}\, \phi_l. \tag{5}$$
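As a small worked instance of (3) to (5), the sketch below takes the one-hot exit indicator of a single device-ES pair and returns the resulting computation time and accuracy. The function name and the per-exit profile are hypothetical; in the paper the values $t^{cmp}_{n,l}$ and $\phi_l$ would come from the actual CNN and ES.

```python
from typing import List, Tuple


def computation_time_and_accuracy(beta_mn: List[int],
                                  t_cmp_n: List[float],
                                  phi: List[float]) -> Tuple[float, float]:
    """Evaluate Eqs. (4) and (5) for one device m on one ES n.

    beta_mn[l] is the binary exit indicator, t_cmp_n[l] the computation
    time on ES n up to exit l, and phi[l] the accuracy at exit l.
    """
    # Eq. (3): exactly one exit must be selected.
    if any(b not in (0, 1) for b in beta_mn) or sum(beta_mn) != 1:
        raise ValueError("beta_mn must be one-hot over the exits")
    l = beta_mn.index(1)
    return beta_mn[l] * t_cmp_n[l], beta_mn[l] * phi[l]


# Hypothetical profile with three exits (illustrative values only).
beta_mn = [0, 1, 0]
t_cmp_n = [3.0, 6.0, 10.0]  # ms
phi = [0.60, 0.75, 0.90]

t_cmp, accuracy = computation_time_and_accuracy(beta_mn, t_cmp_n, phi)
print(t_cmp, accuracy)
```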

We assume each ES processes inference tasks on a first-come-first-served basis. Namely, ES $n$ can start to process a newly arrived task only after it has processed all previously arrived tasks. Assume that device $m$'s task generated at time slot $k$ is offloaded to ES $n$; device $m$ can start transmitting this task only after completing the transmission of its previous tasks. Since the propagation time is negligible, the task generated by device $m$ at time slot $k$ arrives at ES $n$ at time instant $T^a_{m,n}(k)$, which can be expressed as

$$T^a_{m,n}(k) = \begin{cases} t^{com}_{m,n}(k), & k = 1, \\ \max\{T^a_{m,n'}(k-1),\, (k-1)\tau\} + t^{com}_{m,n}(k), & k \neq 1,\ n' \in \mathcal{N}. \end{cases} \tag{6}$$

Correspondingly, the waiting time at ES $n$ of device $m$'s task generated at time slot $k$, $t^w_{m,n}(k)$, can be calculated as (7), where $\mathbb{1}(\cdot)$ is an indicator function that shows whether device $m'$'s task arrives at ES $n$ before device $m$'s task.
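The arrival-time recursion in (6) can be traced for a single device with a minimal sketch. It assumes that $n'$ in (6) is whichever ES received the device's previous task, so only the sequence of per-slot transmission delays matters; the function name and the numbers are hypothetical.

```python
from typing import List


def arrival_times(t_com: List[float], tau: float) -> List[float]:
    """Eq. (6) for one device: arrival instants of its tasks at the chosen ESs.

    t_com[k - 1] is the transmission delay of the task generated in slot k
    (slots are 1-indexed as in the text), and tau is the slot length.
    """
    arrivals: List[float] = []
    for k, delay in enumerate(t_com, start=1):
        if k == 1:
            arrivals.append(delay)
        else:
            # Transmission starts once the previous task has fully arrived
            # and slot k has begun at time (k - 1) * tau.
            start = max(arrivals[-1], (k - 1) * tau)
            arrivals.append(start + delay)
    return arrivals


# Illustrative values only: slot length 1.0 and three consecutive tasks.
result = arrival_times([0.5, 1.5, 0.25], tau=1.0)
print(result)
```

In this example the second task takes longer than one slot to transmit, so the third task cannot start at the beginning of its slot and its arrival is pushed back accordingly.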
