
Deep Learning for Wireless Networked Systems: A Joint Estimation-Control-Scheduling Approach

Zihuai Zhao, Wanchun Liu*, Member, IEEE, Daniel E. Quevedo, Fellow, IEEE, Yonghui Li, Fellow, IEEE, and Branka Vucetic, Fellow, IEEE

Abstract: A wireless networked control system (WNCS), which connects sensors, controllers, and actuators via wireless communications, is a key enabling technology for highly scalable and low-cost deployment of control systems in the Industry 4.0 era. Despite the tight interaction of control and communications in WNCSs, most existing works adopt separate design approaches. This is mainly because the co-design of control-communication policies requires large and hybrid state and action spaces, making the optimization problem mathematically intractable and difficult to solve effectively with classical algorithms. In this paper, we systematically investigate deep learning (DL)-based estimator-control-scheduler co-design for a model-unknown nonlinear WNCS over wireless fading channels. In particular, we propose a co-design framework that is aware of the sensor's age-of-information (AoI) states and the dynamic channel states. We propose a novel deep reinforcement learning (DRL)-based algorithm for controller and scheduler optimization that utilizes both model-free and model-based data. An AoI-based importance sampling algorithm that takes into account the data accuracy is proposed to enhance learning efficiency. We also develop novel schemes for enhancing the stability of joint training. Extensive experiments demonstrate that the proposed joint training algorithm can effectively solve the estimation-control-scheduling co-design problem in various scenarios and provides significant performance gains compared to separate design and some benchmark policies.

Index Terms: Wireless networked control systems, control-communications co-design, age of information, deep reinforcement learning, task-oriented communications.

I. INTRODUCTION

With the rapid development of industrial applications in the Fourth Industrial Revolution, such as smart manufacturing, smart cities, smart grids, e-commerce warehouses, and industrial automation systems, the wireless networked control system (WNCS) has been considered a key solution for the highly scalable and low-cost deployment of ubiquitous automatic control systems [1]. A typical WNCS consisting of plants, sensors, actuators, and a controller is illustrated in Fig. 1. In the feedback control loop of the WNCS, the sensors measure the plant states and send them via uplink channels to the controller, which processes them and generates control signals; these signals are then sent to the actuators for execution via downlink channels.

In principle, WNCS design is highly interdisciplinary: it involves signal processing for plant state estimation, control theory for optimally regulating the plant behavior, and communication theory for reliably transmitting the sensor and controller signals under limited communication resources. Since both estimation and control rely on the information delivered by the communication system, a WNCS design that achieves optimal control performance should jointly take into account the estimation, control, and communication algorithms, which tightly interact with each other. Ideally, those algorithms should be jointly designed to optimize the control performance of WNCSs under resource constraints.

Fig. 1: A wireless networked control system (WNCS). [Diagram labels: navigation, pick & place, SLAM, delivery; plants, sensors, actuators, controller; uplink channels, downlink channels.]

Z. Zhao, W. Liu, Y. Li, and B. Vucetic are with the School of Electrical and Information Engineering, The University of Sydney, Australia. Emails: {zihuai.zhao, wanchun.liu, yonghui.li, branka.vucetic}@sydney.edu.au. D. Quevedo is with the School of Electrical Engineering and Robotics, Queensland University of Technology (QUT), Brisbane, Australia. Email: dquevedo@ieee.org. W. Liu is the corresponding author.

Although the concept of control-communication co-design in WNCSs was proposed decades ago (see [1] and references therein), most related works from different research communities were built on the separate design principle. The communications community focuses solely on improving communication performance, such as data rate, latency, and reliability, without taking into account the WNCS dynamics or control performance [2], [3]. Although ultra-reliable low-latency communications have been proposed in the 5G era for mission-critical control applications, the prevailing design principle is standalone and not tailored to any control application: the control performance is not treated as a design objective [4]. On the other hand, the control community's effort on WNCSs mainly focuses on control (and estimation) algorithm design with predetermined communication policies (see [5] and its follow-up works). The recent works on control-communication co-design for WNCSs can be categorized into two streams: control-aware communication design (Stream 1) and control-communication policy co-design (Stream 2).

In Stream 1, communication protocols are optimized to achieve the best control performance or to satisfy certain control-related constraints. In [6], [7], transmission scheduling and power allocation problems of WNCSs were investigated for minimizing the overall transmission power consumption while guaranteeing a certain control performance. In [8], a control-aware scheduler design problem was considered based on the IEEE 802.15.4 communication protocol. In [9], a communication protocol with variable packet length was proposed and optimized for achieving the best control performance. In [10], a transmission power allocation problem of a WNCS with a coding-free communication protocol was investigated, aiming to achieve optimal overall control performance. In [11], a novel framework was developed for jointly optimizing the communication design parameters to achieve the best control performance. Note that all those works are restricted to linear dynamical systems with linear control laws. For remote state estimation of linear WNCSs, transmission scheduling problems have drawn significant attention. In [12]–[17], optimal scheduling policies were investigated for various system setups to minimize average estimation errors.

arXiv:2210.00673v1 [eess.SY] 3 Oct 2022

In Stream 2, both the control and communication policies are jointly optimized to achieve the best overall control performance. Stream 2 is more challenging because the joint policy has very large combined state and action spaces when both the control and communication domains are taken into account. In a nutshell, most co-design problems can be formulated as dynamic decision-making problems. However, with large state and action spaces, conventional solutions based on the Markov decision process cannot be applied due to the curse of dimensionality. To address this issue, most works in this stream rely on deep learning (DL) approaches with artificial neural networks (NNs) for function approximation. In [18], a deep reinforcement learning (DRL) approach was adopted to learn both the control and the transmission scheduling signals. In particular, DRL combines artificial NNs with a reinforcement learning framework that helps software agents learn how to solve decision-making problems and reach their goals. In [19], both the control policy and the dynamic transmission power allocation policy were jointly optimized based on DRL. It is worth noting that those DRL-based algorithms are model-free and can be applied to practical WNCS scenarios without accurate knowledge of the nonlinear system (plant) model, whereas the conventional solutions are purely model-based.

There are still many open problems in the area of control-communications policy co-design with unknown nonlinear system models. Many existing works, such as [18], [19], assume that the sensor measurements are perfect and that the (uplink) communications between the sensor and the controller are error-free. Under such an assumption, the controller has an accurate plant state in real time for generating control signals. With a practical uplink channel, the controller does not always know the plant state and thus needs state estimation. This requires estimation-control co-design. A key aspect is that the estimation quality significantly depends on the age of the sensor's information available to the estimator, which measures the time elapsed since the controller last received a packet from the sensor. Due to system dynamics and uncertainties, a larger age-of-information (AoI) of the sensor indicates a less reliable state estimate. For real-time control applications, an estimate with a small AoI is more important than one with a large AoI. Such information about the data importance needs to be taken into account in the controller's training. We note that the analysis and optimization of AoI in different communication networks have drawn significant attention during the past five years [20]. However, how to leverage the AoI of sensor data for effectively training a controller has not been considered before. Furthermore, when considering DL-based estimator-control-communication co-design, one needs to systematically design a joint training algorithm that is efficient in both training time and performance, rather than training the three modules one by one. Otherwise, the resulting estimation, control, and communication policies may not converge to the desired ones, leading to poor overall control performance of the WNCS. Due to the aforementioned difficulties, joint estimator-control-communication policy learning for WNCSs has not been investigated in the open literature.

In this work, we systematically investigate a DL-based estimator-control-scheduler co-design framework for a model-unknown WNCS with nonlinear dynamic systems. We consider fading channels between the sensor and the controller and between the controller and the actuator. The major contributions are summarized as follows.

• We propose a novel DL-based WNCS framework over time-correlated fading channels. In particular, the AoI states of the sensor's information are utilized in the three modules of estimator, controller, and scheduler; both the controller and the scheduler leverage the fading channel states for decision-making. The instantaneous and historical states are utilized in each module. Co-design frameworks for WNCSs with the awareness of AoI and channel states have not been considered in the open literature.

• We develop a joint estimator-controller-scheduler training algorithm. In particular, we propose a DRL-based algorithm for controller and scheduler optimization that utilizes both the model-free data received directly from the sensor and the model-based data generated by the estimator when packet dropout occurs. An AoI-based importance sampling algorithm that takes into account the data accuracy is proposed to enhance learning efficiency. Moreover, we develop novel schemes for enhancing the stability of joint training.

• Extensive experiments built on the OpenAI Gym platform demonstrate that the proposed joint training algorithm can effectively solve the estimation-control-scheduling co-design problem in various scenarios. Remarkable performance gains are achieved compared to the separate design and some benchmark policies.

Outline: The system model of a general WNCS over fading channels is described in Section II. The estimation and control co-design problems of a low-mobility and a high-mobility WNCS are investigated in Sections III and IV, respectively. The numerical results are presented and discussed in Section V, followed by conclusions in Section VI.

Fig. 2: Estimator-controller co-design of the low-mobility WNCS. [Block diagram labels: plant, sensor, actuator, Rx, static channels; remote controller containing the intelligent controller (actor, critic), estimator, AoI counter, and histories (controller history, estimator history).]

II. SYSTEM MODEL

A. WNCS Model

We consider a wireless networked control system as shown in Fig. 2. The plant is a discrete-time nonlinear system given by

$$ s_{t+1} = f(s_t, u_t) + \nu_t, \tag{1} $$

$$ o_t = s_t + v_t, \tag{2} $$

where $s_t \in \mathbb{R}^{n_s}$ and $u_t \in \mathbb{R}^{n_u}$ are the plant state and the control input from the actuator at time $t$, respectively. In particular, the nonlinear dynamics $f: \mathbb{R}^{n_s} \times \mathbb{R}^{n_u} \to \mathbb{R}^{n_s}$ are unknown to the remote controller, and $\nu_t$ is the plant disturbance. $o_t \in \mathbb{R}^{n_s}$ is the sensor measurement of the plant state $s_t$, affected by the measurement noise $v_t \in \mathbb{R}^{n_s}$.
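As an illustration of the plant model in (1) and (2), the following minimal sketch simulates one rollout. The dynamics `f` used here (a pendulum-like update), the Gaussian noise distributions, and all parameter values are assumptions made purely for illustration; the paper treats $f$ as unknown and this excerpt does not specify the noise statistics.

```python
import math
import random


def f(s, u):
    """Hypothetical pendulum-like dynamics standing in for the unknown f."""
    theta, omega = s
    dt = 0.05  # assumed sampling period
    omega_next = omega + dt * (-9.81 * math.sin(theta) + u[0])
    theta_next = theta + dt * omega_next
    return [theta_next, omega_next]


def plant_step(s, u, rng, sigma_nu=0.01):
    """Eq. (1): s_{t+1} = f(s_t, u_t) + nu_t, with assumed Gaussian nu_t."""
    return [x + rng.gauss(0.0, sigma_nu) for x in f(s, u)]


def measure(s, rng, sigma_v=0.02):
    """Eq. (2): o_t = s_t + v_t, with assumed Gaussian v_t."""
    return [x + rng.gauss(0.0, sigma_v) for x in s]


def rollout(s0, controls, seed=0):
    """Apply a sequence of control inputs; return final state and measurements."""
    rng = random.Random(seed)
    s, observations = list(s0), []
    for u in controls:
        observations.append(measure(s, rng))  # sensor samples the current state
        s = plant_step(s, u, rng)             # plant evolves under input u
    return s, observations
```

For example, `rollout([0.1, 0.0], [[0.0]] * 5)` returns the state after five uncontrolled steps together with the five noisy measurements that the sensor would offer for transmission.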

We model the uplink channel and the downlink channel as $m$-state Markov fading channels [21]. The channel states of the uplink and the downlink are denoted by $b^U_t \in \mathcal{W}^U \triangleq \{w^U_1, \dots, w^U_m\}$ and $b^D_t \in \mathcal{W}^D \triangleq \{w^D_1, \dots, w^D_m\}$, respectively. Let $p^U_{i,j}$ and $p^D_{i,j}$ denote the channel state transition probabilities from state $i$ to $j$ of the uplink channel and the downlink channel, respectively, i.e.,

$$ \begin{aligned} p^U_{i,j} &\triangleq \mathrm{Prob}\left[b^U_{t+1} = w^U_j \mid b^U_t = w^U_i\right], \\ p^D_{i,j} &\triangleq \mathrm{Prob}\left[b^D_{t+1} = w^D_j \mid b^D_t = w^D_i\right]. \end{aligned} \tag{3} $$

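Then, the channel state transition probability matrices of the channels are

$$ M^U \triangleq \begin{bmatrix} p^U_{1,1} & \cdots & p^U_{m,1} \\ \vdots & \ddots & \vdots \\ p^U_{1,m} & \cdots & p^U_{m,m} \end{bmatrix} \tag{4} $$

and

$$ M^D \triangleq \begin{bmatrix} p^D_{1,1} & \cdots & p^D_{m,1} \\ \vdots & \ddots & \vdots \\ p^D_{1,m} & \cdots & p^D_{m,m} \end{bmatrix}. \tag{5} $$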

We assume that the instantaneous channel states, i.e., $b^U_t$ and $b^D_t$, are known to the controller through classical channel estimation schemes [22], while the dynamic channel models, i.e., $M^U$ and $M^D$, are not available.
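The Markov fading model can be sketched as a simple state sampler. In this illustrative snippet, the nested list `P` stores `P[i][j]` as the probability of moving from state `i` to state `j` (zero-based indices), so it corresponds to the transpose of the matrix layout in (4) and (5). The two-state matrix `P_UP` and its values are hypothetical and are not taken from the paper.

```python
import random


def next_channel_state(i, P, rng):
    """Sample the next state j with probability P[i][j] (transition i -> j)."""
    return rng.choices(range(len(P)), weights=P[i], k=1)[0]


def simulate_channel(P, b0, steps, seed=0):
    """Generate a channel state trajectory of length steps + 1 starting at b0."""
    rng = random.Random(seed)
    states = [b0]
    for _ in range(steps):
        states.append(next_channel_state(states[-1], P, rng))
    return states


# Hypothetical two-state (good/bad) uplink channel with time correlation:
# each row sums to one, and the large diagonal entries make states persistent.
P_UP = [
    [0.9, 0.1],
    [0.3, 0.7],
]

trajectory = simulate_channel(P_UP, b0=0, steps=20, seed=42)
```

Note that in the setting described above the controller observes only the sampled states, not `P` itself, so a simulator of this kind would sit on the environment side of a learning experiment.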

Let the binary variables $\rho^U_t \in \{0, 1\}$ and $\rho^D_t \in \{0, 1\}$ denote transmission failure (0) and success (1) of the uplink channel and the downlink channel at time $t$, respectively. The packet error probabilities at different channel states are

$$ (\cdot)^U_i \triangleq \mathrm{Prob}\left[\rho^U_t = 0 \mid b^U_t = w^U_i\right], \quad \forall i \in \{1, \dots, m\} \tag{6} $$

and

$$ (\cdot)^D_i \triangleq \mathrm{Prob}\left[\rho^D_t = 0 \mid b^D_t = w^D_i\right], \quad \forall i \in \{1, \dots, m\}. \tag{7} $$

We assume that the actuator sends the one-bit acknowledgment $\rho^D_t$ to the controller via a perfect feedback channel. This is a widely adopted assumption in wireless communications.
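The packet-dropout model in (6) and (7) amounts to a Bernoulli trial whose failure probability depends on the current channel state. The sketch below illustrates this together with a simple AoI counter. The AoI update rule (reset to 1 on a successful reception, otherwise increase by one slot) is an assumption based on the informal description of AoI in the Introduction; the exact rule and reset value are not given in this excerpt, and the names `transmit`, `update_aoi`, and `aoi_trace` are hypothetical.

```python
import random


def transmit(channel_state, error_prob, rng):
    """Return rho = 1 (success) or 0 (failure).

    error_prob[i] is the packet error probability in channel state i,
    i.e. Prob[rho = 0 | b = w_i] as in (6) and (7).
    """
    return 0 if rng.random() < error_prob[channel_state] else 1


def update_aoi(aoi, rho):
    """Assumed AoI rule: reset to 1 on success, otherwise age by one slot."""
    return 1 if rho == 1 else aoi + 1


def aoi_trace(rhos, aoi0=1):
    """AoI after each slot for a given sequence of success indicators."""
    aoi, trace = aoi0, []
    for rho in rhos:
        aoi = update_aoi(aoi, rho)
        trace.append(aoi)
    return trace
```

With this rule, two consecutive uplink failures after a success raise the AoI from 1 to 3, which is the quantity the estimator would use to judge how stale its latest measurement is.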

B. Control and Transmission Schedule

We consider both a low-mobility scenario (e.g., process control systems in factories) and a high-mobility scenario (e.g., unmanned aerial vehicles) of the WNCS. In the former scenario, the channel coherence time is much longer than each control time slot, and thus the channel state is static. Therefore, the Markov fading channels in (4) and (5) reduce to additive white Gaussian noise (AWGN) channels with constant channel states (i.e., $m = 1$ in (6) and (7)). We note that the system remains stochastic in this scenario. Due to the low mobility, sensors can often be connected to power grids, and the transmission power consumption is not a major concern. In the latter scenario, sensors are commonly powered by batteries. Because battery replacement is costly, it is of significant interest to reduce the uplink transmission rate while guaranteeing a desired level of control quality. Therefore, an uplink transmission scheduler implemented at the controller schedules the sensor's transmissions only when necessary. Let $a^{\mathrm{Tx}}_t \in \{0, 1\}$ denote the scheduling action and $S(\cdot)$ denote the scheduling function that maps input states to $a^{\mathrm{Tx}}_t$. We will discuss the input states in the following section.
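To make the role of the scheduling function concrete, the sketch below shows a hand-written placeholder for $S(\cdot)$. It is not the scheduler proposed in the paper, which is learned by DRL and whose input states are defined later. Here we simply assume, for illustration, that the scheduler sees the current AoI and the uplink channel state and requests a transmission once the AoI reaches a per-state threshold. The thresholds and function names are hypothetical.

```python
def schedule(aoi, uplink_state, thresholds):
    """Placeholder S(.): return a_tx = 1 to request a sensor transmission.

    thresholds[state] is an assumed AoI level at which a transmission is
    considered worthwhile in that channel state (higher for worse channels).
    """
    return 1 if aoi >= thresholds[uplink_state] else 0


def transmission_rate(actions):
    """Fraction of time slots in which the sensor was scheduled."""
    return sum(actions) / len(actions)


# Hypothetical thresholds: transmit early on the good channel (state 0),
# wait longer on the bad channel (state 1) to save sensor energy.
THRESHOLDS = {0: 2, 1: 4}

actions = [schedule(aoi, b, THRESHOLDS)
           for aoi, b in [(1, 0), (2, 0), (3, 1), (4, 1)]]
```

A rule of this kind makes the trade-off explicit: raising the thresholds lowers the uplink transmission rate and saves battery, at the cost of older state information at the controller.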
