Deep Learning for Wireless Networked Systems: a
joint Estimation-Control-Scheduling Approach
Zihuai Zhao, Wanchun Liu*, Member, IEEE, Daniel E. Quevedo, Fellow, IEEE, Yonghui Li, Fellow, IEEE, and
Branka Vucetic, Fellow, IEEE
Abstract—A wireless networked control system (WNCS), which connects sensors,
controllers, and actuators via wireless communications, is a key enabling
technology for highly scalable and low-cost deployment of control systems in
the Industry 4.0 era. Despite the tight interaction between control and
communications in WNCSs, most existing works adopt separative design
approaches. This is mainly because the co-design of control-communication
policies requires large and hybrid state and action spaces, making the
optimization problem mathematically intractable and difficult to solve
effectively with classic algorithms. In this paper, we systematically
investigate deep learning (DL)-based estimator-control-scheduler co-design for
a model-unknown nonlinear WNCS over wireless fading channels. In particular,
we propose a co-design framework that is aware of the sensor’s
age-of-information (AoI) states and the dynamic channel states. We propose a
novel deep reinforcement learning (DRL)-based algorithm for controller and
scheduler optimization that utilizes both model-free and model-based data. An
AoI-based importance sampling algorithm that takes into account the data
accuracy is proposed to enhance learning efficiency. We also develop novel
schemes for enhancing the stability of joint training. Extensive experiments
demonstrate that the proposed joint training algorithm can effectively solve
the estimation-control-scheduling co-design problem in various scenarios and
provide significant performance gains compared to separative design and some
benchmark policies.
Index Terms—Wireless networked control systems, control-communications
co-design, age of information, deep reinforcement learning, task-oriented
communications.
I. INTRODUCTION
With the rapid development of industrial applications in the Fourth
Industrial Revolution, such as smart manufacturing, smart cities, smart grids,
e-commerce warehouses, and industrial automation systems, the wireless
networked control system (WNCS) has come to be regarded as a key solution for
the highly scalable and low-cost deployment of ubiquitous automatic control
systems [1]. A typical WNCS consisting of plants, sensors, actuators, and a
controller is illustrated in Fig. 1. In the feedback control loop of the WNCS,
the sensors measure the plant states and send them to the controller via
uplink channels; the controller processes them and generates control signals,
which are then sent to the actuators via downlink channels for execution.
In principle, WNCS design is highly interdisciplinary: it involves signal
processing for plant state estimation, control theory for optimally regulating
the plant behavior, and communication theory for reliably transmitting the
sensor and controller signals under limited communication resources. Since
both estimation and control rely on the information delivered by the
communication system, a WNCS design that aims at optimal control performance
should jointly take into account the estimation, control, and communication
algorithms, which tightly interact with each other. Ideally, those algorithms
should be jointly designed to optimize the control performance of WNCSs under
resource constraints.

Z. Zhao, W. Liu, Y. Li, and B. Vucetic are with the School of Electrical
and Information Engineering, The University of Sydney, Australia. Emails:
{zihuai.zhao, wanchun.liu, yonghui.li, branka.vucetic}@sydney.edu.au. D.
Quevedo is with the School of Electrical Engineering and Robotics,
Queensland University of Technology (QUT), Brisbane, Australia. Email:
dquevedo@ieee.org. W. Liu is the corresponding author.

[Figure: block diagram of plants, sensors, actuators, and a controller,
connected by uplink and downlink channels; example tasks shown are navigation,
pick & place, SLAM, and delivery.]
Fig. 1: A wireless networked control system (WNCS).
Although the concept of control-communication co-design in WNCSs was proposed
decades ago (see [1] and references therein), most related works from
different research communities have been built on the separative design
principle. The communications community focuses solely on improving
communications performance, such as data rate, latency, and reliability,
without taking into account the WNCS dynamics or performance [2], [3].
Although ultra-reliable low-latency communications have been proposed in the
5G era for mission-critical control applications, the prevailing design
principle is standalone: it is not tailored to any control application, and
the control performance is not treated as a design objective [4]. On the other
hand, the control community’s effort on WNCSs mainly focuses on control (and
estimation) algorithm design under predetermined communications policies (see
[5] and its follow-up works). Recent works on control-communication co-design
for WNCSs can be categorized into two streams: control-aware communication
design (Stream 1) and control-communication policy co-design (Stream 2).
In Stream 1, communication protocols are optimized either to achieve the best
control performance or to satisfy certain control-related constraints. In [6],
[7], transmission scheduling and power allocation problems of WNCSs were
investigated for achieving the minimum overall transmission power consumption
while guaranteeing certain control performance. In [8], a control-aware
scheduler design problem was considered based on the communications protocol
of IEEE 802.15.4. In [9], a communication protocol with variable packet length
was proposed and optimized for achieving the best control performance. In
[10], a transmission power allocation problem of a WNCS with a coding-free
communication protocol was investigated, aiming to achieve optimal overall
control performance. In [11], a novel framework was developed for jointly
optimizing the communication design parameters to achieve the best control
performance. Note that all those works are restricted to linear dynamical
systems with linear control laws. For remote state estimation of linear WNCSs,
transmission scheduling problems have drawn significant attention. In
[12]–[17], optimal scheduling policies were investigated for various system
setups to minimize average estimation errors.

arXiv:2210.00673v1 [eess.SY] 3 Oct 2022
In Stream 2, both the control and communication policies are jointly optimized
to achieve the best overall control performance. Stream 2 is more challenging
because the joint policy has very large combined state and action spaces when
both the control and communications domains are taken into account. In a
nutshell, most co-design problems can be formulated as dynamic decision-making
problems. However, with such large state and action spaces, conventional
Markov decision process (MDP) solution methods cannot be applied due to the
curse of dimensionality. To address this issue, most works in this stream rely
on deep learning (DL) approaches that use artificial neural networks (NNs) for
function approximation. In [18], a deep reinforcement learning (DRL) approach
was adopted to learn both the control and the transmission scheduling signals.
In particular, DRL combines artificial NNs with a reinforcement learning
framework that helps software agents learn how to solve decision-making
problems and reach their goals. In [19], both the control policy and the
dynamic transmission power allocation policy were jointly optimized based on
DRL. It is worth noting that those DRL-based algorithms are model-free and are
therefore applicable to practical WNCS scenarios in which accurate knowledge
of the nonlinear system (plant) model is not available, whereas the
conventional solutions are purely model-based.
There are still many open problems in the area of control-communications
policy co-design with unknown nonlinear system models. Many existing works,
such as [18], [19], assume that the sensor measurements are perfect and that
the (uplink) communications between the sensor and the controller are
error-free. Under such an assumption, the controller has an accurate plant
state in real time for generating control signals. With a practical uplink
channel, the controller does not always know the plant state and thus needs
state estimation. This requires estimation-control co-design. A key aspect is
that the estimation quality depends significantly on the age of the sensor’s
information available to the estimator, which measures the time elapsed since
the controller last received a packet from the sensor. Due to system dynamics
and uncertainties, a larger age-of-information (AoI) of the sensor indicates a
less reliable state estimate. For real-time control applications, an estimate
with a small AoI is more important than one with a large AoI. Such information
about the data importance needs to be taken into account when training the
controller. We note that the analysis and optimization of AoI in different
communication networks have drawn significant attention during the past five
years [20]. However, how to leverage the AoI of sensor data for effectively
training a controller has not been considered before. Furthermore, when
considering DL-based estimator-control-communication co-design, one needs to
systematically design a joint training algorithm that is efficient in both
time and performance, rather than training the three modules one by one.
Otherwise, the resulting estimation, control, and communication policies may
not converge to the desired ones, leading to poor overall control performance
of the WNCS. Due to the aforementioned difficulties, joint
estimator-control-communication policy learning for WNCSs has not been
investigated in the open literature.
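The AoI notion described in this paragraph can be illustrated with a minimal sketch. It assumes a common slotted-time convention that this section does not spell out: the AoI resets to 1 in the slot after a successful delivery and grows by one otherwise. The function names `update_aoi` and `aoi_trace` and the example delivery pattern are hypothetical and are not taken from the paper.

```python
def update_aoi(aoi: int, delivered: bool) -> int:
    """Return the next-slot AoI under the assumed reset-to-1 convention."""
    return 1 if delivered else aoi + 1


def aoi_trace(deliveries, initial_aoi=1):
    """Return the AoI after each slot for a sequence of delivery outcomes."""
    trace = []
    aoi = initial_aoi
    for delivered in deliveries:
        aoi = update_aoi(aoi, delivered)
        trace.append(aoi)
    return trace


if __name__ == "__main__":
    # Hypothetical outcomes: True = packet received, False = dropped or not sent.
    print(aoi_trace([True, False, False, True, False]))
```

Under this convention, a run of consecutive dropouts shows up directly as a growing AoI, which is the quantity the text associates with a less reliable state estimate.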
In this work, we systematically investigate a DL-based
estimator-control-scheduler co-design framework for a model-unknown WNCS with
nonlinear dynamic systems. We consider fading channels between the sensor and
the controller and between the controller and the actuator. The major
contributions are summarized as follows.
• We propose a novel DL-based WNCS over fading channels with time
  correlations. In particular, the AoI states of the sensor’s information are
  utilized in the three modules of estimator, controller, and scheduler; both
  the controller and the scheduler leverage the fading channel states for
  decision-making. The instantaneous and historical states are utilized in
  each module. Co-design frameworks for WNCSs with the awareness of AoI and
  channel states have not been considered in the open literature.
• We develop a joint estimator-controller-scheduler training algorithm. In
  particular, we propose a DRL-based algorithm for controller and scheduler
  optimization utilizing both the model-free data that are received from the
  sensor directly and the model-based data that are generated by the estimator
  when packet dropout occurs. An AoI-based importance sampling algorithm that
  takes into account the data accuracy is proposed for enhancing learning
  efficiency. Moreover, we develop novel schemes for enhancing the stability
  of joint training.
• Extensive experiments building on the OpenAI Gym platform demonstrate that
  the proposed joint training algorithm can effectively solve the
  estimation-control-scheduling co-design problem in various scenarios.
  Remarkable performance gains have been achieved compared to the separative
  design and some benchmark policies.
Outline: The system model of a general WNCS over fading channels is described
in Section II. The estimation and control co-design problems of a low-mobility
and a high-mobility WNCS are investigated in Sections III and IV,
respectively. The numerical results are presented and discussed in Section V,
followed by conclusions in Section VI.
[Figure: block diagram of the plant, sensor, and actuator connected over static
channels to a remote controller comprising receivers (Rx), an estimator, an
AoI counter, an intelligent controller (actor and critic), and the controller
and estimator histories.]
Fig. 2: Estimator-controller co-design of the low-mobility WNCS.
II. SYSTEM MODEL
A. WNCS Model
We consider a wireless networked control system as shown
in Fig. 2. The plant is a discrete-time nonlinear system as
\[ s_{t+1} = f(s_t, u_t) + \nu_t \tag{1} \]
\[ o_t = s_t + v_t \tag{2} \]
where $s_t \in \mathbb{R}^{n_s}$ and $u_t \in \mathbb{R}^{n_u}$ are the plant
state and the control input from the actuator at time $t$, respectively. In
particular, the nonlinear dynamics
$f: \mathbb{R}^{n_s} \times \mathbb{R}^{n_u} \to \mathbb{R}^{n_s}$
is unknown to the remote controller, and $\nu_t$ is the plant
disturbance. $o_t \in \mathbb{R}^{n_s}$ is the sensor measurement of the plant
state $s_t$ affected by the measurement noise $v_t \in \mathbb{R}^{n_s}$.
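To make the roles of the symbols in (1) and (2) concrete, the following is a minimal simulation sketch. The dynamics `pendulum_like_f`, the Gaussian noise models, and all numerical values are assumptions chosen purely for illustration; the paper treats $f$ as unknown to the controller and does not specify the noise distributions in this section.

```python
import numpy as np


def pendulum_like_f(s, u, dt=0.05):
    """Hypothetical nonlinear dynamics f(s, u); NOT the plant used in the paper."""
    angle, velocity = s
    next_velocity = velocity + dt * (-np.sin(angle) + u[0])
    next_angle = angle + dt * next_velocity
    return np.array([next_angle, next_velocity])


def plant_step(s, u, rng, f=pendulum_like_f, disturbance_std=0.01, noise_std=0.01):
    """One step of (1)-(2): return (s_next, o) for current state s and input u."""
    nu = rng.normal(0.0, disturbance_std, size=s.shape)  # plant disturbance (assumed Gaussian)
    v = rng.normal(0.0, noise_std, size=s.shape)         # measurement noise (assumed Gaussian)
    o = s + v              # (2): o_t = s_t + v_t
    s_next = f(s, u) + nu  # (1): s_{t+1} = f(s_t, u_t) + nu_t
    return s_next, o


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    state = np.array([0.1, 0.0])
    for _ in range(3):
        state, measurement = plant_step(state, np.array([0.0]), rng)
        print(state, measurement)
```

The remote controller only ever sees the measurement `o` (and only when the uplink delivers it), never the state `s` or the function `f`, which is why the later sections require an estimator.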
We model the uplink channel and the downlink channel as
$m$-state Markov fading channels [21]. The channel states of
the uplink and the downlink are denoted by
$b^U_t \in \mathcal{W}^U \triangleq \{w^U_1, \dots, w^U_m\}$ and
$b^D_t \in \mathcal{W}^D \triangleq \{w^D_1, \dots, w^D_m\}$, respectively.
Let $p^U_{i,j}$ and $p^D_{i,j}$ denote the channel state transition
probabilities from state $i$ to $j$ of the uplink channel and the downlink
channel, respectively, i.e.,
\[
\begin{aligned}
p^U_{i,j} &\triangleq \mathrm{Prob}[b^U_{t+1} = w^U_j \mid b^U_t = w^U_i], \\
p^D_{i,j} &\triangleq \mathrm{Prob}[b^D_{t+1} = w^D_j \mid b^D_t = w^D_i].
\end{aligned}
\tag{3}
\]
Then, the channel state transition probability matrices of the
channels are
\[
M^U \triangleq
\begin{bmatrix}
p^U_{1,1} & \cdots & p^U_{m,1} \\
\vdots & \ddots & \vdots \\
p^U_{1,m} & \cdots & p^U_{m,m}
\end{bmatrix}
\tag{4}
\]
and
\[
M^D \triangleq
\begin{bmatrix}
p^D_{1,1} & \cdots & p^D_{m,1} \\
\vdots & \ddots & \vdots \\
p^D_{1,m} & \cdots & p^D_{m,m}
\end{bmatrix}.
\tag{5}
\]
We assume that the instantaneous channel states, i.e., $b^U_t$ and
$b^D_t$, are known to the controller by classical channel estimation
schemes [22], while the dynamic channel models, i.e., $M^U$
and $M^D$, are not available.
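As an illustration of how a channel state evolves under a transition matrix laid out as in (4), the following sketch samples a state trajectory. It is a simulation aid only, since the text assumes the controller does not know the transition matrices. The function names, the 0-based indexing, and the example matrix are hypothetical.

```python
import numpy as np


def next_channel_state(M, i, rng):
    """Sample the next state index given the current index i (0-based).

    M follows the layout of (4)-(5): M[j, i] = p_{i,j}, so column i holds the
    distribution of the next state given current state i.
    """
    M = np.asarray(M, dtype=float)
    if not np.allclose(M.sum(axis=0), 1.0):
        raise ValueError("each column of M must sum to 1")
    return int(rng.choice(M.shape[0], p=M[:, i]))


def simulate_channel(M, i0, steps, rng):
    """Return the list of visited state indices, starting from i0."""
    states = [i0]
    for _ in range(steps):
        states.append(next_channel_state(M, states[-1], rng))
    return states


if __name__ == "__main__":
    # Hypothetical 2-state channel: column 0 = transitions out of state 0.
    M_example = [[0.9, 0.3],
                 [0.1, 0.7]]
    print(simulate_channel(M_example, 0, 10, np.random.default_rng(0)))
```

Note the column-wise reading: because row $j$, column $i$ of (4) holds $p_{i,j}$, it is the columns, not the rows, that sum to one.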
Let the binary variables $\rho^U_t \in \{0,1\}$ and $\rho^D_t \in \{0,1\}$
indicate the transmission outcome of the uplink channel and the downlink
channel at time $t$, respectively, where 0 denotes failure and 1 denotes
success. The packet error probabilities at different channel states are
\[ d^U_i \triangleq \mathrm{Prob}[\rho^U_t = 0 \mid b^U_t = w^U_i], \quad i \in \{1, \dots, m\} \tag{6} \]
and
\[ d^D_i \triangleq \mathrm{Prob}[\rho^D_t = 0 \mid b^D_t = w^D_i], \quad i \in \{1, \dots, m\}. \tag{7} \]
We assume that the actuator sends the one-bit acknowledgment
$\rho^D_t$ to the controller via a perfect feedback
channel. This is a widely adopted assumption in wireless
communications.
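The state-dependent packet error probabilities in (6) and (7) can be sketched as a Bernoulli draw per transmission. The function names and the example probabilities below are hypothetical; only the convention that 0 means failure and 1 means success comes from the text.

```python
import numpy as np


def transmit(d, i, rng):
    """Return rho in {0, 1} (0 = failure, 1 = success) for channel state index i.

    d[i] plays the role of the packet error probability d_i in (6)-(7),
    with 0-based indexing.
    """
    return 0 if rng.random() < d[i] else 1


def empirical_error_rate(d, i, trials, rng):
    """Estimate the packet error probability in state i by repeated sampling."""
    failures = sum(1 - transmit(d, i, rng) for _ in range(trials))
    return failures / trials


if __name__ == "__main__":
    d_example = [0.05, 0.4]  # hypothetical: good state, bad state
    rng = np.random.default_rng(0)
    print(empirical_error_rate(d_example, 1, 10_000, rng))
```

In a full simulation, the index `i` would come from the Markov channel state at time $t$, so dropouts become correlated in time through the channel.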
B. Control and transmission schedule
We consider both a low-mobility scenario (e.g., process control systems in
factories) and a high-mobility scenario (e.g., unmanned aerial vehicles) of
the WNCS. In the former scenario, the channel coherence time is much longer
than each control time slot, and thus the channel state is static. Therefore,
the Markov fading channels in (4) and (5) reduce to additive white Gaussian
noise (AWGN) channels with constant channel states (i.e., $m = 1$ in (6) and
(7)). We note that the system is still stochastic in this scenario. Due to the
low mobility, sensors can often be connected to the power grid, so the
transmission power consumption is not a major concern. In the latter scenario,
sensors are commonly powered by batteries. Because battery replacement is
costly, it is of significant interest to reduce the uplink transmission rate
while guaranteeing a desired level of control quality. Therefore, an uplink
transmission scheduler implemented at the controller schedules the sensor’s
transmissions only when necessary. Let $a^{\mathrm{Tx}}_t \in \{0,1\}$ denote
the scheduling action and $S(\cdot)$ denote the scheduling function mapping
from input states to $a^{\mathrm{Tx}}_t$. We will discuss the input states in
the following section.