
Attitude Control of Highly Maneuverable
Aircraft Using an Improved Q-learning
Mohsen Zahmatkesh ∗ Seyyed Ali Emami ∗
Afshin Banazadeh ∗ Paolo Castaldi ∗∗
∗ Aerospace Engineering Department, Sharif University of Technology,
Tehran, Iran (e-mail: banazadeh@sharif.edu).
∗∗ Department of Electrical, Electronic and Information Engineering
"Guglielmo Marconi", University of Bologna, Via Dell'Università 50,
Cesena, Italy (e-mail: paolo.castaldi@unibo.it)
Abstract: Attitude control of a novel regional truss-braced wing aircraft with low stability
characteristics is addressed in this paper using Reinforcement Learning (RL). In recent years,
RL has been increasingly employed in challenging applications, particularly autonomous
flight control. However, a significant obstacle for discrete RL algorithms is the dimension
limitation of the state-action table, along with the difficulty of defining the elements of the
RL environment. To address these issues, a detailed mathematical model of the
aforementioned aircraft is first developed to serve as the RL environment. Subsequently,
Q-learning, the most prevalent discrete RL algorithm, is implemented in both the Markov
Decision Process (MDP) and Partially Observable Markov Decision Process (POMDP)
frameworks to control the longitudinal mode of the air vehicle. To eliminate the residual
fluctuations caused by discrete action selection, while simultaneously tracking variable pitch
angles, a Fuzzy Action Assignment (FAA) method is proposed that generates continuous
control commands from the trained Q-table. Accordingly, it is shown that, by defining an
accurate reward function and observing all crucial states (which is equivalent to satisfying the
Markov property), the performance of the proposed control system surpasses that of a
well-tuned Proportional-Integral-Derivative (PID) controller.
Keywords: Reinforcement Learning, Q-learning, Fuzzy Q-learning, Attitude Control,
Truss-braced Wing, Flight Control
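Although the paper's full training setup appears in later sections, a minimal sketch of the
tabular Q-learning update underlying the approach may help fix ideas. This is illustrative
Python, not the authors' code; the discretization sizes and hyperparameters (n_states,
n_actions, alpha, gamma, eps) are placeholder assumptions:

    import numpy as np

    # Placeholder discretization sizes and hyperparameters (not from the paper)
    n_states, n_actions = 500, 7
    alpha, gamma, eps = 0.1, 0.99, 0.1
    Q = np.zeros((n_states, n_actions))

    def epsilon_greedy(s):
        # Explore with probability eps; otherwise exploit the current Q-table
        if np.random.rand() < eps:
            return np.random.randint(n_actions)
        return int(np.argmax(Q[s]))

    def q_update(s, a, r, s_next):
        # Standard temporal-difference Q-learning update
        td_target = r + gamma * np.max(Q[s_next])
        Q[s, a] += alpha * (td_target - Q[s, a])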
1. INTRODUCTION
The aviation industry is growing rapidly, driven by world demands such as reducing fuel
burn, emissions, and cost, as well as providing faster and safer flight. This motivates the
advent of new airplanes with novel configurations. In addition, scope clause agreements
limit the number of seats per aircraft and restrict flight outsourcing in order to protect
union pilot jobs. This has led to increased production of Modern Regional Jet (MRJ)
airplanes. In this context, safe flight becomes even more vital given increasingly crowded
airspace and new aircraft configurations capable of flying faster. The truss-braced wing
aircraft is one of the revived high-performance configurations, and it has attracted
significant attention from both academia (Li et al., 2022) and industry (Sarode, 2022) due
to its fuel-burn efficiency. As a result, there is a growing need for reliable modeling and
simulation, handling-quality analysis, and stability analysis for such configurations (Nguyen
and Xiong, 2022; Zavaree et al., 2021), while very few studies have addressed flight control
design for this aircraft.
Over the last decades, various classical methods for aircraft attitude control have been
developed to enhance control performance. However, the most significant deficiency of
these approaches is their limited capability to deal with unexpected flight conditions, while
they typically require a detailed dynamic model of the system.
Recently, the application of Reinforcement Learning (RL) has been extended to real-world
problems, particularly flight control design (Emami et al., 2022). Generally, there are two
main frameworks for incorporating RL in the control design process, namely high-level and
low-level control systems. In Xi et al. (2022), a Soft Actor-Critic (SAC) algorithm was
applied to a path planning problem for a long-endurance solar-powered UAV with
energy-consumption considerations. Another work (Bøhn et al., 2021) concentrated on the
inner-loop control of a Skywalker X8 using SAC and compared it with a PID controller. In
Yang et al. (2020), an ANN-based Q-learning horizontal trajectory tracking controller was
developed based on the MDP model of an airship with favorable stability characteristics. In
contrast, Proximal Policy Optimization (PPO) was utilized in Hu et al. (2022) for attitude
control of a conventional fixed-wing aircraft with strong dynamic coupling in stall
conditions; the PPO agent converged after 100,000 episodes. It is also worth noting that
PPO performance is adequate for optimizing PID controllers (Dammen, 2022).
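Complementing the abstract's description, one plausible sketch of the Fuzzy Action
Assignment (FAA) idea is given below: after training, the discrete actions are blended
according to their Q-values at the current state, yielding a continuous command instead of
switching between neighboring discrete deflections. The softmax-style weighting and the
elevator action grid are assumptions for illustration; the paper's actual membership
functions are not specified in this excerpt:

    import numpy as np

    # Hypothetical discrete elevator deflections in radians;
    # the paper's action set may differ
    elevator_actions = np.linspace(-0.35, 0.35, 7)

    def faa_command(q_row, beta=5.0):
        # Softmax-style fuzzy weights over the Q-values of the current state;
        # a larger beta approaches the purely greedy (discrete) choice
        w = np.exp(beta * (q_row - q_row.max()))
        w /= w.sum()
        # Continuous command: Q-weighted average of the discrete actions
        return float(np.dot(w, elevator_actions))

Given a trained table Q and a discretized state s, u = faa_command(Q[s]) produces a
smooth elevator command, which is the fluctuation-suppression effect the abstract
attributes to FAA.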