Enhanced CNN with Global Features for Fault Diagnosis of Complex
Chemical Processes
Qiugang Lu a,1 and Saif S. S. Al-Wahaibi a
a Department of Chemical Engineering, Texas Tech University, Lubbock, TX 79409, USA
Abstract
Convolutional neural network (CNN) models have been widely used for fault diagnosis of complex systems.
However, traditional CNN models rely on small kernel filters to obtain local features from images. Thus, an excessively
deep CNN is required to capture global features, which are critical for fault diagnosis of dynamical systems. In this
work, we present an improved CNN that embeds global features (GF-CNN). Our method uses a multi-layer perceptron
(MLP) for dimension reduction to directly extract global features and integrate them into the CNN. The advantage of
this method is that both local and global patterns in images can be captured by a simple model architecture instead of
establishing deep CNN models. The proposed method is applied to the fault diagnosis of the Tennessee Eastman process.
Simulation results show that the GF-CNN can significantly improve the fault diagnosis performance compared to
traditional CNN. The proposed method can also be applied to other areas such as computer vision and image processing.
Keywords
Convolutional neural network, Global features, Fault diagnosis, Chemical process.
1. Introduction
Deep learning methods have prevailed in various fault diagnosis applications due to their effectiveness in automatic feature extraction directly from raw data Zhang et al. (2020). Research has been reported on using deep learning
methods, e.g., deep belief network Shao et al. (2015), recur-
rent neural network Jiang et al. (2018), auto-encoders Zheng
and Zhao (2020), and convolutional neural networks (CNN)
Wen et al. (2017). Exemplary applications include fault di-
agnosis of wind turbine gearbox Jing et al. (2017), rotating
machine bearings Zhao et al. (2019), and complex chemical
processes Huang et al. (2022). Such deep learning models can extract abstract features from the data and capture nonlinearity, thereby exhibiting performance superior to traditional machine learning methods Wen et al. (2017).
Among the variety of deep learning methods that have
been explored, CNN has attracted wide attention due to its
excellent performance in complex feature learning and classification Zhang et al. (2017). CNN was first used by Janssens et al. (2016) for fault detection of rotating machinery. Further advancements include Wen et al. (2017); Zhang
et al. (2018) and the references therein, where a number of
strategies were proposed for converting time-series data into
images. To capture features at different levels, multi-scale CNNs have been put forward that use different kernel sizes, e.g., Chen et al. (2021). Along this line, wide
kernels have been employed as the first few layers of CNN, followed by small kernels to improve the feature representation Zhang et al. (2017); van den Hoogen et al. (2020); Song et al. (2022). However, these reported studies mainly focus on the fault diagnosis of rotating machinery, where the number of variables is small. Studies on CNNs and their variants for fault diagnosis of complex chemical processes remain limited Wu and Zhao (2018).

1 Corresponding author: Q. Lu (E-mail: jay.lu@ttu.edu).
Complex chemical processes are characterized by high dimensionality, with strong spatial and temporal correlations among variables Lu et al. (2019). CNN model-based fault diagnosis for chemical processes has only been preliminarily attempted
Wu and Zhao (2018); Huang et al. (2022). The obtained diagnosis performance is still far from satisfactory for real-world applications Shao et al. (2019). Moreover, existing research mainly focuses on extracting local features from process data, despite the efforts on multi-scale CNN to enlarge
the receptive field Song and Jiang (2022). For chemical pro-
cesses, global features are also critical due to the complex in-
terconnection of multiple units and thus the widespread cou-
pling between process variables. Specifically, when forming
images from multivariate time-series data, variables that are
far apart may also possess strong correlations (e.g., see Fig.
3 below). Traditional CNN methods, including multi-scale
CNN, cannot directly capture global features and often re-
quire deep layers to expand the receptive field to the entire
image Song and Jiang (2022). This motivates us to develop a
novel CNN architecture that can extract both global and local
features in images to improve the fault diagnosis rate.
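The signal-to-image conversion mentioned above (illustrated in Fig. 1) can be sketched in a few lines. The normalization scheme below, per-variable min-max scaling of a time window to gray levels in [0, 255], is one common choice and an assumption for illustration, since the text does not fix a specific formula:

```python
import numpy as np

def series_to_image(window):
    """Convert a (time x variable) window of multivariate time-series
    data into a gray image via per-variable min-max normalization."""
    w = np.asarray(window, dtype=float)
    mins = w.min(axis=0)
    rng = np.ptp(w, axis=0)          # per-variable range (max - min)
    rng[rng == 0] = 1.0              # guard against constant variables
    scaled = (w - mins) / rng        # each variable now in [0, 1]
    return np.round(scaled * 255).astype(np.uint8)

# Example: a window of 50 time steps over 40 process variables
# becomes a 50 x 40 gray image, one column per variable.
window = np.random.default_rng(0).normal(size=(50, 40))
image = series_to_image(window)
```

Each column of the resulting image is one process variable over time, so correlations between variables that are far apart in the sensor ordering appear as long-range (global) patterns across the image width.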
arXiv:2210.01727v1 [eess.SY] 4 Oct 2022

In this work, we present a novel global feature-enhanced CNN (GF-CNN) for the fault diagnosis of complex chemical processes. Specifically, in parallel with the convolutional
and pooling layers in CNN, we employ a multi-layer per-
ceptron (MLP) network for dimension reduction. That is,
the MLP directly maps the vectorized input images to a low-
dimensional feature vector, which is then concatenated with
the first fully-connected layer of the CNN. Similar CNN-
MLP architecture has been employed in Sinitsin et al. (2022);
Ahsan et al. (2020). However, those works use CNN and MLP to handle multiple input data types (e.g., images and numerical data), rather than using MLP for dimension reduction to help the CNN capture global features.
Moreover, those works focus on bearing fault diagnosis and
COVID-19 detection, in contrast to our work on fault diag-
nosis of complex chemical processes.
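The parallel arrangement described above can be sketched at the tensor level. All layer sizes below are illustrative assumptions, not the authors' actual hyperparameters, and the convolutional/pooling branch is abstracted to a fixed random projection standing in for the stacked conv layers:

```python
import numpy as np

rng = np.random.default_rng(1)
relu = lambda z: np.maximum(z, 0.0)

# Illustrative sizes (assumptions): a 50 x 40 input image,
# 256 flattened local features from the conv/pool branch,
# 32 global features from the MLP branch, 10 fault classes.
img = rng.normal(size=(50, 40))
x = img.ravel()                          # vectorized input, 2000-dim

# MLP branch: direct dimension reduction to a global feature vector
W_g = 0.01 * rng.normal(size=(32, x.size))
global_feat = relu(W_g @ x)

# Conv/pool branch placeholder: in the real model this is the stack
# of convolutional and pooling layers producing local features.
local_feat = relu(0.01 * rng.normal(size=(256, x.size)) @ x)

# Both branches are concatenated at the first fully-connected layer
fc_in = np.concatenate([local_feat, global_feat])    # 288-dim
logits = 0.01 * rng.normal(size=(10, fc_in.size)) @ fc_in
probs = np.exp(logits - logits.max())
probs /= probs.sum()                     # softmax over 10 classes
```

The key design point is the concatenation step: the low-dimensional MLP output sees the whole vectorized image at once, so global correlations reach the classifier without requiring a deep convolutional stack.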
The outline of this paper is as follows. Section 2 presents
the fundamentals of CNN relevant to this work. Section 3
describes the proposed GF-CNN in detail, followed by a case
study on the Tennessee Eastman process in Section 4. The
conclusion is given in Section 5.
2. Fundamentals of CNN
For a typical CNN, convolution and pooling are the two critical operations. In addition, dropout is often employed as a regularization technique to prevent overfitting.
2.1 Convolutional Layers
For typical convolutional layers, consider a generic 2D feature map $x_i^l \in \mathbb{R}^{h \times d}$ on layer $l \in \{1,\dots,L\}$ and channel $i \in \{1,\dots,N_l\}$, where $h$ and $d$ are the height and width of the feature map. The output after a convolutional kernel $k_{i,j}^l$ is given by Zhao et al. (2019)

$$x_j^{l+1} = f\left(\sum_{i=1}^{N_l} k_{i,j}^l * x_i^l + b_j^l\right), \quad j = 1,\dots,N_{l+1}, \tag{1}$$

where $x_j^{l+1}$ is the $j$-th feature map in layer $l+1$, $*$ is the convolution operation, and $N_{l+1}$ is the number of kernel filters, i.e., the number of channels, in layer $l+1$. Commonly used activation functions $f(\cdot)$ include the ReLU, leaky ReLU, and sigmoid functions. In this work, we use the ReLU function as the nonlinear activation. When conducting the convolution, a kernel (often square) slides through the entire feature map with a stride $s$; with each stride, the convolution operation above is carried out. In addition, if needed, zero padding can be added to the input map to preserve its dimension in the new feature maps.
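A direct, unoptimized NumPy rendering of the channel-summed operation in Eq. (1), written for "valid" output with stride 1 (and, as in standard CNN libraries, implemented as cross-correlation). The sizes are illustrative:

```python
import numpy as np

def conv_layer(x, k, b):
    """Eq. (1): x has shape (N_l, h, d); k has shape
    (N_l, N_lp1, kh, kw); b has shape (N_lp1,).
    Returns the N_{l+1} output maps after the ReLU f(.)."""
    N_l, h, d = x.shape
    _, N_lp1, kh, kw = k.shape
    out = np.zeros((N_lp1, h - kh + 1, d - kw + 1))
    for j in range(N_lp1):               # each output channel j
        for i in range(N_l):             # sum over input channels i
            for r in range(out.shape[1]):
                for c in range(out.shape[2]):
                    out[j, r, c] += np.sum(k[i, j] * x[i, r:r+kh, c:c+kw])
        out[j] += b[j]                   # per-channel bias b_j
    return np.maximum(out, 0.0)          # ReLU activation

rng = np.random.default_rng(0)
x = rng.normal(size=(2, 6, 6))           # N_l = 2 input maps of 6 x 6
k = rng.normal(size=(2, 3, 3, 3))        # 3 output channels, 3 x 3 kernels
y = conv_layer(x, k, np.zeros(3))        # y has shape (3, 4, 4)
```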
2.2 Pooling Layers
Pooling is an operation often applied after convolutional layers. Its purpose is to extract the dominant features in local regions of the feature maps produced by convolution, so that unnecessary details or noise can be filtered out. Moreover, the dimension of the feature maps is significantly reduced after pooling, as are the computation time and the number of parameters in the network. Common pooling techniques include
Figure 1: Illustration of signal-to-image conversion. Top: plots of process variables over time; Bottom: the obtained gray images after conversion.
average pooling, weighted pooling, and max pooling. In this work, we adopt the max pooling technique Jing et al. (2017)

$$P_j^{l+1} = \max_{x_j^{l+1} \in S} x_j^{l+1}, \tag{2}$$

where $S$ is the pooling block and $P_j^{l+1}$ is the output of the $j$-th feature map in the $(l+1)$-th layer after pooling. The dimension of the feature map is then reduced $S$ times.
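Eq. (2) with a non-overlapping $S \times S$ block can be sketched as follows ($S = 2$ here for illustration):

```python
import numpy as np

def max_pool(x, S=2):
    """Eq. (2): non-overlapping S x S max pooling of one feature
    map; each spatial dimension shrinks S times."""
    h, d = x.shape
    h2, d2 = h // S, d // S
    # Group the map into (h2, S, d2, S) blocks, then take the
    # maximum over each S x S block.
    blocks = x[:h2 * S, :d2 * S].reshape(h2, S, d2, S)
    return blocks.max(axis=(1, 3))

x = np.arange(16.0).reshape(4, 4)
p = max_pool(x, S=2)
# p picks the maximum of each 2 x 2 block:
# [[ 5.,  7.],
#  [13., 15.]]
```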
2.3 Fully-connected (FC) Layers
The FC layers are located after all convolutional and pooling layers, serving to classify the extracted features from images. Prior to the FC layers, the obtained feature maps are flattened into vectors. For the FC part, the output of the $j$-th neuron in the $l$-th layer is

$$x_j^l = f\left(\sum_{i=1}^{M_{l-1}} x_i^{l-1} w_{i,j}^l + b_j^l\right), \quad j = 1,\dots,M_l, \tag{3}$$

where $M_{l-1}$ and $M_l$ are the input and output dimensions of the $l$-th layer, respectively, and $w_{i,j}^l$ is the weight parameter. The output from the last layer of the FC network is then passed into the softmax function to generate the probability of the input image belonging to each class.
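Eq. (3) together with the softmax output can be sketched in NumPy; the layer sizes (128 flattened features, one 64-unit hidden FC layer, 10 classes) are arbitrary example dimensions:

```python
import numpy as np

def fc_layer(x_prev, W, b, f=lambda z: np.maximum(z, 0.0)):
    """Eq. (3): x_prev is the (M_{l-1},) input vector, W the
    (M_l, M_{l-1}) weight matrix, b the (M_l,) bias; f defaults
    to ReLU."""
    return f(W @ x_prev + b)

def softmax(z):
    """Class probabilities from the last FC layer's output."""
    e = np.exp(z - z.max())      # shift by max for numerical stability
    return e / e.sum()

rng = np.random.default_rng(0)
x = rng.normal(size=(128,))                       # flattened features
h = fc_layer(x, rng.normal(size=(64, 128)), np.zeros(64))
logits = fc_layer(h, rng.normal(size=(10, 64)), np.zeros(10),
                  f=lambda z: z)                  # linear output layer
probs = softmax(logits)                           # one probability per class
```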
2.4 Dropout
To prevent overfitting, dropout is a common technique used in CNN. When dropout is appended to a layer, the output of each neuron is set to zero with a given probability. For instance, a dropout probability $p_{dp} = 0.5$ means that each neuron's output value is set to 0 with probability 0.5. The