Enhanced CNN with Global Features for Fault Diagnosis of Complex
Chemical Processes
Qiugang Lu a,1 and Saif S. S. Al-Wahaibi a
a Department of Chemical Engineering, Texas Tech University, Lubbock, TX 79409, USA
Abstract
Convolutional neural network (CNN) models have been widely used for fault diagnosis of complex systems.
However, traditional CNN models rely on small kernel filters to obtain local features from images. Thus, an excessively
deep CNN is required to capture global features, which are critical for fault diagnosis of dynamical systems. In this
work, we present an improved CNN that embeds global features (GF-CNN). Our method uses a multi-layer perceptron
(MLP) for dimension reduction to directly extract global features and integrate them into the CNN. The advantage of
this method is that both local and global patterns in images can be captured by a simple model architecture instead of
establishing deep CNN models. The proposed method is applied to the fault diagnosis of the Tennessee Eastman process.
Simulation results show that the GF-CNN can significantly improve the fault diagnosis performance compared to
traditional CNN. The proposed method can also be applied to other areas such as computer vision and image processing.
Keywords
Convolutional neural network, Global features, Fault diagnosis, Chemical process.
1. Introduction
Deep learning methods have prevailed in various fault diagnosis applications due to their effectiveness in automatic feature extraction directly from raw data Zhang et al. (2020). Research has been reported on using deep learning
methods, e.g., deep belief network Shao et al. (2015), recur-
rent neural network Jiang et al. (2018), auto-encoders Zheng
and Zhao (2020), and convolutional neural networks (CNN)
Wen et al. (2017). Exemplary applications include fault di-
agnosis of wind turbine gearbox Jing et al. (2017), rotating
machine bearings Zhao et al. (2019), and complex chemical
processes Huang et al. (2022). Such deep learning models can extract abstract features from the data and capture nonlinearity, thereby exhibiting performance superior to traditional machine learning methods Wen et al. (2017).
Among the variety of deep learning methods that have
been explored, CNN has attracted wide attention due to its
excellent performance in complex feature learning and classification Zhang et al. (2017). CNN was first used by Janssens et al. (2016) for fault detection of rotating machinery. Further advancements include Wen et al. (2017); Zhang
et al. (2018) and the references therein, where a number of
strategies were proposed for converting time-series data into
images. To capture features at different levels, multi-scale CNNs have been put forward that use different kernel sizes, e.g., Chen et al. (2021). Along this line, wide
kernels have been employed as the first few layers of CNN, followed by small kernels to improve the feature representation Zhang et al. (2017); van den Hoogen et al. (2020); Song et al. (2022). However, these reported studies mainly focus on the fault diagnosis of rotating machinery, where the number of variables is small. Studies on CNNs and their variants for fault diagnosis of complex chemical processes remain limited Wu and Zhao (2018).

1 Corresponding author: Q. Lu (E-mail: jay.lu@ttu.edu).
Complex chemical processes are characterized by high dimensionality, with strong spatial and temporal correlations among variables Lu et al. (2019). CNN model-based fault diagnosis for chemical processes has only been preliminarily attempted
Wu and Zhao (2018); Huang et al. (2022). The obtained diagnosis performance is still far from satisfactory for real-world applications Shao et al. (2019). Moreover, existing research mainly focuses on extracting local features from process data, despite the efforts on multi-scale CNN to enlarge
the receptive field Song and Jiang (2022). For chemical pro-
cesses, global features are also critical due to the complex in-
terconnection of multiple units and thus the widespread cou-
pling between process variables. Specifically, when forming
images from multivariate time-series data, variables that are
far apart may also possess strong correlations (e.g., see Fig.
3 below). Traditional CNN methods, including multi-scale
CNN, cannot directly capture global features and often re-
quire deep layers to expand the receptive field to the entire
image Song and Jiang (2022). This motivates us to develop a
novel CNN architecture that can extract both global and local
features in images to improve the fault diagnosis rate.
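The signal-to-image conversion mentioned above (illustrated in Fig. 1) can be sketched in a few lines. The normalization scheme below, per-variable min-max scaling of a time window to gray levels in [0, 255], is one common choice and an assumption for illustration, since the text does not fix a specific formula:

```python
import numpy as np

def series_to_image(window):
    """Convert a (time x variable) window of multivariate time-series
    data into a gray image via per-variable min-max normalization."""
    w = np.asarray(window, dtype=float)
    mins = w.min(axis=0)
    rng = np.ptp(w, axis=0)          # per-variable range (max - min)
    rng[rng == 0] = 1.0              # guard against constant variables
    scaled = (w - mins) / rng        # each variable now in [0, 1]
    return np.round(scaled * 255).astype(np.uint8)

# Example: a window of 50 time steps over 40 process variables
# becomes a 50 x 40 gray image, one column per variable.
window = np.random.default_rng(0).normal(size=(50, 40))
image = series_to_image(window)
```

Each column of the resulting image is one process variable over time, so correlations between variables that are far apart in the sensor ordering appear as long-range (global) patterns across the image width.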
arXiv:2210.01727v1 [eess.SY] 4 Oct 2022

In this work, we present a novel global feature-enhanced CNN (GF-CNN) for the fault diagnosis of complex chemical processes. Specifically, in parallel with the convolutional
and pooling layers in CNN, we employ a multi-layer per-
ceptron (MLP) network for dimension reduction. That is,
the MLP directly maps the vectorized input images to a low-
dimensional feature vector, which is then concatenated with
the first fully-connected layer of the CNN. Similar CNN-
MLP architecture has been employed in Sinitsin et al. (2022);
Ahsan et al. (2020). However, those works use CNN and MLP to handle multiple input data types (e.g., images and numerical data), rather than using MLP for dimension reduction to help the CNN capture global features.
Moreover, those works focus on bearing fault diagnosis and
COVID-19 detection, in contrast to our work on fault diag-
nosis of complex chemical processes.
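The parallel arrangement described above can be sketched at the tensor level. All layer sizes below are illustrative assumptions, not the authors' actual hyperparameters, and the convolutional/pooling branch is abstracted to a fixed random projection standing in for the stacked conv layers:

```python
import numpy as np

rng = np.random.default_rng(1)
relu = lambda z: np.maximum(z, 0.0)

# Illustrative sizes (assumptions): a 50 x 40 input image,
# 256 flattened local features from the conv/pool branch,
# 32 global features from the MLP branch, 10 fault classes.
img = rng.normal(size=(50, 40))
x = img.ravel()                          # vectorized input, 2000-dim

# MLP branch: direct dimension reduction to a global feature vector
W_g = 0.01 * rng.normal(size=(32, x.size))
global_feat = relu(W_g @ x)

# Conv/pool branch placeholder: in the real model this is the stack
# of convolutional and pooling layers producing local features.
local_feat = relu(0.01 * rng.normal(size=(256, x.size)) @ x)

# Both branches are concatenated at the first fully-connected layer
fc_in = np.concatenate([local_feat, global_feat])    # 288-dim
logits = 0.01 * rng.normal(size=(10, fc_in.size)) @ fc_in
probs = np.exp(logits - logits.max())
probs /= probs.sum()                     # softmax over 10 classes
```

The key design point is the concatenation step: the low-dimensional MLP output sees the whole vectorized image at once, so global correlations reach the classifier without requiring a deep convolutional stack.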
The outline of this paper is as follows. Section 2 presents
the fundamentals of CNN relevant to this work. Section 3
describes the proposed GF-CNN in detail, followed by a case
study on the Tennessee Eastman process in Section 4. The
conclusion is given in Section 5.
2. Fundamentals of CNN
For a typical CNN, convolution and pooling are the two critical operations. In addition, dropout is often employed as a regularization technique to prevent overfitting.
2.1 Convolutional Layers
For typical convolutional layers, consider a generic 2D feature map $x_i^l \in \mathbb{R}^{h \times d}$ on layer $l \in \{1,\dots,L\}$ and channel $i \in \{1,\dots,N_l\}$, where $h$ and $d$ are the height and width of the feature map. The output after a convolutional kernel $k_{i,j}^l$ is given by Zhao et al. (2019)

$$x_j^{l+1} = f\left(\sum_{i=1}^{N_l} k_{i,j}^l * x_i^l + b_j^l\right), \quad j = 1,\dots,N_{l+1}, \tag{1}$$

where $x_j^{l+1}$ is the $j$-th feature map in layer $l+1$, $*$ is the convolution operation, and $N_{l+1}$ is the number of kernel filters, i.e., the number of channels, in layer $l+1$. Commonly used activation functions $f(\cdot)$ include the ReLU, leaky ReLU, and sigmoid functions. In this work, we use the ReLU function as the nonlinear activation. When conducting the convolution, a kernel (often square) slides through the entire feature map with a stride $s$; with each stride, the convolution operation above is carried out. In addition, if needed, zero padding can be added to the input map to preserve its dimension in the new feature maps.
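A direct, unoptimized NumPy rendering of the channel-summed operation in Eq. (1), written for "valid" output with stride 1 (and, as in standard CNN libraries, implemented as cross-correlation). The sizes are illustrative:

```python
import numpy as np

def conv_layer(x, k, b):
    """Eq. (1): x has shape (N_l, h, d); k has shape
    (N_l, N_lp1, kh, kw); b has shape (N_lp1,).
    Returns the N_{l+1} output maps after the ReLU f(.)."""
    N_l, h, d = x.shape
    _, N_lp1, kh, kw = k.shape
    out = np.zeros((N_lp1, h - kh + 1, d - kw + 1))
    for j in range(N_lp1):               # each output channel j
        for i in range(N_l):             # sum over input channels i
            for r in range(out.shape[1]):
                for c in range(out.shape[2]):
                    out[j, r, c] += np.sum(k[i, j] * x[i, r:r+kh, c:c+kw])
        out[j] += b[j]                   # per-channel bias b_j
    return np.maximum(out, 0.0)          # ReLU activation

rng = np.random.default_rng(0)
x = rng.normal(size=(2, 6, 6))           # N_l = 2 input maps of 6 x 6
k = rng.normal(size=(2, 3, 3, 3))        # 3 output channels, 3 x 3 kernels
y = conv_layer(x, k, np.zeros(3))        # y has shape (3, 4, 4)
```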
2.2 Pooling Layers
Pooling is an operation often applied after convolutional layers. Its purpose is to extract the dominant features in local regions of the feature maps produced by convolution, so that unnecessary details or noise can be filtered out. Moreover, the dimension of the feature maps is significantly reduced after pooling, as are the computation time and the number of parameters in the network. Common pooling techniques include
Figure 1: Illustration of signal-to-image conversion. Top: plots of process variables over time; Bottom: the obtained gray images after conversion.
average pooling, weighted pooling, and max pooling. In this work, we adopt the max pooling technique Jing et al. (2017)

$$P_j^{l+1} = \max_{x_j^{l+1} \in S} x_j^{l+1}, \tag{2}$$

where $S$ is the pooling block and $P_j^{l+1}$ is the output of the $j$-th feature map in the $(l+1)$-th layer after pooling. The dimension of the feature map is then reduced $S$ times.
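Eq. (2) with a non-overlapping $S \times S$ block can be sketched as follows ($S = 2$ here for illustration):

```python
import numpy as np

def max_pool(x, S=2):
    """Eq. (2): non-overlapping S x S max pooling of one feature
    map; each spatial dimension shrinks S times."""
    h, d = x.shape
    h2, d2 = h // S, d // S
    # Group the map into (h2, S, d2, S) blocks, then take the
    # maximum over each S x S block.
    blocks = x[:h2 * S, :d2 * S].reshape(h2, S, d2, S)
    return blocks.max(axis=(1, 3))

x = np.arange(16.0).reshape(4, 4)
p = max_pool(x, S=2)
# p picks the maximum of each 2 x 2 block:
# [[ 5.,  7.],
#  [13., 15.]]
```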
2.3 Fully-connected (FC) Layers
The FC layers are located after all convolutional and pooling layers, serving to classify the extracted features from images. Prior to the FC layers, the obtained feature maps are flattened into vectors. For the FC part, the output of the $j$-th neuron in the $l$-th layer is

$$x_j^l = f\left(\sum_{i=1}^{M_{l-1}} x_i^{l-1} w_{i,j}^l + b_j^l\right), \quad j = 1,\dots,M_l, \tag{3}$$

where $M_{l-1}$ and $M_l$ are the input and output dimensions of the $l$-th layer, respectively, and $w_{i,j}^l$ is the weight parameter. The output from the last layer of the FC network is then passed into the softmax function to generate the probability of the input image belonging to each class.
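Eq. (3) together with the softmax output can be sketched in NumPy; the layer sizes (128 flattened features, one 64-unit hidden FC layer, 10 classes) are arbitrary example dimensions:

```python
import numpy as np

def fc_layer(x_prev, W, b, f=lambda z: np.maximum(z, 0.0)):
    """Eq. (3): x_prev is the (M_{l-1},) input vector, W the
    (M_l, M_{l-1}) weight matrix, b the (M_l,) bias; f defaults
    to ReLU."""
    return f(W @ x_prev + b)

def softmax(z):
    """Class probabilities from the last FC layer's output."""
    e = np.exp(z - z.max())      # shift by max for numerical stability
    return e / e.sum()

rng = np.random.default_rng(0)
x = rng.normal(size=(128,))                       # flattened features
h = fc_layer(x, rng.normal(size=(64, 128)), np.zeros(64))
logits = fc_layer(h, rng.normal(size=(10, 64)), np.zeros(10),
                  f=lambda z: z)                  # linear output layer
probs = softmax(logits)                           # one probability per class
```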
2.4 Dropout
To prevent overfitting, dropout is a common technique used in CNN. When dropout is appended to a layer, the output of each neuron is set to zero with a given probability. For instance, a dropout probability $p_{dp} = 0.5$ means that each neuron's output value is set to 0 with probability 0.5. The