Improving Convolutional Neural Networks for Fault Diagnosis by Assimilating Global Features
Saif S. S. Al-Wahaibi1 and Qiugang Lu1

Abstract: Deep learning techniques have become prominent
in modern fault diagnosis for complex processes. In particular,
convolutional neural networks (CNNs) have shown an appealing
capacity to deal with multivariate time-series data by con-
verting them into images. However, existing CNN techniques
mainly focus on capturing local or multi-scale features from
input images. A deep CNN is often required to indirectly
extract global features, which are critical to describe the
images converted from multivariate dynamical data. This paper
proposes a novel local-global CNN (LG-CNN) architecture
that directly accounts for both local and global features for
fault diagnosis. Specifically, the local features are acquired by
traditional local kernels whereas global features are extracted
by using 1D tall and fat kernels that span the entire height
and width of the image. Both local and global features are
then merged for classification using fully-connected layers. The
proposed LG-CNN is validated on the benchmark Tennessee
Eastman process (TEP) dataset. Comparison with traditional
CNN shows that the proposed LG-CNN can greatly improve the
fault diagnosis performance without significantly increasing the
model complexity. This is attributed to the much wider local
receptive field created by the LG-CNN than that by CNN. The
proposed LG-CNN architecture can be easily extended to other
image processing and computer vision tasks.
I. INTRODUCTION
Deep learning (DL) has attracted increasing attention for
fault detection and diagnosis (FDD) over the last decade.
Primarily, the strength of DL lies in its ability to utilize the
extensive data present in industrial systems to establish com-
plex models for distinguishing anomalies, diagnosing faults,
and forecasting without needing much prior knowledge [1].
Among various DL methods for FDD, convolutional neural
networks (CNNs) have shown great promise due to their effi-
ciency in capturing spatiotemporal correlations and reduced
trainable parameters from weight sharing [2].
Originally developed for image classification, CNNs entail
neural networks consisting of convolutions with local kernels
and pooling operations to extract features from images [2],
[3]. They have also been used in FDD to handle time-series
data. Janssens et al. [4] made one of the earliest attempts at
using CNNs for fault diagnosis. The authors highlighted the
capability of CNNs to learn new features from input images
converted from time-series data to better classify faults in
rotating machinery. Further developments of CNNs for FDD
can be found in [5]–[10]. Note that different from images
in computer vision, the images converted from time-series
*This work was supported by Texas Tech University.
1S. Al-Wahaibi and Q. Lu are with the Department of Chemical Engineering, Texas Tech University, Lubbock, TX 79409, USA. Email: Saif.Al-Wahaibi@ttu.edu; Jay.Lu@ttu.edu
Corresponding author: Q. Lu
data often possess strong non-localized features. To this end,
kernels of different sizes, i.e., multi-scale CNN, have been
used in [5] to cover local receptive fields (LRF) with varying
resolutions to improve the learned features. Other techniques
such as global average pooling have been employed in [10],
[11] to maintain the integrity of information pertaining to
global correlations. However, these approaches either can
only directly capture wider local features (e.g., multi-scale
CNN) or lack learnable parameters in acquiring global corre-
lations (e.g., global average pooling). Thus, they often need
to construct deep networks to capture the global features that
are crucial in multivariate time-series data for FDD [12]. In
addition, most research studies mentioned above are mainly
concerned with 1D or low-dimensional time-series data such
as the wheel bearing data [6]. Research on extending CNN
for FDD for high-dimensional multivariate time-series data,
e.g., those obtained from chemical processes, still remains
limited. One exemplary work is reported in [7] where deep
CNNs are constructed to diagnose faults from the Tennessee
Eastman process (TEP). Gramian angular field is used in
[8] to convert multi-dimensional data into multi-channel 2D
images to apply CNN for fault diagnosis. Nevertheless, these
works still cannot directly extract global features from the
multivariate time-series data or equivalently, the formed 2D
images. Instead, they also rely on constructing deep CNNs to
expand the LRF to the entire image for capturing global cor-
relations. As a result, the number of trainable parameters can
easily go beyond several millions, causing significant training
complexity [7], [8]. Hence, there is a pressing demand
for developing a novel CNN-based FDD framework that
adequately extracts global spatiotemporal correlations while
maintaining a reasonable number of learnable parameters for
multivariate time-series datasets.
This paper proposes a novel local-global CNN (LG-CNN)
framework for fault diagnosis for complex dynamical pro-
cesses. The proposed framework converts multivariate time-
series data into images and simultaneously collects both
global and local features to classify faults. Local correlations
are captured using typical local square kernels, whereas
global correlations are integrated using 1D tall (temporal)
and fat (spatial) kernels that span the entire height and width
of the image. The spatial and temporal global features extracted
from the tall and fat kernels are then combined
to acquire global spatiotemporal patterns in the images. Such
global spatiotemporal features are then concatenated with
local features extracted with the typical square kernels to
merge the information prior to fault diagnosis. The developed
LG-CNN is validated with the TEP data, and simulation
results show that the incorporation of global features into
CNN can markedly enhance the diagnosis performance
without significantly increasing the model complexity.

arXiv:2210.01077v1 [cs.CV] 3 Oct 2022
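To make the idea concrete before the detailed architecture in Section III, the following is a shape-level sketch (not the paper's actual layer stack; the function name and weights are purely illustrative) of how 1D kernels spanning the full image height and width yield one learned response per column and per row:

```python
import numpy as np

def global_1d_features(X, w_tall, w_fat):
    """Sketch of global features from 1D kernels spanning the whole image.

    X: single-channel image, shape (n, m)
    w_tall: tall n x 1 kernel spanning the entire height, shape (n,)
    w_fat: fat 1 x m kernel spanning the entire width, shape (m,)
    Returns one response per column and one response per row.
    """
    col_features = w_tall @ X   # tall kernel: (n,) @ (n, m) -> (m,)
    row_features = X @ w_fat    # fat kernel:  (n, m) @ (m,) -> (n,)
    return col_features, row_features
```

Because each kernel covers an entire image dimension, every output value depends on all pixels in that column or row, which is what gives the global receptive field in a single layer.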
This paper is organized as follows. Section II presents
fundamentals about traditional CNNs. The proposed LG-
CNN architecture is elaborated in Section III, followed by
a case study of fault diagnosis for TEP in Section IV. The
conclusions are given in Section V.
II. PRELIMINARIES
In this section, we briefly introduce the main components
in a typical CNN including convolutional, pooling, and fully-
connected (FC) layers. In addition, batch normalization (BN)
is introduced to mitigate the internal covariate shift issue.
A. Convolutional Layers
In a convolutional layer, a kernel filter slides across an in-
put feature map where an affine transformation is conducted
at every slide location such that:
$$\mathbf{C}^l_j = b^l_j + \sum_{i=1}^{I_{l-1}} \mathbf{X}^{l-1}_i \ast \mathbf{K}^l_{i,j}, \quad j = 1, 2, \ldots, I_l, \tag{1}$$
where $\mathbf{K}^l_{i,j} \in \mathbb{R}^{k \times \eta}$ is the kernel of size $k \times \eta$ in layer
$l \in \{1, 2, \ldots, L\}$ and channel $i \in \{1, 2, \ldots, I_{l-1}\}$, and $\mathbf{X}^{l-1}_i \in \mathbb{R}^{n \times m}$ is the input feature map of size $n \times m$ to layer $l$. $L$
is the number of layers and $I_l$ is the number of channels
in the $l$-th layer. $\mathbf{C}^l_j$ is the $j$-th output map in layer $l$ after
the convolution and $b^l_j$ is the bias. The symbol $\ast$ represents
the convolution operation. An activation function, such as
the rectified linear unit (ReLU), is usually applied to $\mathbf{C}^l_j$ to
add non-linearity. Graphically, the first green feature map
in Fig. 1 illustrates a 3×3 convolution. The square patch
of size 3 × 3 in the input image represents the LRF of the
dark green neuron output in the first feature map in the top
branch. Thus, an LRF can be thought of as the "field of
view" incorporated in calculating a new feature through the
convolution operation.
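As an illustrative sketch of Eq. (1), the NumPy routine below computes one output channel of a convolutional layer. Following common deep learning practice, the kernel is applied as cross-correlation (no kernel flipping); the function name and shapes here are ours, not the paper's:

```python
import numpy as np

def conv2d(X, K, b):
    """One output channel of Eq. (1) with 'valid' boundaries.

    X: input feature maps, shape (I_prev, n, m)
    K: kernels for this output channel, shape (I_prev, k, eta)
    b: scalar bias b_j
    Returns C_j of shape (n - k + 1, m - eta + 1).
    """
    I_prev, n, m = X.shape
    _, k, eta = K.shape
    out = np.full((n - k + 1, m - eta + 1), float(b))
    for i in range(I_prev):          # sum over input channels
        for r in range(n - k + 1):
            for c in range(m - eta + 1):
                # weighted sum over the k x eta local receptive field
                out[r, c] += np.sum(X[i, r:r + k, c:c + eta] * K[i])
    return out
```

The triple loop is written for clarity; production frameworks implement the same operation with vectorized or GPU kernels.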
B. Pooling Layers
Pooling operations after convolutional layers often act as a
sub-sampling step to reduce dimensionality while preserving
information. Specifically, in a localized group of activa-
tions on a feature map, pooling summarizes their responses
through either averaging or maximizing operations [13]. We
apply max pooling to each local $s \times s$ region of the input feature
map $\mathbf{X}^{l-1}_i \in \mathbb{R}^{n \times m}$, and the resultant new feature map is
given by [11]

$$\mathbf{P}^l_i = \left( \max \mathbf{X}^{l-1}_{i,r} \right)_{r=1}^{S}, \tag{2}$$
where $\mathbf{P}^l_i$ is the output of the max pooling operation, and $S$
is the total number of $s \times s$ regions in $\mathbf{X}^{l-1}_i$.
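A minimal sketch of Eq. (2) for non-overlapping regions (the function name and the divisibility assumption on the map size are ours):

```python
import numpy as np

def max_pool(X, s):
    """Non-overlapping s x s max pooling (Eq. (2)) for one channel.

    X: feature map of shape (n, m), with n and m divisible by s.
    Returns a (n // s, m // s) map of per-region maxima.
    """
    n, m = X.shape
    # split X into S = (n // s) * (m // s) disjoint s x s regions,
    # then keep only the maximum response of each region
    return X.reshape(n // s, s, m // s, s).max(axis=(1, 3))
```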
C. Fully-Connected Layers
After all convolutional and pooling layers, the obtained
feature maps represent the main features learned by a CNN
from an input image. These maps are then flattened into a
vector and passed through FC layers for classification or
regression. Specifically, the output of the 𝑙-th FC layer is
calculated using
$$a^l_z = b^l_z + \mathbf{x}^{l-1} \cdot \boldsymbol{\omega}^l_z, \quad z = 1, 2, \ldots, Z, \tag{3}$$
where $a^l_z$ is the output of the $z$-th neuron, $Z$ is the total number
of neurons in the $l$-th layer, $b^l_z$ is the bias, $\mathbf{x}^{l-1} \in \mathbb{R}^{\zeta}$ is
the activation vector from the previous layer that contains $\zeta$
neurons, $\boldsymbol{\omega}^l_z \in \mathbb{R}^{\zeta}$ is the weight vector associated with neuron
$z$, and $\cdot$ represents the dot product. To add non-linearity, an
activation function is applied to $a^l_z$.
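Stacking the $Z$ dot products of Eq. (3) gives a single matrix-vector product, as in this short sketch (names are illustrative):

```python
import numpy as np

def fc_layer(x, W, b):
    """Fully-connected layer (Eq. (3)): a_z = b_z + x . omega_z for all z.

    x: flattened activation vector from the previous layer, shape (zeta,)
    W: weight matrix whose z-th row is omega_z, shape (Z, zeta)
    b: bias vector, shape (Z,)
    """
    # one matrix-vector product computes all Z neuron outputs at once
    return b + W @ x
```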
D. Batch Normalization
BN accelerates CNN learning by reducing the effects of
internal covariate shift [14]. Layers experience these effects
during learning when previous layers update their weights
and biases, forcing subsequent layers to continuously adapt
to the resulting changes, which hinders the learning process.
In a 2D-CNN, these effects are mitigated by normalizing the
activations from a preceding layer:
$$\tilde{\mathbf{X}}^{l-1}_i = \frac{\mathbf{X}^{l-1}_i - \mathbb{E}\left[\mathbf{X}^{l-1}_i\right]}{\sqrt{\mathrm{Var}\left[\mathbf{X}^{l-1}_i\right]}}, \tag{4}$$
where $\tilde{\mathbf{X}}^{l-1}_i \in \mathbb{R}^{n \times m}$ is the normalized $i$-th channel in the $(l-1)$-th layer,
$\mathbb{E}[\cdot]$ is the expectation over the training batch and all pixel
locations, and $\mathrm{Var}[\cdot]$ is the variance. Then, representation is
restored to the layer by the affine computation [14]
$$\mathbf{Y}^l_i = \tilde{\mathbf{X}}^{l-1}_i \alpha^l_i + \beta^l_i, \tag{5}$$
where $\mathbf{Y}^l_i \in \mathbb{R}^{n \times m}$ is the BN output for channel $i$ in layer $l$,
and $\alpha^l_i$ and $\beta^l_i$ are learnable parameters for each channel.
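Eqs. (4)–(5) can be sketched for a single channel as follows; the small constant `eps` is a standard numerical-stability detail not shown in Eq. (4), and the function name is ours:

```python
import numpy as np

def batch_norm(X, alpha, beta, eps=1e-5):
    """Batch normalization (Eqs. (4)-(5)) for one channel.

    X: activations over a batch, shape (batch, n, m)
    alpha, beta: learnable per-channel scale and shift
    eps: small constant to avoid division by zero
    """
    mean = X.mean()                            # E[X] over batch and pixels
    var = X.var()                              # Var[X] over the same axes
    X_norm = (X - mean) / np.sqrt(var + eps)   # Eq. (4)
    return alpha * X_norm + beta               # Eq. (5)
```

At training time `mean` and `var` come from the current batch as above; at inference, frameworks substitute running averages accumulated during training.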
III. METHODOLOGY
The proposed LG-CNN shown in Fig. 1 consists of multi-
scale convolutions to extract both local and global features.
In addition, we use 1 × 1 convolution [6], max pooling,
and strided convolution [15] for dimension reduction with
minimum information loss.
A. Local Correlations
The top branch in LG-CNN (green color in Fig. 1) shows
the usage of traditional 3 × 3 kernels to extract local features
from the input image. Note that here we apply BN steps
before the ReLU activations. Further, padding is added to
ensure that the output has the same dimension as the input
image. In addition, to reduce dimensions, 1×1 convolution
is conducted to squeeze the number of channels after the
convolution. Note that traditional 3 × 3 kernels usually capture
a small LRF region [3], [9]. The overall mapping from
the input image to the extracted feature maps after 1×1
convolutions is abstracted as:
$$\boldsymbol{\Psi} = f_{\boldsymbol{\theta}_L}(\mathbf{X}^0) \in \mathbb{R}^{c_L \times n \times m}, \tag{6}$$
where $\boldsymbol{\Psi}$ denotes the feature maps extracted from the top branch,
$f_{\boldsymbol{\theta}_L}(\cdot)$ represents all operations from the input image to the
extracted feature maps in the top branch, with all trainable
parameters collected into $\boldsymbol{\theta}_L$, and $c_L$ (subscript $L$ stands for
"local") is the number of channels in the extracted feature maps.
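The channel-squeezing step mentioned above can be sketched as a 1 × 1 convolution, which mixes channels pixel-wise without touching the spatial dimensions (function name and shapes are illustrative, not the paper's):

```python
import numpy as np

def conv1x1(X, W, b):
    """1 x 1 convolution used to squeeze the number of channels.

    X: feature maps, shape (c_in, n, m)
    W: channel-mixing weights, shape (c_out, c_in)
    b: biases, shape (c_out,)
    Returns feature maps of shape (c_out, n, m).
    """
    # each output pixel is a weighted sum of the input channels at
    # that same pixel; the n x m spatial grid is left untouched
    return np.tensordot(W, X, axes=([1], [0])) + b[:, None, None]
```

Because it has only `c_out * c_in` weights per layer, the 1 × 1 convolution reduces channel count (and hence downstream parameters) far more cheaply than an extra spatial convolution would.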