Improving Convolutional Neural Networks for Fault Diagnosis by
Assimilating Global Features
Saif S. S. Al-Wahaibi1 and Qiugang Lu1,†

*This work was supported by Texas Tech University.
1S. Al-Wahaibi and Q. Lu are with the Department of Chemical Engineering, Texas Tech University, Lubbock, TX 79409, USA. Email: Saif.Al-Wahaibi@ttu.edu; Jay.Lu@ttu.edu
†Corresponding author: Q. Lu
Abstract— Deep learning techniques have become prominent
in modern fault diagnosis for complex processes. In particular,
convolutional neural networks (CNNs) have shown an appealing
capacity to deal with multivariate time-series data by con-
verting them into images. However, existing CNN techniques
mainly focus on capturing local or multi-scale features from
input images. A deep CNN is thus often required to indirectly
extract global features, which are critical for describing the
images converted from multivariate dynamical data. This paper
proposes a novel local-global CNN (LG-CNN) architecture
that directly accounts for both local and global features for
fault diagnosis. Specifically, the local features are acquired by
traditional local kernels whereas global features are extracted
by using 1D tall and fat kernels that span the entire height
and width of the image. Both local and global features are
then merged for classification using fully-connected layers. The
proposed LG-CNN is validated on the benchmark Tennessee
Eastman process (TEP) dataset. Comparison with a traditional
CNN shows that the proposed LG-CNN can greatly improve the
fault diagnosis performance without significantly increasing the
model complexity. This improvement is attributed to the much
wider local receptive field of the LG-CNN compared with that of the CNN. The
proposed LG-CNN architecture can be easily extended to other
image processing and computer vision tasks.
I. INTRODUCTION
Deep learning (DL) has attracted increasing attention for
fault detection and diagnosis (FDD) over the last decade.
Primarily, the strength of DL lies in its ability to utilize the
extensive data present in industrial systems to establish com-
plex models for distinguishing anomalies, diagnosing faults,
and forecasting without needing much prior knowledge [1].
Among various DL methods for FDD, convolutional neural
networks (CNNs) have shown great promise due to their effi-
ciency in capturing spatiotemporal correlations and reduced
trainable parameters from weight sharing [2].
Originally developed for image classification, CNNs entail
neural networks consisting of convolutions with local kernels
and pooling operations to extract features from images [2],
[3]. They have also been used in FDD to handle time-series
data. Janssens et al. [4] made one of the earliest attempts at
using CNNs for fault diagnosis. The authors highlighted the
capability of CNNs to learn new features from input images
converted from time-series data to better classify faults in
rotating machinery. Further developments of CNNs for FDD
can be found in [5]–[10]. Note that, different from images
in computer vision, the images converted from time-series
data often possess strong non-localized features. To this end,
kernels of different sizes, i.e., multi-scale CNN, have been
used in [5] to cover local receptive fields (LRF) with varying
resolutions to improve the learned features. Other techniques
such as global average pooling have been employed in [10],
[11] to maintain the integrity of information pertaining to
global correlations. However, these approaches can either
only capture wider local features directly (e.g., multi-scale
CNN) or lack learnable parameters for acquiring global corre-
lations (e.g., global average pooling). Thus, they often need
to construct deep networks to capture the global features that
are crucial in multivariate time-series data for FDD [12]. In
addition, most of the studies mentioned above are concerned
with 1D or low-dimensional time-series data such as wheel
bearing data [6]. Research on extending CNNs to FDD for
high-dimensional multivariate time-series data, e.g., those
obtained from chemical processes, remains limited. One
exemplary work is reported in [7] where deep
CNNs are constructed to diagnose faults from the Tennessee
Eastman process (TEP). Gramian angular field is used in
[8] to convert multi-dimensional data into multi-channel 2D
images to apply CNN for fault diagnosis. Nevertheless, these
works still cannot directly extract global features from the
multivariate time-series data or equivalently, the formed 2D
images. Instead, they also rely on constructing deep CNNs to
expand the LRF to the entire image for capturing global cor-
relations. As a result, the number of trainable parameters can
easily go beyond several millions, causing significant training
complexity [7], [8]. Hence, there is a pressing demand
for developing a novel CNN-based FDD framework that
adequately extracts global spatiotemporal correlations while
maintaining a reasonable number of learnable parameters for
multivariate time-series datasets.
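As a back-of-the-envelope illustration of why depth is needed for global coverage (a standard receptive-field calculation, not a result from the cited works), consider a stack of $L$ convolutional layers with kernel sizes $k_l$ and strides $s_l$. The local receptive field of the top layer grows as
\[
r_L = 1 + \sum_{l=1}^{L} (k_l - 1) \prod_{j=1}^{l-1} s_j ,
\]
so with unit-stride $3\times 3$ kernels $r_L = 1 + 2L$, and covering, e.g., a $52\times 52$ image requires roughly 26 stacked layers (or aggressive pooling and striding), which is what inflates the parameter counts noted above.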
This paper proposes a novel local-global CNN (LG-CNN)
framework for fault diagnosis in complex dynamical pro-
cesses. The proposed framework converts multivariate time-
series data into images and simultaneously collects both
global and local features to classify faults. Local correlations
are captured using typical local square kernels, whereas
global correlations are integrated using 1D tall (temporal)
and fat (spatial) kernels that span the entire height and width
of the image. The spatial and temporal global features ex-
tracted by the tall and fat kernels are then combined
to acquire global spatiotemporal patterns in the images. Such
global spatiotemporal features are then concatenated with
local features extracted with the typical square kernels to
merge the information prior to fault diagnosis. The developed
LG-CNN is validated with the TEP data and simulation
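To make the idea concrete, the following is a minimal PyTorch sketch of the local-global structure described above: a local branch with small square kernels, a global branch with 1D tall (H×1) and fat (1×W) kernels that span the full image height and width, and a fully-connected classifier over the concatenated features. The channel counts, layer sizes, image dimensions, and the simple concatenation-based fusion are illustrative assumptions, not the exact configuration used in this paper.

```python
import torch
import torch.nn as nn


class LGCNN(nn.Module):
    """Two-branch CNN: local square kernels plus global 1D tall/fat kernels (sketch)."""

    def __init__(self, height=52, width=52, n_classes=21):
        super().__init__()
        # Local branch: conventional small square kernels (narrow local receptive field).
        self.local = nn.Sequential(
            nn.Conv2d(1, 8, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        # Global branch: "tall" kernels span the full image height and "fat" kernels
        # span the full image width, so each output unit sees an entire column or
        # row of the image in a single layer.
        self.tall = nn.Sequential(nn.Conv2d(1, 8, kernel_size=(height, 1)), nn.ReLU())
        self.fat = nn.Sequential(nn.Conv2d(1, 8, kernel_size=(1, width)), nn.ReLU())
        # Classifier over the concatenated local and global feature vectors.
        local_dim = 8 * (height // 2) * (width // 2)
        global_dim = 8 * width + 8 * height
        self.fc = nn.Sequential(
            nn.Linear(local_dim + global_dim, 128),
            nn.ReLU(),
            nn.Linear(128, n_classes),
        )

    def forward(self, x):
        # x: (batch, 1, height, width) image formed from a window of
        # multivariate time-series data (image formation assumed, not shown).
        loc = self.local(x).flatten(1)   # local features
        tall = self.tall(x).flatten(1)   # global features along the height
        fat = self.fat(x).flatten(1)     # global features along the width
        return self.fc(torch.cat([loc, tall, fat], dim=1))


# Example: a batch of 4 single-channel 52x52 images, 21 fault classes (illustrative sizes).
logits = LGCNN()(torch.randn(4, 1, 52, 52))
print(logits.shape)  # torch.Size([4, 21])
```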