Classification and Self-Supervised Regression of Arrhythmic ECG Signals Using Convolutional Neural Networks

Bartosz Grabowski (a), Przemysław Głomb (a,*), Wojciech Masarczyk (b), Paweł Pławiak (c,a), Özal Yıldırım (d), U Rajendra Acharya (e,f,g,h,i), and Ru-San Tan (j,k)

(a) Institute of Theoretical and Applied Informatics, Polish Academy of Sciences, Bałtycka 5, 44-100 Gliwice, Poland
(b) Warsaw University of Technology, Pl. Politechniki 1, 00-661 Warsaw, Poland
(c) Department of Computer Science, Faculty of Computer Science and Telecommunications, Cracow University of Technology, Warszawska 24, 31-155 Krakow, Poland
(d) Department of Software Engineering, Firat University, Elazig, Turkey
(e) School of Science and Technology, Singapore University of Social Sciences, Singapore
(f) School of Business (Information Systems), Faculty of Business, Education, Law and Arts, University of Southern Queensland, Australia
(g) School of Engineering, Ngee Ann Polytechnic, 535 Clementi Road, 599489, Singapore
(h) Department of Bioinformatics and Medical Engineering, Asia University, Taiwan
(i) Research Organization for Advanced Science and Technology (IROAST), Kumamoto University, Kumamoto, Japan
(j) Department of Cardiology, National Heart Centre Singapore, 169609, Singapore
(k) Duke-NUS Medical School, 169857, Singapore

*Corresponding author: Przemysław Głomb, przemg@iitis.pl

arXiv:2210.14253v1 [cs.LG] 25 Oct 2022

Abstract

Interpretation of electrocardiography (ECG) signals is required for diagnosing cardiac arrhythmia. Recently, machine learning techniques have been applied for automated computer-aided diagnosis. Machine learning tasks can be divided into regression and classification. Regression can be used to remove noise and artifacts and to compensate for missing data caused by low sampling frequency. Classification concerns the prediction of output diagnostic classes according to expert-labeled input classes. In this work, we propose a deep neural network model capable of solving both regression and classification tasks. Moreover, we combined the two approaches, using unlabeled and labeled data, to train the model. We tested the model on the MIT-BIH Arrhythmia database. Our method showed high effectiveness in detecting cardiac arrhythmia based on modified Lead II ECG records and achieved high-quality ECG signal approximation. For the former, our method attained an overall accuracy of 87.33% and a balanced accuracy of 80.54%, on par with reference approaches. For the latter, application of self-supervised learning allowed for training without the need for expert labels. The regression model yielded satisfactory performance with fairly accurate prediction of QRS complexes. Transferring knowledge from regression to the classification task, our method attained a higher overall accuracy of 87.78%.

Keywords: ECG signal classification; Cardiac arrhythmia detection; ECG signal approximation; Deep Convolutional Neural Networks; Self-supervised learning.

1 Introduction

Electrocardiography (ECG) reads out a spatial map of the time-varying electrical potentials of the heart, acquired using electrodes placed at specific locations on the surface of the body. Interpretation of the ECG unveils structural and functional abnormalities of the heart, which can aid the noninvasive diagnosis of cardiovascular diseases [31, 15, 63]. In particular, ECG is the most important diagnostic tool for arrhythmia detection [38, 61, 44, 5, 53]. As abnormal heartbeats often occur sporadically and are not present at all times, ECG recordings may have to be carried out repeatedly or continuously over an extended period of time, e.g., days with ambulatory Holter devices [15]. Due to the high signal data volume, manual interpretation is time-consuming and susceptible to fatigue-induced error. This has spurred the introduction of automated computer-aided diagnostic systems, which may be based on machine learning.

Some machine learning techniques are able to evaluate individual heartbeat signals on ECG records [27] to complete tasks like classification, localization and prediction. Among the many explored applications of machine learning for ECG signal analysis, two general problems stand out: regression and classification. Regression is a quantitative prediction task that maps the input data to an output consisting of real or continuous values. For ECG data, the regression problem can take various forms, including a segmentation method for detecting ECG P, Q, R, S, and T waves [4]; a reference method for removing noise artifacts from ECG signals [19]; and increasing the spatial resolution of ECG through lead prediction [41]. Classification, on the other hand, is a predictive technique that maps the input data to discrete outputs (targets, classes or categories), i.e., to the class labels to which the input should belong. Examples of published work on ECG classification include labeling heartbeats into one of the five beat classes according to the ANSI/AAMI EC57:1998 standard [43]; classification of ECG segments into normal and multiple arrhythmia classes [1]; and classification of myocardial infarction [6]. In many situations encountered in automated analyses, regression and classification tasks are intertwined, as the former can be used to enhance the performance of the latter, e.g., by mitigating ECG degradation from noise and artifacts as well as missing data from low sampling frequency [14].

In this article, we present a novel approach to ECG signal modeling that uses a convolutional neural network (CNN) for both regression and classification. One advantage of our method is the flexibility of neural networks, which makes it possible to adapt a single neural network architecture for multiple tasks. We have exploited this trait to develop a single neural network model that is capable of modeling large parts of the input ECG signal as well as classifying the same data. Moreover, unlike classification, which requires training the model on expert-annotated ECG signal data, the model can accomplish the regression task without the need for data labeling. It is also worth noting that the knowledge gained from the self-supervised regression task can be seamlessly transferred to the downstream classification task. This two-pronged approach offers optionality that may improve diagnostic classification at little additional computational cost. The disadvantage of our approach is that the method is relatively complex, with many hyperparameters that need tuning, which can require considerable time and computational resources. In summary, the novelty of our work is as follows:

1. We propose an algorithm that achieves good results on two different tasks, regression and classification, both of which are important components in the development of automated computer-aided systems for ECG signal analysis.

2. Combining within the same CNN model the dual tasks of predicting parts of the ECG signal (regression) and detecting cardiac arrhythmia (classification), which use unlabeled and labeled data, respectively, for training, may improve model performance without increasing model complexity inordinately.

The paper is structured as follows. Section 1.1 presents the related work. Section 2 introduces materials and methods, and Section 3 summarizes and discusses the results of the conducted experiments. Section 4 concludes the paper.

1.1 Related work

Classification of ECG signals. The process of classifying ECG signals is traditionally split into feature extraction and classification steps. The feature vectors obtained in the feature extraction stage are fed into classifiers. In automated ECG classification, features such as the morphology and intervals of specific waves in the ECG signal have been widely used. In [57], QRS wave width, amplitude, and offset, T-wave slope, and prematurity features were obtained for each beat, and the classification was made using a neural network. In [43], the authors proposed an approach to automated ECG beat classification that combined morphological features, such as QRS wave onset/offset and T-wave offset, with heartbeat interval features, e.g., RR intervals, and used linear discriminant-based classifier models. Features may also be transformed using methods like Fourier and wavelet transformation prior to classification. In [18], the authors obtained distinctive features for ECG signals using the Fourier and wavelet transform approach. With a hybrid neural network model, they reported 96% accuracy in ten-class ECG beat recognition using these transformed features. In another study [60] that used the discrete wavelet transform, the features obtained from the coefficients of different wavelet levels were classified using an extreme learning machine architecture. In [20], the authors used auto-regressive model coefficients, third-order cumulants, and discrete wavelet transform approaches to extract ECG signal feature vectors, which were classified using fuzzy-hybrid neural networks. Various statistical methods can be used for feature extraction, including principal component analysis, linear discriminant analysis, independent component analysis, higher-order statistics, and transformation methods. Martis et al. [36] used principal component, linear discriminant, and independent component analyses in the feature extraction step of their ECG beat classification model. The authors in [42] performed ECG beat classification by feeding second-, third- and fourth-order features to a hybrid fuzzy neural network with higher-order statistics. Acharya et al. [2] extracted higher-order statistics bispectrum and cumulant features from ECG signals to classify coronary artery disease using a k-nearest neighbor (KNN) classifier. As some feature extraction methods can generate large feature vectors, various approaches can be used to select the most distinctive features and reduce the dimensionality. In [45], features were selected using a genetic algorithm and then fed to a support vector machine (SVM)-based classifier. In [35], a sequential forward floating search was used to select features, which were then classified using a multilayer perceptron.

Several authors have used deep models for the analysis of physiological signals, including for ECG arrhythmia classification [21, 40, 25], and the number of publications on deep learning models has increased significantly in the last few years [21]. In [1], the authors developed a CNN that classified normal and arrhythmia ECG segments without requiring QRS detection. A 12-layer CNN model was also used in [55] to classify the five micro-classes of heartbeat types. In [61], the authors proposed a one-dimensional (1D) CNN to process long-duration ECG signal fragments, which was computationally efficient and highly accurate. The authors of [6] used a CNN to classify myocardial infarction using 12-lead ECG signals, and achieved 99% accuracy. A long short-term memory (LSTM) network was used in [22] to detect atrial fibrillation based on heart rate signals. The authors partitioned the data with a sliding window of 100 beats and used the resulting signal segments to train and evaluate the network. In [58], the authors used a convolutional auto-encoder to reduce the ECG signal size and an LSTM network to process the compressed data for arrhythmia detection. Compression of the signal minimized storage requirements and data transfer costs, and reduced the training time of the LSTM network without significantly compromising model performance. In [34], the authors combined a CNN and an LSTM network for three-class classification of coronary artery disease, myocardial infarction and congestive heart failure using ECG data. The authors of [64] proposed three different neural network architectures for classification of ECG signals and obtained the best results for the architecture containing entropy features, while the one without them had the highest computational efficiency.
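The sliding-window partitioning mentioned for [22] can be illustrated with a minimal sketch. The window length of 100 beats comes from the text; the step size of 1, the toy heart-rate values, and the helper name `sliding_windows` are assumptions made for illustration only, not details reported for that study.

```python
def sliding_windows(values, window=100, step=1):
    """Return every full-length window over `values` (e.g. per-beat heart rates)."""
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    # Stop early enough that the last window still contains `window` items.
    return [values[i:i + window]
            for i in range(0, len(values) - window + 1, step)]


# Toy per-beat heart-rate sequence (synthetic, not real data).
heart_rates = [60 + (i % 7) for i in range(250)]
segments = sliding_windows(heart_rates)
print(len(segments), len(segments[0]))
```

A larger step reduces the overlap between neighbouring segments and therefore the number of training examples; which trade-off suits a given dataset is a design choice.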

Regression of ECG signals. Regression techniques can be used to rectify ECG signal issues, such as noise and artifacts, as well as missing data from low sampling frequency [14], which may affect performance in arrhythmia classification. Various techniques can be used for removing noise and artifacts in ECG signals, including regression, interpolation, and deep learning. In [49], the authors used a regression-based model to remove motion artifacts caused by cardiopulmonary resuscitation from the ECG signal output of automatic external defibrillators in order to improve the rhythm detection algorithm. Sidek et al. [52] used cubic Hermite and piecewise cubic spline interpolation approaches to improve ECG signal quality, and reported that the improved signal quality conferred higher performance in biometric matching. Similarly, Kamata et al. [30] proposed a just-in-time interpolation approach to reduce signal artifacts and facilitate accurate R wave detection, which is of fundamental importance for heart rate variability analysis. Apart from denoising for signal enhancement, Nallikuzhy et al. [41] proposed a multiscale linear regression model that was able to increase ECG spatial resolution. The regression approach is also used in segmentation tasks. Aspuru et al. [4] used linear regression to parse the P, Q, R, S and T wave regions of ECG signals for downstream classification.

Autoencoders, a deep learning method, have also been used for the compression and reconstruction as well as the denoising of ECG signals. Yildirim et al. [59] proposed a deep autoencoder that could reduce the original ECG signal input size to improve model efficiency, and reconstruct the original ECG signal at the output. These structures have also been used to denoise ECG signals and enhance their quality [56, 13]. In [56], the authors designed a denoising autoencoder model with wavelet transform for noise removal. Recurrent structures have also been actively used for ECG denoising [48]. Generative adversarial networks (GANs) have also been frequently employed for similar purposes. Antczak [3] used the GAN approach to generate synthetic ECG signals, which were used for training noise-removing models. Golany et al. [23] used the synthetic data produced by the GAN approach to increase the accuracy of heartbeat classification.

Self-supervised learning. Self-supervised learning strategies have recently gained popularity in machine learning-based diagnosis of medical images [10, 29], electroencephalography (EEG) signals [7, 62, 8], and ECG signals [50, 37]. This learning strategy can learn from unlabeled data and does not require a supervisor-annotated dataset. While a self-supervised learning strategy per se may not improve accuracy significantly compared with learning from labeled data, it has some important advantages [26]. Self-supervised methods have a structure that can learn from the data itself without the need for class labels. As such, they circumvent the need for experts to annotate large amounts of data, as in conventional deep learning. This is especially useful because of the limitations of computational resources and the scarcity of available data for research [7, 10].

Many prior works on self-supervised learning have been in the field of natural language processing. In [16], two approaches to using unlabeled data for pretraining recurrent neural networks were tested. The first approach was to predict what comes next in a sequence, while the second used an autoencoder to learn an effective encoding of the input sequence. The authors demonstrated that both approaches helped stabilize training and improved generalization of the LSTM. In [28], the authors proposed a universal language model fine-tuning method that utilized both general-domain and target task-specific language models to pretrain the LSTM model, which was then fine-tuned for the target text classification task. They proposed novel fine-tuning techniques to prevent catastrophic forgetting and enable robust learning. In [46], the authors used a transformer architecture [54] to solve multiple natural language understanding tasks. The model was first pretrained on a language modeling task using a large unlabeled dataset. Next, the network was fine-tuned to solve one of many specific target tasks, using task-specific input transformations where necessary. In [47, 9], transformer language models were shown to be able to learn the target tasks without being explicitly trained to perform them. The networks were thus able to work in a fully unsupervised fashion: learning to perform other natural language tasks after being trained to do a language modeling task. In [17], the authors proposed a deep bidirectional transformer model, which was pretrained on two different unsupervised tasks: masked language modeling, where some percentage of the input was hidden at random and the task was to predict the hidden parts; and next sentence prediction, where the model had to predict whether the second received sentence was the one following the first sentence. In [12], the authors applied the unsupervised pretraining methodology that was common in natural language processing to computer vision. They performed ImageNet classification in three steps: unsupervised pretraining, which was based on the approach presented in [11]; supervised fine-tuning; and distillation with unlabeled data.

In this work, we use the same dataset preparation steps as in [61]. Moreover, we utilize self-supervised learning, which was used in, e.g., [50, 37], to improve the classification accuracy of our model.
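To make the idea of self-supervised regression concrete, the sketch below builds one training pair from an unlabeled signal by hiding a contiguous segment and using the hidden samples as the regression target, so that no expert label is involved. The segment position and length, the zero fill value, the synthetic sine signal, and the function name `make_masked_pair` are all illustrative assumptions; this excerpt does not specify how the regression targets are constructed in this work.

```python
import math


def make_masked_pair(signal, start, length, fill=0.0):
    """Hide signal[start:start+length]; return (masked_input, target)."""
    if start < 0 or length <= 0 or start + length > len(signal):
        raise ValueError("masked segment must lie inside the signal")
    target = signal[start:start + length]          # what the model must predict
    masked = signal[:start] + [fill] * length + signal[start + length:]
    return masked, target


# Toy stand-in for an unlabeled ECG fragment (synthetic, not real data).
signal = [math.sin(2 * math.pi * t / 50) for t in range(200)]
masked, target = make_masked_pair(signal, start=80, length=40)
print(len(masked), len(target))
```

A regression model trained on such pairs would be scored by how closely its output matches `target`, for example with a mean squared error.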

2 Material and methods

We designed a CNN that could accomplish the dual tasks of ECG beat regression and ECG segment arrhythmia classification. First, the network architectural requirements were posed as an optimization problem, which we solved by using the Ray Tune library [33] to choose the best-performing configurations. Neural networks with different combinations of user-defined hyperparameters were trained on the regression task followed by the classification task. The validation loss of each architecture with a specific combination of hyperparameter settings was used as its score. The aim of the optimization process was to find the models with the lowest score on the classification task. For optimization training, the following options for hyperparameter settings were considered: batch size, 50 and 100; the number of convolutional layers, 3, 5 and 7 (these were common to both regression and classification tasks); the number of channels in the first layer, 8 and 16 (this number was doubled in every successive layer up to a maximum of 128); kernel size of the first layer, 64 and 128 (this number was halved in every successive layer down to a minimum of 2); kernel size of the max-pooling layers, 3 and 4; inclusion of a batch normalization layer, yes and no; the number of classification layers, 1 and 3 (which were added in the classification task); the number of neurons in the classification layer(s), or in the last layer in the case of regression, 1000 and 3000; and inclusion of residual connections from all convolutional layers to the first classification layer, yes and no. The asynchronous successive halving algorithm (ASHA) scheduler [32] was used for hyperparameter optimization; it works by evaluating multiple model configurations and dropping unpromising ones based on initial training performance. This allows for a time- and resource-efficient search of a large hyperparameter space. The hyperparameter search was carried out for approximately 12 days. Upon completion of the optimization, the best architectures and training processes from those evaluated were chosen.
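The search space listed above can be written down explicitly. The sketch below enumerates it in plain Python and runs a simplified, synchronous successive-halving loop over a toy scoring function. It is not the Ray Tune or ASHA implementation used in the paper: the dictionary keys, the `toy_validation_loss` placeholder, the reduction factor of 2, and the budgets are all assumptions chosen for illustration.

```python
import itertools
import random

# Options taken from the text; the key names are hypothetical.
SEARCH_SPACE = {
    "batch_size": [50, 100],
    "num_conv_layers": [3, 5, 7],
    "first_layer_channels": [8, 16],
    "first_kernel_size": [64, 128],
    "pool_kernel_size": [3, 4],
    "batch_norm": [True, False],
    "num_classification_layers": [1, 3],
    "num_neurons": [1000, 3000],
    "residual_connections": [True, False],
}


def all_configs(space):
    """Enumerate every combination of the listed options."""
    keys = list(space)
    return [dict(zip(keys, combo)) for combo in itertools.product(*space.values())]


def toy_validation_loss(config, budget, rng):
    """Placeholder for 'train for `budget` epochs and report validation loss'."""
    return rng.random() / budget


def successive_halving(configs, score_fn, min_budget=1, reduction=2):
    """Synchronous successive halving: keep the best 1/reduction at each rung."""
    if not configs:
        raise ValueError("need at least one configuration")
    survivors, budget = list(configs), min_budget
    while len(survivors) > 1:
        ranked = sorted(survivors, key=lambda c: score_fn(c, budget))
        survivors = ranked[:max(1, len(ranked) // reduction)]
        budget *= reduction  # survivors get a larger training budget
    return survivors[0]


configs = all_configs(SEARCH_SPACE)
rng = random.Random(0)
best = successive_halving(configs, lambda c, b: toy_validation_loss(c, b, rng))
print(len(configs))  # 768 combinations (2**8 * 3)
```

The asynchronous variant differs mainly in that it promotes a configuration to the next budget level as soon as it qualifies, rather than waiting for every configuration at the current level to finish, which keeps parallel workers busy.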

2.1 Architecture

The optimized CNN architecture (Figure 1) comprised seven convolutional layers, each followed by rectified linear unit (ReLU) activation, batch normalization and max pooling (pooling size = 4, stride = 2). Starting with a kernel size of 128 in the first convolutional layer, the kernel sizes of successive layers were progressively halved, reaching 2 in the seventh layer. The number of channels in the first layer was 16, which was doubled with each succeeding layer up to a maximum of 128 channels in the fourth through seventh layers.
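The per-layer kernel sizes and channel counts implied by this description can be derived mechanically. The helper below is an illustrative sketch (the function name and signature are assumptions); it applies only the halving and doubling rules stated in the text and does not reproduce padding, input length, or other details that this excerpt does not give.

```python
def layer_schedule(num_layers=7, first_kernel=128, first_channels=16,
                   min_kernel=2, max_channels=128):
    """Return [(kernel_size, out_channels)] for each convolutional layer."""
    schedule = []
    kernel, channels = first_kernel, first_channels
    for _ in range(num_layers):
        schedule.append((kernel, channels))
        kernel = max(min_kernel, kernel // 2)       # halve, but not below the minimum
        channels = min(max_channels, channels * 2)  # double, but not above the maximum
    return schedule


for index, (kernel, channels) in enumerate(layer_schedule(), start=1):
    print(f"conv{index}: kernel={kernel}, channels={channels}")
```

With the default arguments the kernel size reaches 2 in the seventh layer and the channel count saturates at 128 from the fourth layer onward, matching the description.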
