
availability of large training datasets of digitized ECGs that could be linked to concurrent diagnostic
information across various disease types. In this context, standardized administrative health data,
routinely generated at each encounter, provide a valuable opportunity to explore the full spectrum
of patient diagnoses. These data include the most responsible diagnosis, as well as any comorbidities
the patient has or develops during the encounter.
In this study, we use a population-based dataset of >250,000 patients with various medical conditions
and >2 million in-hospital ECGs. Here, we use diagnoses coded using the World Health Organization
International Classification of Diseases (ICD) [20]. The goal of our study is to identify which diseases
(with previously known or unknown associations with ECGs) can be accurately diagnosed from the
patient’s first ECG during an emergency department (ED) visit or hospitalization based on a learned
DL model. It aims to provide a proof of concept for high-throughput, ICD-wide screening of
diseases based on the ECG, and presents candidate diseases to be explored in future ECG studies
focused on specific diagnoses.
2 Method
This study used population-based datasets from 26 hospitals in Alberta, Canada (2007-2020), contain-
ing information on 772,932 healthcare episodes (hospitalization and ED visits) of 260,065 patients
who collectively had 13,179 unique ICD-10 codes/diseases [20]. We linked these episodes to a
dataset of 2,015,808 ECGs (Philips IntelliSpace system, 12-lead, 500 Hz, 10 s) using unique patient
identifiers and the timing of ECG acquisition. After data cleaning and exclusions (poor signal quality¹,
unlinked episodes, pacemakers and other devices, age < 18 years, etc.), we used 1,514,968 ECGs that were
linked to 724,074 episodes of 239,852 patients with 11,207 unique ICD codes. An ICD-10 code consists
of 3 to 7 characters specifying a particular disease, where the first 3 characters denote its general
category (e.g., 'I214' refers to 'Non-ST elevation (NSTEMI) myocardial infarction' and 'I21' to
its broader category, 'Acute myocardial infarction'). We used ICD codes and corresponding categories
as labels for prediction modelling. We found 1,319 ICD codes (full code, exact match) and 699 ICD
categories (matched on the first 3 characters) that were each linked to at least 1,000 ECGs.
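As a minimal illustration of this labelling scheme (a sketch in Python; the function names and toy counts below are ours, not from the study), a code's category is simply its first 3 characters, and labels can be filtered by the number of linked ECGs:

```python
# Illustrative sketch: mapping full ICD-10 codes to their 3-character
# categories and keeping only labels linked to enough ECGs.
from collections import Counter

def icd_category(code: str) -> str:
    # The first 3 characters of an ICD-10 code give its general category,
    # e.g. 'I214' (NSTEMI) -> 'I21' (acute myocardial infarction).
    return code[:3]

def frequent_labels(codes_per_ecg, min_ecgs=1000):
    # `codes_per_ecg`: iterable of labels, one entry per linked ECG.
    # Returns the labels linked to at least `min_ecgs` ECGs.
    counts = Counter(codes_per_ecg)
    return sorted(label for label, n in counts.items() if n >= min_ecgs)
```

For instance, `icd_category('I214')` yields `'I21'`; the same filter applies whether the labels are full codes or categories.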
We split our ECG dataset into the internal validation set (random 60%: 143,939 patients with 436,508
ECGs, used for training and internal validation) and external holdout set (remaining 40%: 95,913
patients with 287,566 ECGs), while ensuring that ECGs from the same patient were not shared
between the sets. Whenever there were multiple ECGs in an episode, we used only the first ECG
for evaluation, as it would be preferable in actual clinical practice to make a diagnostic prediction
at the first point of care in the ED or hospital. We trained two DL models: one for full ICD codes and
one for ICD categories. We first trained and evaluated performance with an 80%-20% split within the
internal set, and selected a list of top labels based on discriminative performance (area under the receiver
operating characteristic curve, AUROC). We then retrained the models on the entire internal set
and evaluated on the external set based on the selected labels. Our DL architecture was based on
ResNet [6], similar to the one used in an earlier ECG modeling study [12]. The 12-lead ECG traces
were input to the network, which consists of a convolutional layer (conv) and 4 residual blocks with 2 convs
per block, followed by a dense layer to which the age and sex features were concatenated. We used
batch normalization, ReLU, and dropout after each conv. The last block feeds into a dense layer
with sigmoid activation that outputs a 1,319- (resp., 699-) length vector of predicted probabilities for the
codes/diseases (resp., categories). We used the Adam optimizer, a learning rate of 0.001, a batch size of
512, and binary cross-entropy as the loss function.
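The architecture above can be sketched as follows (a minimal PyTorch sketch under our own assumptions: the channel width, kernel size, pooling, and dropout rate are illustrative, since the text specifies only the overall structure):

```python
# Hedged sketch of the described ResNet-style ECG classifier.
import torch
import torch.nn as nn

class ResBlock(nn.Module):
    """Residual block: 2 convs, each followed by BN, ReLU, and dropout."""
    def __init__(self, channels, kernel_size=15, dropout=0.2):
        super().__init__()
        pad = kernel_size // 2  # keep the temporal length unchanged
        self.body = nn.Sequential(
            nn.Conv1d(channels, channels, kernel_size, padding=pad),
            nn.BatchNorm1d(channels), nn.ReLU(), nn.Dropout(dropout),
            nn.Conv1d(channels, channels, kernel_size, padding=pad),
            nn.BatchNorm1d(channels), nn.ReLU(), nn.Dropout(dropout),
        )

    def forward(self, x):
        return x + self.body(x)  # skip connection

class EcgNet(nn.Module):
    """Initial conv + 4 residual blocks + dense head with age/sex features."""
    def __init__(self, n_labels=1319, channels=64):
        super().__init__()
        self.stem = nn.Sequential(  # 12 input leads
            nn.Conv1d(12, channels, 15, padding=7),
            nn.BatchNorm1d(channels), nn.ReLU(),
        )
        self.blocks = nn.Sequential(*[ResBlock(channels) for _ in range(4)])
        self.pool = nn.AdaptiveAvgPool1d(1)
        self.head = nn.Linear(channels + 2, n_labels)  # +2 for age and sex

    def forward(self, ecg, age_sex):
        # ecg: (batch, 12, samples), e.g. 5000 samples for 10 s at 500 Hz
        h = self.pool(self.blocks(self.stem(ecg))).squeeze(-1)
        h = torch.cat([h, age_sex], dim=1)
        return torch.sigmoid(self.head(h))  # multi-label probabilities

model = EcgNet()
loss_fn = nn.BCELoss()  # binary cross-entropy on the sigmoid outputs
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
```

For the category model, `n_labels=699` would be used instead; training then iterates over batches of 512 ECGs as stated above.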
3 Results
In our internal validation, we found 369 out of 1319 ICD codes and 170 out of 699 ICD categories
to have AUROC > 80%. Among these, 70 ICD codes and 29 ICD categories had AUROC > 90%.
However, several of these labels had low precision, so we restricted the list to labels with
at least 5% AUPRC (area under the precision-recall curve) or with an average precision at least
20 times greater than the prevalence of the condition. This yielded 151 ICD codes and 80 ICD
categories with AUROC > 80%; and 52 ICD codes and 18 ICD categories with AUROC > 90%.
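The selection rule above can be written compactly (a sketch; the function and argument names are ours, and we use a single AUPRC/average-precision value since the two metrics nearly coincide):

```python
# Sketch of the label-selection rule: good discrimination (AUROC)
# plus a precision floor, absolute (>= 5% AUPRC) or relative to
# prevalence (>= 20x the label's base rate).
def keep_label(auroc, auprc, prevalence, auroc_min=0.80):
    return auroc > auroc_min and (auprc >= 0.05 or auprc >= 20 * prevalence)
```

Raising `auroc_min` to 0.90 reproduces the stricter of the two reported lists.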
Finally, we examined the replication of these lists in the external validation, and found that 128 out of
¹Trace quality was assessed for muscle artifact, AC noise, baseline wander, QRS clipping, leads-off flags, etc.