
To the best of our knowledge, none of the existing works has
fully considered irregularity in multimodal representation
learning.
We observe three major drawbacks in existing work on irregular multimodal EHR modeling. 1) MISTS models perform inconsistently. Although numerous MISTS models have been proposed to tackle irregularity (Lipton et al., 2016; Shukla & Marlin, 2019; 2021; Zhang et al., 2021b; Horn et al., 2020; Rubanova et al., 2019), none of them consistently outperforms the others. Even among temporal discretization-based embedding (TDE) methods, which transform MISTS into regular time representations that can interface with deep neural networks for regular time series, including hand-crafted imputation (Lipton et al., 2016) and learned interpolation (Shukla & Marlin, 2019; 2021), there is no clearly superior approach. 2) Irregularity in clinical notes is not well handled. Most existing works (Golmaei & Luo, 2021; Mahbub et al., 2022) directly concatenate all clinical notes of each patient and ignore the note-taking time information. Although Zhang et al. (2020) propose an LSTM variant that models time decay among clinical notes, the approach relies on only a few trainable parameters, which may limit its modeling capacity. 3) Existing works ignore irregularity in multimodal fusion. Deznabi et al. (2021) and Yang et al. (2021) demonstrate the effectiveness of combining time series and clinical notes for medical prediction tasks; however, these works operate on multimodal data without considering irregularity. Their fusion strategies may not fully integrate irregular time information into the multimodal representations, which can be essential for prediction performance in real-world scenarios.
Our Contributions. To tackle the aforementioned issues, we separately model irregularity in MISTS and in irregularly taken clinical notes, and we further integrate the two modalities across temporal steps, so as to provide strong medical predictions based on the complicated irregular temporal patterns and multimodal structure of EHRs. Specifically, we first show that different TDE methods for handling MISTS are complementary for medical prediction by introducing a gating mechanism that combines different TDE embeddings specific to each patient. Second, we cast note representations together with their note-taking times as MISTS and leverage a time attention mechanism (Shukla & Marlin, 2021) to model the irregularity in each dimension of the note representations. Finally, we incorporate irregularity into the multimodal representations by adopting a fusion method that interleaves self-attention and cross-attention (Vaswani et al., 2017) to integrate multimodal knowledge across temporal steps. To the best of our knowledge, this is the first unified system that fully considers irregularity to improve medical predictions, not only within every single modality but also in multimodal fusion. Our approach outperforms baselines in both the single-modality and multimodal fusion settings, with notable relative F1 improvements of 6.5%, 3.6%, and 4.3% for MISTS, clinical notes, and multimodal fusion, respectively. A comprehensive ablation study demonstrates that tackling irregularity in each single modality benefits not only that modality but also multimodal fusion. We also show that modeling long sequences of clinical notes further improves medical prediction performance.
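To make the fusion step concrete, the following PyTorch-style sketch shows one possible fusion layer that interleaves self-attention and cross-attention over time-aligned representations of the two modalities. It is a minimal illustration under assumed names and shapes (FusionLayer, d_model, a shared set of temporal steps), not our exact implementation.

# Minimal sketch (assumed names, not the paper's code): one fusion layer that
# interleaves self-attention and cross-attention between time-series and
# clinical-note representations aligned to a common set of temporal steps.
import torch
import torch.nn as nn

class FusionLayer(nn.Module):
    def __init__(self, d_model: int = 128, n_heads: int = 4):
        super().__init__()
        self.self_attn_ts = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.self_attn_txt = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.cross_ts_from_txt = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.cross_txt_from_ts = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm_ts = nn.LayerNorm(d_model)
        self.norm_txt = nn.LayerNorm(d_model)

    def forward(self, ts: torch.Tensor, txt: torch.Tensor):
        # ts, txt: (batch, n_steps, d_model) representations of each modality.
        # 1) Self-attention within each modality across temporal steps.
        ts = self.norm_ts(ts + self.self_attn_ts(ts, ts, ts)[0])
        txt = self.norm_txt(txt + self.self_attn_txt(txt, txt, txt)[0])
        # 2) Cross-attention: each modality queries the other modality.
        ts = ts + self.cross_ts_from_txt(ts, txt, txt)[0]
        txt = txt + self.cross_txt_from_ts(txt, ts, ts)[0]
        return ts, txt

if __name__ == "__main__":
    layer = FusionLayer()
    ts = torch.randn(2, 24, 128)   # e.g., 24 reference time steps per patient
    txt = torch.randn(2, 24, 128)
    fused_ts, fused_txt = layer(ts, txt)
    print(fused_ts.shape, fused_txt.shape)

In this sketch, each modality first attends over its own temporal steps and then queries the other modality, so irregular temporal context carried by one modality can inform the other at every step.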
2. Related Work
Multivariate irregularly sampled time series (MISTS). MISTS refer to observations of each variable that are acquired at irregular time intervals and that can have misaligned observation times across different variables (Zerveas et al., 2021). GRU-D (Che et al., 2018) captures temporal dependencies by decaying the hidden states of gated recurrent units. SeFT (Horn et al., 2020) represents MISTS as a set of observations and processes them with differentiable set function learning. ODE-RNN (Rubanova et al., 2019) uses latent neural ordinary differential equations (Chen et al., 2018) to specify hidden-state dynamics and updates the RNN hidden state at each new observation. RAINDROP (Zhang et al., 2021b) models MISTS as separate sensor graphs and leverages graph neural networks to learn the dependencies among variables. These approaches model irregular temporal dependencies in MISTS from different perspectives through specialized designs. TDE methods are another family of approaches for handling MISTS: they convert MISTS into fixed-dimensional feature spaces and feed the resulting regular time representations into deep neural models designed for regular time series. Imputation methods (Lipton et al., 2016; Harutyunyan et al., 2019; McDermott et al., 2021) are straightforward TDE methods that discretize MISTS into regular time series with hand-crafted missing-value imputation, but they ignore the irregularity of the raw data. To fill this gap, Shukla & Marlin (2019) present interpolation-prediction networks (IP-Nets), which interpolate MISTS at a set of regular reference points via a kernel function with learned parameters. Shukla & Marlin (2021) further present a time attention mechanism with time embeddings to learn interpolation representations. However, learned interpolation strategies do not always outperform simple imputation methods, possibly because of complicated data sampling patterns (Horn et al., 2020). Inspired by Mixture-of-Experts (MoE) (Shazeer et al., 2017; Jacobs et al., 1991), which maintains a set of experts (neural networks) and combines them per input via a gating mechanism, we leverage different TDE methods as submodules and integrate hand-crafted imputation embeddings into learned interpolation embeddings to improve medical predictions.
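As a rough illustration of this gating idea, the sketch below mixes a hand-crafted imputation embedding with a learned interpolation embedding of the same MISTS input through a per-patient, per-step gate; the name TDEGate and the simple two-layer gate are hypothetical stand-ins, not our exact architecture.

# Minimal sketch (hypothetical names, not the paper's implementation): an
# MoE-style gate that mixes two TDE embeddings of the same MISTS input.
import torch
import torch.nn as nn

class TDEGate(nn.Module):
    def __init__(self, d_model: int = 128):
        super().__init__()
        # The gate sees both TDE embeddings and outputs a mixing weight in (0, 1).
        self.gate = nn.Sequential(nn.Linear(2 * d_model, d_model),
                                  nn.ReLU(),
                                  nn.Linear(d_model, 1),
                                  nn.Sigmoid())

    def forward(self, imp_emb: torch.Tensor, interp_emb: torch.Tensor):
        # imp_emb, interp_emb: (batch, n_steps, d_model) regular time
        # representations produced by two different TDE submodules.
        g = self.gate(torch.cat([imp_emb, interp_emb], dim=-1))  # (batch, n_steps, 1)
        return g * imp_emb + (1.0 - g) * interp_emb

if __name__ == "__main__":
    gate = TDEGate()
    imputed = torch.randn(2, 24, 128)       # e.g., from hand-crafted imputation + encoder
    interpolated = torch.randn(2, 24, 128)  # e.g., from learned time-attention interpolation
    print(gate(imputed, interpolated).shape)

A larger gate value keeps more of the imputation embedding at a given step, so the mixture can adapt to each patient's sampling pattern.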
Irregular clinical notes modeling. Golmaei & Luo (2021) and Mahbub et al. (2022) concatenate each patient’s clinical