Improving Medical Predictions by Irregular Multimodal Electronic Health Records Modeling

Xinlu Zhang*¹, Shiyang Li*¹, Zhiyu Chen¹, Xifeng Yan¹, Linda Petzold¹
Abstract
Health conditions among patients in intensive care
units (ICUs) are monitored via electronic health
records (EHRs), composed of numerical time se-
ries and lengthy clinical note sequences, both
taken at irregular time intervals. Dealing with
such irregularity in every modality, and integrat-
ing irregularity into multimodal representations
to improve medical predictions, is a challenging
problem. Our method first addresses irregularity
in each single modality by (1) modeling irregular
time series by dynamically incorporating hand-
crafted imputation embeddings into learned inter-
polation embeddings via a gating mechanism, and
(2) casting a series of clinical note representations
as multivariate irregular time series and tackling
irregularity via a time attention mechanism. We
further integrate irregularity in multimodal fusion
with an interleaved attention mechanism across
temporal steps. To the best of our knowledge, this
is the first work to thoroughly model irregularity
across modalities for improving medical
predictions. On two medical prediction tasks, our
proposed methods consistently outperform
state-of-the-art (SOTA) baselines in each single modality
and in multimodal fusion scenarios. Specifically,
we observe relative improvements of 6.5%, 3.6%,
and 4.3% in F1 for time series, clinical notes, and
multimodal fusion, respectively. These results
demonstrate the effectiveness of our methods and
the importance of considering irregularity in
multimodal EHRs. [1]
*Equal contribution. ¹University of California, Santa Barbara.
Correspondence to: Xinlu Zhang <xinluzhang@ucsb.edu>.
Proceedings of the 40th International Conference on Machine
Learning, Honolulu, Hawaii, USA. PMLR 202, 2023. Copyright
2023 by the author(s).
[1] Our code is released at https://github.com/XZhang97666/MultimodalMIMIC
[Figure 1 image: timeline of one ICU stay. MISTS rows: Heart Rate, Temperature, Glucose; plus a Clinical notes row. Sample entries: 0:59 AM: 86; 3:59 AM: 36.6; 5:48 AM: 128.0; 9:32 AM: "… 62 year old man with diffuse rales…"]
Figure 1: An example of a patient’s ICU stay includes
MISTS with three features and a series of clinical notes.
For MISTS, heart rate and temperature are monitored regu-
larly with different frequencies, and glucose is a laboratory
test ordered at irregular time intervals based on doctors’
decisions. Clinical notes are free text, collected with much
sparser irregular time points than clinical measurements.
1. Introduction
ICUs admit patients with life-threatening conditions, e.g.
trauma (Tisherman & Stein, 2018), sepsis (Alberti et al.,
2002), and organ failure (Afessa et al., 2007). Care in the
first few hours after admission is critical to patient outcomes.
This period is also more prone to medical decision errors
than later times (Otero-López et al., 2006). Automated
tools that make effective, real-time predictions can be highly
beneficial in assisting clinicians in providing appropriate
treatments. Recently, the health conditions of patients in
ICUs have been recorded in EHRs (Adler-Milstein et al.,
2015), bringing the possibility of applying deep neural
networks to healthcare (Xiao et al., 2018; Shickel et al., 2017),
e.g. mortality prediction (Zhang et al., 2021a) and
phenotype classification (Harutyunyan et al., 2019). EHRs contain
multivariate irregularly sampled time series (MISTS) and
irregular clinical note sequences, as shown in Figure 1. The
multimodal structure and complex irregular temporal nature
of the data present challenges for prediction. This leads us
to formulate two research objectives:
1. Tackling irregularity in both time series and clinical notes
2. Integrating irregularity into multimodal representation learning
arXiv:2210.12156v2 [cs.LG] 5 Jun 2023
To the best of our knowledge, none of the existing works has
fully considered irregularity in multimodal representation
learning.
We observe three major drawbacks of existing work on irregular
multimodal EHR modeling. 1) MISTS models
perform inconsistently. While numerous MISTS models have
been proposed to tackle irregularity (Lipton et al., 2016;
Shukla & Marlin, 2019; 2021; Zhang et al., 2021b; Horn
et al., 2020; Rubanova et al., 2019), none of the approaches
consistently outperforms the others. Even among temporal
discretization-based embedding (TDE) methods, including
hand-crafted imputation (Lipton et al., 2016) and learned
interpolation (Shukla & Marlin, 2019; 2021), which
transform MISTS into regular time representations to interface
with deep neural networks for regular time series, there is
no clearly superior approach. 2) Irregularity in clinical notes
is not well tackled. Most existing works (Golmaei & Luo,
2021; Mahbub et al., 2022) directly concatenate all clinical
notes of each patient but ignore the note-taking time
information. Although Zhang et al. (2020) propose an LSTM
variant to model time decay among clinical notes, this
approach utilizes only a few trainable parameters, which could
make it less powerful. 3) Existing works ignore irregularity in
multimodal fusion. Deznabi et al. (2021) and Yang et al. (2021)
have demonstrated the effectiveness of combining time
series and clinical notes for medical prediction tasks; however,
these works operate on multimodal data without
considering irregularity. Their fusion strategies may not be
able to fully integrate irregular time information into
multimodal representations, which can be essential for prediction
performance in real-world scenarios.
Our Contributions. To tackle the aforementioned issues,
we separately model irregularity in MISTS and irregular
clinical notes, and further integrate multimodalities across
temporal steps, so as to provide powerful medical predic-
tions based on the complicated irregular time pattern and
multimodal structure of EHRs. Specifically, we first show
that different TDE methods of tackling MISTS are comple-
mentary for medical predictions, by introducing a gating
mechanism that incorporates different TDE embeddings
specific to each patient. Secondly, we cast note representa-
tions and note-taking time as MISTS, and leverage a time
attention mechanism (Shukla & Marlin, 2021) to model
the irregularity in each dimension of note representations.
Finally, we incorporate irregularity into multimodal rep-
resentations by adopting a fusion method that interleaves
self-attentions and cross-attentions (Vaswani et al., 2017) to
integrate multimodal knowledge across temporal steps. To
the best of our knowledge, this is the first work for a unified
system that fully considers irregularity to improve medical
predictions, not only in every single modality but also in
multimodal fusion scenarios. Our approach demonstrates
superior performance compared to baselines in both single
modality and multimodal fusion scenarios, with notable rel-
ative improvements of 6.5%, 3.6%, and 4.3% in terms of
F1 for MISTS, clinical notes, and multimodal fusion, re-
spectively. Our comprehensive ablation study demonstrates
that tackling irregularity in every single modality benefits
not only their own modality but also multimodal fusion.
We also show that modeling long sequential clinical notes
further improves medical prediction performance.
2. Related Work
Multivariate irregularly sampled time series (MISTS).
MISTS refer to observations of each variable that are
acquired at irregular time intervals and can have misaligned
observation times across different variables (Zerveas et al.,
2021). GRU-D (Che et al., 2018) captures temporal
dependencies by decaying the hidden states in gated recurrent
units. SeFT (Horn et al., 2020) represents MISTS as
a set of observations based on differentiable set function
learning. ODE-RNN (Rubanova et al., 2019) uses latent
neural ordinary differential equations (Chen et al., 2018)
to specify hidden state dynamics and updates RNN hidden
states with each new observation. RAINDROP (Zhang et al.,
2021b) models MISTS as separate sensor graphs and
leverages graph neural networks to learn the dependencies among
variables. These approaches model irregular temporal
dependencies in MISTS from different perspectives through
specialized design. TDE methods are a subset of methods
for handling MISTS: they convert MISTS to fixed-dimensional
feature spaces and feed the resulting regular time representations
into deep neural models for regular time series.
Imputation methods (Lipton et al., 2016; Harutyunyan et al., 2019;
McDermott et al., 2021) are straightforward TDE methods
that discretize MISTS into regular time series with manual
missing-value imputation, but these ignore the irregularity
in the raw data. To fill this gap, Shukla & Marlin (2019)
present interpolation-prediction networks (IP-Nets) to
interpolate MISTS at a set of regular reference points via a kernel
function with learned parameters. Shukla & Marlin (2021)
further present a time attention mechanism with time
embeddings to learn interpolation representations. However,
learned interpolation strategies do not always outperform
simple imputation methods. This may be due to
complicated data sampling patterns (Horn et al., 2020). Inspired
by Mixture-of-Experts (MoE) (Shazeer et al., 2017; Jacobs
et al., 1991), which maintains a set of experts (neural
networks) and seeks a combination of the experts specific to
each input via a gating mechanism, we leverage different
TDE methods as submodules and integrate hand-crafted
imputation embeddings into learned interpolation embeddings
to improve medical predictions.
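
The gating idea can be sketched as follows. This is a minimal pure-Python illustration, not the released implementation: the function name `gated_fusion`, the single linear gate layer (`gate_w`, `gate_b`), and the convex-combination form g * imputation + (1 - g) * interpolation are assumptions made for illustration. The text above only states that a gate combines the two kinds of embeddings in an input-specific way.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def gated_fusion(e_imp, e_int, gate_w, gate_b):
    """Fuse two TDE embeddings of one patient with an input-dependent gate.

    e_imp, e_int: lists of T time steps, each a list of d floats
        (imputation and learned-interpolation embeddings).
    gate_w: d rows of 2*d weights; gate_b: d biases.
        Both belong to a hypothetical one-layer gate network.
    """
    fused = []
    for imp_t, int_t in zip(e_imp, e_int):
        concat = imp_t + int_t  # the gate sees both embeddings
        gate = [sigmoid(sum(w * x for w, x in zip(row, concat)) + b)
                for row, b in zip(gate_w, gate_b)]
        # Element-wise convex combination of the two embeddings.
        fused.append([g * a + (1.0 - g) * c
                      for g, a, c in zip(gate, imp_t, int_t)])
    return fused


# Toy example: T = 2 steps, d = 2 dimensions, zero-initialized gate (g = 0.5).
e_imp = [[1.0, 2.0], [3.0, 4.0]]
e_int = [[3.0, 0.0], [1.0, 8.0]]
zero_w = [[0.0] * 4 for _ in range(2)]
fused = gated_fusion(e_imp, e_int, zero_w, [0.0, 0.0])
```

With a zero-initialized gate the result is the plain average of the two embeddings; a trained gate could instead lean on imputation for some patients and on interpolation for others.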
Irregular clinical notes modeling. Golmaei & Luo (2021) and
Mahbub et al. (2022) concatenate each patient’s clinical
[Figure 2 diagram labels: MISTS → Imputation and mTAND_ts → Gate (UTDE); Irregular Clinical Notes → TextEncoder → mTAND_txt; Multimodal Fusion (MH_ts, MH_txt, CMH_ts, CMH_txt; ×J) → Classifier]
Figure 2: The model architecture, which encodes MISTS and clinical notes separately, and then performs a multimodal
fusion. UTDE is a gating mechanism to obtain MISTS representations by dynamically fusing embeddings of imputation
and a time attention module, mTAND_ts. Irregular clinical notes are encoded by a pretrained language model, TextEncoder,
whose outputs are fed into mTAND_txt to obtain text interpolation representations. The multimodal fusion strategy contains
J identical layers. Each layer interleaves self-attentions (MH) and cross-attentions (CMH) to integrate representations from
multimodalities and incorporate irregularity into multimodal representations. A classifier with fully connected layers is used
to predict patient outcomes.
notes, divide them into blocks, and then obtain text
representations by feeding a series of note blocks into BERT
(Devlin et al., 2018) variants (Huang et al., 2019; Gu et al.,
2021), ignoring the irregularity in clinical notes. Zhang
et al. (2020) further propose a time-aware LSTM with a
trainable decay function to model irregular time information
among clinical notes. However, this approach can be less
powerful due to its limited parameters. To fully model
irregularity, we cast clinical note representations with irregular
note-taking times as MISTS, such that each dimension of a
series of clinical note representations is an irregular time
series, and apply a time attention mechanism (Shukla &
Marlin, 2021) to further model the irregularity.
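
As a rough illustration of treating note representations as an irregular time series, the sketch below interpolates a few note embeddings onto hourly reference points with attention over time stamps. It is a simplified stand-in, not mTAND itself: the fixed sinusoidal `time_embedding`, the absence of learned projections and multiple heads, and all names and toy values are assumptions made for this example. Because all dimensions of a note embedding share the same note-taking time, one set of attention weights is applied to every dimension.

```python
import math


def time_embedding(t, dim=8):
    """Fixed sinusoidal embedding of a time stamp in hours.

    Assumption: a learned time embedding is replaced by a fixed one
    to keep the sketch short.
    """
    emb = []
    for k in range(dim // 2):
        freq = 1.0 / (10.0 ** (2 * k / dim))
        emb.extend([math.sin(freq * t), math.cos(freq * t)])
    return emb


def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def time_attention(ref_times, note_times, note_embs, dim=8):
    """Interpolate irregular note embeddings onto regular reference times."""
    keys = [time_embedding(t, dim) for t in note_times]
    n_feat = len(note_embs[0])
    out = []
    for r in ref_times:
        query = time_embedding(r, dim)
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
                  for key in keys]
        weights = softmax(scores)  # one weight per note
        out.append([sum(w * emb[j] for w, emb in zip(weights, note_embs))
                    for j in range(n_feat)])
    return out


# Three notes taken at irregular times (hours since admission), 2-d embeddings.
note_times = [0.5, 9.5, 30.0]
note_embs = [[0.1, 1.0], [0.4, 0.0], [0.9, -1.0]]
ref_times = list(range(48))
interp = time_attention(ref_times, note_times, note_embs)
```

Each row of `interp` is a weighted average of the observed note embeddings, so the output has one representation per reference hour regardless of how many notes were written or when.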
Multimodal fusion. Combining both time series and
clinical notes outperforms the results obtained when only one
of them is used (Liu et al., 2021). Khadanga et al. (2019),
Deznabi et al. (2021), and Yang et al. (2021) directly concatenate
representations from different modalities for downstream
predictions. Yang & Wu (2021) utilize an attention gate
to fuse multimodal information. Xu et al. (2021) select
multimodal fusion strategies from addition, concatenation,
and multiplication by a neural architecture search method.
However, these fusion methods are only performed on EHRs
without considering irregularity, failing to fully incorporate
time information into multimodal representations, which
is critical in real-world scenarios. To fill this gap, we first
tackle irregularity in time series and clinical notes
separately, and then leverage a fusion module that interleaves
self-attentions and cross-attentions (Vaswani et al., 2017) to
obtain multimodal interaction integrated with irregularity
across temporal steps.
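
The interleaving of self-attention and cross-attention can be sketched as below. This is an illustrative simplification under several assumptions: single-head attention without learned projections, plain residual connections without layer normalization, self-attention applied before cross-attention within a layer, and both modalities already mapped to the same number of temporal steps. The names `attend`, `fusion_layer`, and `fuse` are hypothetical.

```python
import math


def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(queries, keys, values):
    """Scaled dot-product attention over temporal steps (no learned weights)."""
    scale = math.sqrt(len(queries[0]))
    out = []
    for q in queries:
        w = softmax([sum(a * b for a, b in zip(q, k)) / scale for k in keys])
        out.append([sum(wi * v[j] for wi, v in zip(w, values))
                    for j in range(len(values[0]))])
    return out


def residual(x, update):
    return [[a + b for a, b in zip(rx, ru)] for rx, ru in zip(x, update)]


def fusion_layer(z_ts, z_txt):
    # Self-attention: each modality attends over its own temporal steps.
    z_ts = residual(z_ts, attend(z_ts, z_ts, z_ts))
    z_txt = residual(z_txt, attend(z_txt, z_txt, z_txt))
    # Cross-attention: each modality queries the other modality.
    new_ts = residual(z_ts, attend(z_ts, z_txt, z_txt))
    new_txt = residual(z_txt, attend(z_txt, z_ts, z_ts))
    return new_ts, new_txt


def fuse(z_ts, z_txt, num_layers=2):
    """Stack `num_layers` identical fusion layers."""
    for _ in range(num_layers):
        z_ts, z_txt = fusion_layer(z_ts, z_txt)
    return z_ts, z_txt


# Toy inputs: 3 temporal steps, 2-d representations per modality.
z_ts = [[0.1, 0.2], [0.3, 0.1], [0.0, 0.5]]
z_txt = [[0.4, 0.0], [0.2, 0.2], [0.1, 0.3]]
out_ts, out_txt = fuse(z_ts, z_txt, num_layers=2)
```

Because attention runs over temporal steps in both directions, each time-series step can draw on note information from any step and vice versa, which is the property that plain concatenation of pooled representations lacks.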
3. Method
Our method models irregularity in three parts: MISTS,
clinical notes, and multimodal fusion, as shown in Figure 2.
In this section, we describe each part in detail.
3.1. Problem setup
Denote $\mathcal{D} = \{(\mathbf{x}^{ts}_i, \mathbf{t}^{ts}_i), (\mathbf{x}^{txt}_i, \mathbf{t}^{txt}_i), y_i\}_{i=1}^{N}$ to be an EHR
dataset with $N$ patients, where $(\mathbf{x}^{ts}_i, \mathbf{t}^{ts}_i)$ is a $d_m$-dimensional
MISTS, with $\mathbf{x}^{ts}_i$ being observations and $\mathbf{t}^{ts}_i$ being the corresponding
time points, $(\mathbf{x}^{txt}_i, \mathbf{t}^{txt}_i)$ is a series of clinical notes with
note-taking times, and $y_i$ is the target outcome, e.g. discharge
or death for mortality prediction. In the following, we
drop the patient index $i$ for simplicity. Each dimension
of the MISTS, $(\mathbf{x}^{ts}_j, \mathbf{t}^{ts}_j)$, where $j = 1, \cdots, d_m$, has $l^{ts}_j$
observations, and each patient’s $(\mathbf{x}^{txt}, \mathbf{t}^{txt})$ includes $l^{txt}$
clinical notes. In early-stage medical prediction, given
$(\mathbf{x}^{ts}, \mathbf{t}^{ts})$ and $(\mathbf{x}^{txt}, \mathbf{t}^{txt})$ before a certain time point $\alpha$
after admission (e.g. 48 hours), we seek to predict $y$ for every
patient.
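
One way to picture a single patient's data under this setup is the sketch below. The `PatientRecord` container, its field names, and the `truncate` helper are hypothetical names introduced only for illustration; times are assumed to be hours since admission.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PatientRecord:
    ts_values: List[List[float]]  # x^ts: one list of observations per feature
    ts_times: List[List[float]]   # t^ts: matching time stamps per feature
    note_texts: List[str]         # x^txt: clinical notes
    note_times: List[float]       # t^txt: note-taking times
    label: int                    # y: target outcome


def truncate(record: PatientRecord, alpha: float = 48.0) -> PatientRecord:
    """Keep only observations and notes recorded before `alpha` hours."""
    ts_values, ts_times = [], []
    for vals, times in zip(record.ts_values, record.ts_times):
        kept = [(v, t) for v, t in zip(vals, times) if t < alpha]
        ts_values.append([v for v, _ in kept])
        ts_times.append([t for _, t in kept])
    notes = [(n, t) for n, t in zip(record.note_texts, record.note_times)
             if t < alpha]
    return PatientRecord(ts_values, ts_times,
                         [n for n, _ in notes], [t for _, t in notes],
                         record.label)


# Two features with different numbers of observations, and two notes.
record = PatientRecord(
    ts_values=[[86.0, 90.0, 88.0], [128.0]],
    ts_times=[[1.0, 2.0, 50.0], [5.8]],
    note_texts=["admission note", "day-3 progress note"],
    note_times=[9.5, 60.0],
    label=0,
)
early = truncate(record, alpha=48.0)
```

Each feature keeps its own list of time stamps, which reflects that the number of observations differs per dimension and that observation times are not aligned across features.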
3.2. MISTS
3.2.1. TDE METHODS
We first describe two TDE methods to facilitate the
introduction of our proposed MISTS embedding approach. An
illustration is shown in Figure 3 for better understanding.
Imputation. We first discretize $\mathbf{x}^{ts}$, based on $\mathbf{t}^{ts}$, into
hourly time intervals with a sequence of regular time points,
$\boldsymbol{\alpha} = [0, 1, \cdots, \alpha - 1]$. Then, for each feature, we use the
last observation, if multiple observations are in the same
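
The discretization step described so far can be sketched as follows. This is a minimal illustration for one feature, assuming time stamps are hours since admission; the function name `discretize_hourly` is hypothetical, and hours with no observation are simply left as `None` because how they are filled is not covered by the text above.

```python
def discretize_hourly(times, values, num_hours):
    """Map one feature's irregular observations onto hourly bins.

    If several observations fall into the same hour, the last one wins.
    Hours without any observation stay None (filling them is out of scope here).
    """
    grid = [None] * num_hours
    # Sort by time so that later observations overwrite earlier ones.
    for t, v in sorted(zip(times, values)):
        hour = int(t)
        if 0 <= hour < num_hours:
            grid[hour] = v
    return grid


# Two readings fall into hour 3; the later one (36.6 at t = 3.98) is kept.
grid = discretize_hourly(times=[0.98, 3.98, 3.5],
                         values=[86.0, 36.6, 36.4],
                         num_hours=5)
```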