ACM Reference Format:
Caitlin F. Harrigan, Gabriela Morgenshtern, Anna Goldenberg, and Fanny Chevalier. 2021. Considerations for Visualizing Uncertainty
in Clinical Machine Learning Models. In Proceedings of CHI ’21 Workshop: Realizing AI in Healthcare: Challenges Appearing in the Wild
(CHI ’21). ACM, New York, NY, USA, 9 pages. https://doi.org/10.1145/nnnnnnn
1 INTRODUCTION
Supporting clinical care by integrating predictive machine learning (ML) into clinician workflows has the potential to improve the standard of care for patients. A predictive model is one that uses statistical approaches to generate predictions of unseen outcomes [11]. Regardless of how robust they are, these models carry uncertainty, which hinders adoption due to lack of trust [2]. In this work, we investigate, through a qualitative study, which design considerations are perceived to most impact trust and clinical actionability when communicating predictive uncertainty.
Clinicians in the critical care unit are adept at establishing a holistic picture of patient state by mentally integrating bedside data with information derived from physical exams, patient histories, and lab results. Critical care is a distinctive setting in that clinicians consume raw features alongside model output: a model’s prediction is just one more data point whose uncertainty the clinician must account for.
ML models have two main types of uncertainty: noise in the data, and systematic uncertainty in the model. A deployed model must additionally deal with missing data, which may be missing at random or, much more likely, missing because of some clinical complication. Accounting for uncertainty in measures and predictions is a key part of clinical reasoning on the part of the healthcare team [8].
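To make this distinction concrete, the following minimal sketch shows one common way the two sources can be separated, assuming an ensemble of probabilistic models; the ensemble outputs, sizes, and missingness flag are synthetic stand-ins, not outputs of the cardiac arrest model studied here.

```python
import numpy as np

# Illustrative only: stand-in outputs for an ensemble of probabilistic models,
# each predicting a mean and a variance for an outcome (e.g., risk of arrest)
# at each time point. In a real system these would come from trained models.
rng = np.random.default_rng(0)
n_members, n_timepoints = 5, 24
ensemble_means = 0.3 + 0.05 * rng.standard_normal((n_members, n_timepoints))
ensemble_vars = 0.02 + 0.005 * rng.random((n_members, n_timepoints))

# Noise in the data: average predicted variance across ensemble members.
data_noise = ensemble_vars.mean(axis=0)

# Systematic model uncertainty: disagreement between ensemble members.
model_uncertainty = ensemble_means.var(axis=0)

# Total predictive variance is the sum of the two terms; the point prediction
# is the ensemble mean.
total_variance = data_noise + model_uncertainty
prediction = ensemble_means.mean(axis=0)

# Missing inputs can be flagged explicitly so a display can distinguish
# "uncertain because models disagree" from "uncertain because data are absent".
inputs_missing = rng.random(n_timepoints) < 0.2  # hypothetical missingness mask
```

Keeping the two terms separate, rather than reporting only a total, is what allows a visualization to communicate why a prediction is uncertain at a given moment.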
While there exists literature on visualizing uncertainty [5], how such approaches, or what characteristics of uncertainty, may affect trust and actionability in clinical practice is poorly understood. This work aims to fill that gap. We conducted interviews with 5 clinicians to understand: 1) how clinicians’ perception of uncertainty impacts trust and actionability; 2) what barriers exist in making ML predictions amenable to clinical inference; and 3) how these insights can inform visualization design. We take a model of cardiac arrest as a case study, but aspects of our findings may be generalizable to visualizations in other patient care environments.
We define a clinically actionable visual as one which has the potential to inform clinical decision making, for example by increasing the frequency of bedside or remote patient monitoring. Trust is the level of perceived credibility attributed to a visualization. In the ML literature, the degree of trustworthiness in a model’s results is strongly related to its interpretability [4]. Our clinician interviews suggest that trust and visualization actionability are most positively impacted when design prioritizes transparent communication around missing data and the overall prediction trend.
2 BACKGROUND
2.1 Related Work
Our work is similar to that of Jeffery et al. [6], who employ participatory design strategies to explore nurses’ preferences for the display of a predictive model of cardiac arrest. Their findings on desired visualization elements closely align with our own, and include a temporal trendline of predicted cardiopulmonary arrest probability with an overlapping view of relevant lab values, vital signs, treatments, interventions, and a patient baseline. However, Jeffery et al. do not investigate the implications of displaying uncertainty alongside predicted values.
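To ground these elements, the sketch below mocks up such a display with synthetic numbers: a risk trendline with an uncertainty band added to suggest how uncertainty could be shown alongside predictions, plus an overlapping vital sign. It is an illustrative rendering under our own assumptions (baseline, risk values, and heart rate are invented), not a reproduction of Jeffery et al.’s design.

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic stand-ins for model output and one bedside measurement.
rng = np.random.default_rng(1)
hours = np.arange(24)
risk = np.clip(0.2 + 0.02 * hours + 0.05 * rng.standard_normal(24), 0, 1)
risk_sd = np.full(24, 0.08)               # assumed predictive standard deviation
heart_rate = 80 + 10 * np.sin(hours / 4)  # assumed vital sign trace

fig, (ax_risk, ax_vitals) = plt.subplots(2, 1, sharex=True, figsize=(8, 5))

# Temporal trendline of predicted arrest probability with an uncertainty band.
ax_risk.plot(hours, risk, label="predicted arrest probability")
ax_risk.fill_between(hours, risk - risk_sd, risk + risk_sd, alpha=0.3,
                     label="uncertainty band")
ax_risk.axhline(0.2, linestyle="--", color="gray", label="patient baseline")
ax_risk.set_ylabel("probability")
ax_risk.legend(loc="upper left")

# Overlapping view of a relevant vital sign (heart rate as one example).
ax_vitals.plot(hours, heart_rate, color="tab:red", label="heart rate")
ax_vitals.set_xlabel("hours since admission")
ax_vitals.set_ylabel("beats per minute")
ax_vitals.legend(loc="upper left")

plt.tight_layout()
plt.show()
```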
Hullman’s [5] review of uncertainty visualization user studies reveals that most evaluations concerning interpretation of uncertainty are biased in their instrumentation towards evaluating accuracy rather than decision quality. We therefore follow the recommendation that evaluators focus on collecting participant feedback on how judgements are made, and what information participants found helpful in making them [5].