
1.1 Related work
There are (at least) two scenarios where a model has a high predictive uncertainty: when it encounters an unknown sample (out-of-distribution, OOD) or a known (in-distribution, ID) but ambiguous
sample. In either case, the model accuracy is likely low and samples should be deferred to a human
expert to avoid erroneous predictions. By setting an upper threshold on the uncertainty of samples,
uncertainty estimation methods can be used for selective prediction.
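To make the selective-prediction setup concrete, a minimal NumPy sketch is given below; the function and variable names are ours for illustration and do not come from any of the cited works.

```python
import numpy as np

def selective_prediction(predictions, uncertainties, threshold):
    """Accept predictions whose uncertainty lies below the threshold and
    return the indices of deferred samples (e.g. sent to a human expert).

    The threshold is a free parameter; how (and on which data) it is chosen
    is exactly the issue discussed in this section.
    """
    accept = uncertainties < threshold      # boolean mask of accepted samples
    deferred = np.flatnonzero(~accept)      # indices handed to the expert
    return predictions[accept], deferred
```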
Uncertainty estimation methods The maximum softmax probability (MSP) [Hendrycks and Gimpel, 2016] is a common baseline estimate of predictive uncertainty, but is in general not well calibrated [Guo et al., 2017]. Bayesian neural networks [MacKay, 1992, Buntine and Weigend, 1991, MacKay, 1995] provide a principled approach to quantifying model uncertainty, but require dedicated architectures, are difficult and expensive to train, hard to scale to large models and datasets, and their uncertainty estimates may not be robust to dataset shift [Ovadia et al., 2019, Gustafsson et al., 2020]. There are various approximations that reduce the computational complexity, such as low-rank approximations [Dusenberry et al., 2020] and Markov chain Monte Carlo methods [Welling and Teh, 2011], or that can be transferred to standard network architectures, such as Laplace approximations [MacKay, 1992, Daxberger et al., 2021].
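As an illustration of the MSP baseline, a NumPy sketch with our own naming (not code from Hendrycks and Gimpel [2016]) is:

```python
import numpy as np

def msp_uncertainty(logits):
    """MSP baseline: uncertainty = 1 - max_c p(y = c | x).

    logits: array of shape (N, C) with unnormalized class scores.
    """
    z = logits - logits.max(axis=-1, keepdims=True)            # numerically stable softmax
    probs = np.exp(z) / np.exp(z).sum(axis=-1, keepdims=True)
    return 1.0 - probs.max(axis=-1)                            # shape (N,)
```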
Monte-Carlo (MC) dropout [Gal and Ghahramani, 2016] is a widely used method, as it is easily implemented in architectures with dropout layers [Hinton et al., 2012]. Currently, state-of-the-art uncertainty estimates are obtained from the entropy of the predictions of a deep ensemble [Lakshminarayanan et al., 2017], or an efficient approximation thereof [Wen et al., 2020]. Similarly competitive are a number of recently proposed methods that require only a single forward pass and are "distance-aware" [Tagasovska and Lopez-Paz, 2019, Liu et al., 2020, Mukhoti et al., 2021,
Van Amersfoort et al., 2020, Jain et al., 2021]. They use feature space distances between training
and test samples to quantify uncertainty. This allows them to accurately estimate uncertainty far
away from the decision boundary.
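For concreteness, the ensemble- and MC Dropout-based uncertainty referred to above is typically the entropy of the averaged member predictions; the following is a minimal NumPy sketch with our own naming, assuming the member softmax outputs are already computed.

```python
import numpy as np

def predictive_entropy(member_probs):
    """Entropy of the mean prediction over M ensemble members
    (or M stochastic MC Dropout forward passes).

    member_probs: array of shape (M, N, C) with softmax outputs for
    N samples and C classes.
    """
    mean_probs = member_probs.mean(axis=0)                            # (N, C)
    # Small constant avoids log(0) for classes with zero mass.
    return -(mean_probs * np.log(mean_probs + 1e-12)).sum(axis=-1)    # (N,)
```

A similarly simplified sketch of the distance-aware idea uses the mean distance to the k nearest training features as the uncertainty score; the cited methods differ in how this distance is modelled, so this is only illustrative.

```python
import numpy as np

def knn_feature_distance(train_feats, test_feats, k=5):
    """Uncertainty proxy: mean Euclidean distance from each test feature
    to its k nearest training features (k is a hypothetical choice).

    train_feats: (N_train, D), test_feats: (N_test, D).
    """
    d = np.linalg.norm(test_feats[:, None, :] - train_feats[None, :, :], axis=-1)
    return np.sort(d, axis=1)[:, :k].mean(axis=1)                     # (N_test,)
```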
Uncertainty estimation and OOD detection in histopathology Several studies in histopathology use deep ensembles [Lakshminarayanan et al., 2017, Pocevičiūtė et al., 2021, Thagaard et al., 2020], while Senousy et al. [2021b] use MSP to select models for an ensemble. Linmans et al. [2020] use an ensembling approach on multiple prediction heads for open set recognition (OSR). Rączkowski et al. [2019] show that MC Dropout-based uncertainty is high for ambiguous or mislabelled patches, but did not test on OOD data; Syrykh et al. [2020] use MC Dropout for both OSR and OOD detection but do not report OOD detection metrics. Note that MC Dropout can be problematic in network architectures commonly used in computational histopathology², and has been shown to negatively affect task performance [Linmans et al., 2020].
Unfortunately, the OOD detection reported in Syrykh et al. [2020], Pocevičiūtė et al. [2021], Senousy et al. [2021a] is of limited insight, as uncertainty thresholds for selective prediction were set on the same OOD data on which performance was evaluated. Dolezal et al. [2022] avoid such data leakage by setting the uncertainty threshold on validation data using cross-validation. However, it is unclear whether a threshold chosen to distinguish between correct and incorrect ID samples is suitable to separate ID and OOD data. For example, on a dataset for which there is no correct diagnosis, more than 20% of slides are still rated as high-confidence by the uncertainty-aware classifier of Dolezal et al. [2022].
1.2 Contributions
The uncertainty estimation methods used in previous work either do not achieve state-of-the-art performance on standard ML datasets (MC Dropout) or require substantial additional compute (deep ensembles). They also estimate uncertainty around the decision boundary, i.e. they are most suitable for detecting ambiguous samples, but may give high-confidence estimates for OOD samples far away from the decision boundary. Recently proposed "distance-aware" uncertainty estimation
²Li et al. [2019] demonstrated that applying MC Dropout with dropout rates ≥0.1 in networks that use Layer Normalization [Ba et al., 2016] – as is the case in the related work cited above – can be problematic: the combination causes unstable numerical behavior during inference on a range of architectures (DenseNet, ResNet, ResNeXt [Xie et al., 2017] and Wide ResNet [Zagoruyko and Komodakis, 2016]), and requires additional implementation strategies.