Uncertainty estimation for out-of-distribution
detection in computational histopathology
Lea Goetz
Artificial Intelligence and Machine Learning, GSK, London
lea.x.goetz@gsk.com
Abstract
In computational histopathology, algorithms now outperform humans on a range of
tasks, but to date none are employed for automated diagnoses in the clinic. Before
algorithms can be involved in such high-stakes decisions they need to "know when
they don’t know", i.e., they need to estimate their predictive uncertainty. This
allows them to defer potentially erroneous predictions to a human pathologist,
thus increasing their safety. Here, we evaluate the predictive performance and
calibration of several uncertainty estimation methods on clinical histopathology
data. We show that a distance-aware uncertainty estimation method outperforms
commonly used approaches, such as Monte Carlo dropout and deep ensembles.
However, we observe a drop in predictive performance and calibration on novel
samples across all uncertainty estimation methods tested. We also investigate the
use of uncertainty thresholding to reject out-of-distribution samples for selective
prediction. We demonstrate the limitations of this approach and suggest areas for
future research.
1 Introduction
Over the last decade, computational histopathology has seen a surge in algorithms achieving equal
or superior performance compared to human pathologists on a diverse set of tasks, such as metastasis
detection [Liu et al., 2019], prediction of molecular markers from tissues [Naik et al., 2020], and
patient survival [Mobadersany et al., 2018]. However, despite these academic successes, none of
these models have to date been used in a decision-making capacity in a clinical setting, and only one
software product is approved for assisting pathologists1. What explains this translation gap? In addition
to model-independent hurdles (e.g., regulatory approval, integration into clinical workflows, etc.
[Steiner et al., 2021]), a key requirement for models in high-stakes applications is robustness to data
shift. However, in histopathology, datasets are generally small (compared to standard ML datasets,
such as ImageNet) and models trained on them are more likely to overfit low-level features [Arpit
et al., 2017], such as texture [Geirhos et al., 2019], which do not generalize to novel datasets. At the
same time, large distribution shifts between training and test datasets are common in histopathology
[Zech et al., 2018, AlBadawy et al., 2018] as a result of tissue preprocessing and image acquisition
[Veta et al., 2016, Komura and Ishikawa, 2018, Tellez et al., 2019].
As building robustness against these shifts into models is difficult, they need to "know when they
don’t know" [Shafaei et al., 2018, Roy et al., 2022], i.e., to estimate the uncertainty of their predic-
tions. We review relevant use cases and how methods for uncertainty estimation have been applied
in computational histopathology to date.
1https://www.fda.gov/news-events/press-announcements/fda-authorizes-software-can-help-identify-prostate-cancer
arXiv:2210.09909v1 [cs.CV] 18 Oct 2022
1.1 Related work
There are (at least) two scenarios where a model has a high predictive uncertainty: when it encoun-
ters an unknown sample (out-of-distribution, OOD) or a known (in-distribution, ID) but ambiguous
sample. In either case, the model accuracy is likely low and samples should be deferred to a human
expert to avoid erroneous predictions. By setting an upper threshold on the uncertainty of samples,
uncertainty estimation methods can be used for selective prediction.
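The rejection scheme described above can be sketched as follows; the entropy-based score, the threshold `tau`, and the toy probabilities are illustrative choices, not values from the paper.

```python
import numpy as np

def predictive_entropy(probs):
    """Entropy of softmax outputs; probs has shape (n_samples, n_classes)."""
    eps = 1e-12                              # avoid log(0)
    return -np.sum(probs * np.log(probs + eps), axis=1)

def selective_predict(probs, tau):
    """Return predicted classes and a mask of samples deferred to a human."""
    h = predictive_entropy(probs)
    defer = h > tau                          # high uncertainty -> defer
    preds = probs.argmax(axis=1)
    return preds, defer

probs = np.array([[0.98, 0.02],              # confident -> keep
                  [0.55, 0.45]])             # ambiguous -> defer
preds, defer = selective_predict(probs, tau=0.5)
```

With this threshold, the confident sample is kept and the ambiguous one is flagged for deferral; in practice `tau` must be set on held-out data, which is precisely the difficulty discussed in the related work below.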
Uncertainty estimation methods The maximum softmax probability (MSP) [Hendrycks and
Gimpel, 2016] is a common baseline estimate of predictive uncertainty, but in general not well cal-
ibrated [Guo et al., 2017]. Bayesian neural networks [Mackay, 1992, Buntine and Weigend, 1991,
Mackay, 1995] provide a principled approach to quantify model uncertainty but require dedicated ar-
chitectures, are difficult and expensive to train, hard to scale to large models and datasets, and their
uncertainty estimates may not be robust to dataset shift [Ovadia et al., 2019, Gustafsson et al.,
2020]. There are various approximations that reduce the computational complexity, such as low
rank approximations [Dusenberry et al., 2020] and Markov chain Monte Carlo methods [Welling
and Teh, 2011], or that can be transferred to standard network architectures, such as Laplace ap-
proximations [MacKay, 1992, Daxberger et al., 2021].
Monte-Carlo (MC) dropout [Gal and Ghahramani, 2016] is a widely used method, as it is easily
implemented in architectures with DropOut layers [Hinton et al., 2012]. Currently, state-of-the-
art uncertainty estimates are obtained by using the entropy in the predictions of a deep ensemble
[Lakshminarayanan et al., 2017], or an efficient approximation thereof [Wen et al., 2020]. Similarly
competitive are a number of recently proposed methods that require only a single forward pass
and are "distance-aware" [Tagasovska and Lopez-Paz, 2019, Liu et al., 2020, Mukhoti et al., 2021,
Van Amersfoort et al., 2020, Jain et al., 2021]. They use feature space distances between training
and test samples to quantify uncertainty. This allows them to accurately estimate uncertainty far
away from the decision boundary.
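As an illustration of the distance-aware idea (a generic sketch, not the exact formulation of any cited method), uncertainty can be scored by the Mahalanobis distance between a test sample's features and the training feature distribution; all function names and the synthetic features here are hypothetical.

```python
import numpy as np

def fit_feature_stats(train_feats):
    """Mean and inverse covariance of training features, shape (n, d)."""
    mu = train_feats.mean(axis=0)
    cov = np.cov(train_feats, rowvar=False)
    cov += 1e-6 * np.eye(cov.shape[0])       # regularize for invertibility
    return mu, np.linalg.inv(cov)

def mahalanobis_uncertainty(test_feats, mu, cov_inv):
    """Distance to the training distribution; larger -> more likely OOD."""
    diff = test_feats - mu
    return np.sqrt(np.einsum('nd,de,ne->n', diff, cov_inv, diff))

rng = np.random.default_rng(0)
train = rng.normal(size=(500, 8))            # stand-in for ID feature vectors
mu, cov_inv = fit_feature_stats(train)
id_score = mahalanobis_uncertainty(rng.normal(size=(1, 8)), mu, cov_inv)
ood_score = mahalanobis_uncertainty(rng.normal(loc=5.0, size=(1, 8)), mu, cov_inv)
```

Unlike a softmax score, this distance keeps growing as a sample moves away from the training data, which is what allows distance-aware methods to flag OOD samples far from the decision boundary.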
Uncertainty estimation and OOD detection in histopathology Several studies in histopathology
use deep ensembles [Lakshminarayanan et al., 2017, Pocevičiūtė et al., 2021, Thagaard et al., 2020],
while Senousy et al. [2021b] use MSP to select models for an ensemble. Linmans et al. [2020] use
an ensembling approach on multiple prediction heads for open set recognition (OSR). Rączkowski
et al. [2019] show that MC Dropout-based uncertainty is high for ambiguous or mislabelled patches,
but did not test on OOD data; Syrykh et al. [2020] use MC Dropout for both OSR and OOD detection
but do not report OOD detection metrics. Note that MC Dropout can be problematic in network
architectures commonly used in computational histopathology 2, and has been shown to negatively
affect task performance [Linmans et al., 2020].
Unfortunately, the OOD detection reported in Syrykh et al. [2020], Pocevičiūtė et al. [2021], and
Senousy et al. [2021a] is of limited insight, as uncertainty thresholds for selective prediction were
set on the same OOD data on which performance was evaluated. Dolezal et al. [2022] avoid such
data leakage by setting the uncertainty threshold on validation data using cross-validation. However,
it is unclear whether a threshold chosen to distinguish between correct and incorrect ID samples is
suitable to separate ID and OOD data. For example, on a dataset for which no correct diagnosis
exists, the uncertainty-aware classifier of Dolezal et al. [2022] still rates more than 20% of slides
as high-confidence.
1.2 Contributions
The uncertainty estimation methods used in previous work have drawbacks: MC Dropout does not
achieve state-of-the-art performance on standard ML datasets, and deep ensembles require substantial
additional compute. Moreover, both estimate uncertainty around the decision boundary, i.e., they are
most suitable for detecting ambiguous samples, but may give high-confidence estimates for OOD
samples far away from the decision boundary. Recently proposed "distance-aware" uncertainty estimation
2Li et al. [2019] demonstrated that applying MC Dropout with dropout rates 0.1 in networks that use
Layer Normalization [Ba et al., 2016] – as is the case in the related work cited above – can be problematic:
the combination causes unstable numerical behavior during inference on a range of architectures (DenseNet,
ResNet, ResNeXt [Xie et al., 2017], and Wide ResNet [Zagoruyko and Komodakis, 2016]), and requires
additional implementation strategies.
methods [Tagasovska and Lopez-Paz, 2019, Liu et al., 2020, Mukhoti et al., 2021, Van Amersfoort
et al., 2020] address these concerns and present an attractive alternative for histopathology. To
the best of our knowledge they have not yet been evaluated on challenging clinical datasets. Be-
cause of its superior performance and relative ease of implementation in combination with existing
architectures, we chose a spectral-normalized Gaussian Process (SNGP) [Liu et al., 2020] as a rep-
resentative distance-aware method and compare its performance to methods currently widely used
in histopathology.
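SNGP combines a spectrally normalized (approximately distance-preserving) feature extractor with a Gaussian-process output layer. The spectral-normalization ingredient alone can be sketched as a power iteration that estimates a weight matrix's largest singular value and rescales the matrix so its spectral norm is at most 1; this is a generic sketch of the technique, not the authors' implementation.

```python
import numpy as np

def spectral_normalize(W, n_iter=50):
    """Rescale W so its spectral norm (largest singular value) is ~<= 1,
    using power iteration as in spectral normalization."""
    v = np.random.default_rng(1).normal(size=W.shape[1])
    for _ in range(n_iter):
        u = W @ v
        u /= np.linalg.norm(u)
        v = W.T @ u
        v /= np.linalg.norm(v)
    sigma = u @ W @ v                        # estimated top singular value
    return W / max(sigma, 1.0)               # only shrink, never inflate

W = np.random.default_rng(0).normal(size=(16, 16))
W_sn = spectral_normalize(W)
```

Bounding each layer's Lipschitz constant this way prevents the feature extractor from collapsing distances, so that feature-space distance remains informative for the GP output layer's uncertainty.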
Our paper makes the following contributions:
• We evaluate the predictive performance and calibration of a baseline (MSP), two commonly
used (MC Dropout and deep ensembles), and one distance-aware uncertainty estimation
method (SNGP) on CIFAR-10, and compare to two datasets of clinical histopathology data.
• We demonstrate the limitations of using uncertainty thresholding for OOD detection in
histopathology; we discuss caveats and areas for further research in applying uncertainty
estimation in histopathology.
2 Methods
2.1 Datasets
First, we evaluate the uncertainty estimation methods on CIFAR-10, to investigate whether their
performance on a standard ML dataset transfers to clinical histopathology data. As CIFAR-10 OOD
data, we designed image corruptions that emulate realistic histopathological distribution shifts. Us-
ing PyTorch's ColorJitter transform, we randomly change the brightness, contrast, saturation, and
hue based on ranges we observed in clinical whole-slide images (WSIs) stained with hematoxylin
and eosin (H&E) dye (brightness=0, contrast=0, saturation=0.1, hue=0.1).
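The corruption can be reproduced with torchvision's ColorJitter directly; below is a dependency-free sketch of the saturation component only (blending the RGB image with its grayscale version), where the jitter range [0.9, 1.1] follows ColorJitter's convention for saturation=0.1. The function name and toy image are illustrative.

```python
import numpy as np

def jitter_saturation(img, factor):
    """Blend an RGB image (H, W, 3, floats in [0, 1]) with its grayscale
    version; factor 1.0 leaves it unchanged, <1 desaturates, >1 saturates."""
    gray = img @ np.array([0.299, 0.587, 0.114])   # luma weights
    gray = gray[..., None].repeat(3, axis=-1)
    return np.clip(gray + factor * (img - gray), 0.0, 1.0)

rng = np.random.default_rng(0)
img = rng.random((4, 4, 3))                        # toy RGB tile
# saturation=0.1 in ColorJitter draws the factor uniformly from [0.9, 1.1]
factor = rng.uniform(0.9, 1.1)
corrupted = jitter_saturation(img, factor)
```

Applying such small, stain-like perturbations to CIFAR-10 yields an OOD test set whose shift mimics the staining variability seen between hospitals, rather than an arbitrary corruption.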
[Figure 1: Representative tiles of WSIs from the Camelyon17 dataset, illustrating the variation
between normal and tumour tissue, and between the different hospitals (hospitals 1–3) used as
training/validation datasets. While from the same hospitals, training and validation sets are
non-overlapping.]
Second, we use a patch-based version [Bandi et al., 2018] of the Camelyon17 grand challenge
dataset (https://camelyon17.grand-challenge.org/), which contains patches of WSIs of