Improving the Reliability for Confidence
Estimation
Haoxuan Qu1⋆, Yanchao Li1⋆, Lin Geng Foo1, Jason Kuen2, Jiuxiang Gu2, and
Jun Liu1⋆⋆
1 Singapore University of Technology and Design
{haoxuan_qu, lingeng_foo}@mymail.sutd.edu.sg, {yanchao_li, jun_liu}@sutd.edu.sg
2 Adobe Research
{kuen, jigu}@adobe.com
⋆ Both authors contributed equally to the work.
⋆⋆ Corresponding Author
Abstract. Confidence estimation, a task that aims to evaluate the trustworthiness
of the model's prediction output during deployment, has received lots of research
attention recently, due to its importance for the safe deployment of deep models.
Previous works have outlined two important qualities that a reliable confidence
estimation model should possess, i.e., the ability to perform well under label
imbalance and the ability to handle various out-of-distribution data inputs. In this
work, we propose a meta-learning framework that can simultaneously improve upon
both qualities in a confidence estimation model. Specifically, we first construct
virtual training and testing sets with some intentionally designed distribution
differences between them. Our framework then uses the constructed sets to train
the confidence estimation model through a virtual training and testing scheme,
leading it to learn knowledge that generalizes to diverse distributions. We show
the effectiveness of our framework on both monocular depth estimation and image
classification.
Keywords: Confidence estimation, Meta-learning.
1 Introduction
With the continuous development of deep learning techniques, deep models are
becoming increasingly accurate on various computer vision tasks, such as image
classification [22] and monocular depth estimation [25]. However, even highly
accurate models might still commit errors [1,18,10], and these errors can
potentially lead to serious consequences, especially in safety-critical fields, such as
nuclear power plant monitoring [30], disease diagnosis [39], and self-driving
vehicles [40]. Due to the severe implications of errors in these applications, it is
crucial for us to be able to assess whether we can place confidence in the model
predictions, before acting according to them. Hence, the task of confidence
estimation (also known as trustworthiness prediction), which aims to evaluate the
confidence of the model’s prediction during deployment, has received a lot of
research attention recently [19,5,32].
Specifically, in confidence estimation, we would like to compute the confidence
estimate S ∈ {0, 1} for a prediction P made by a model regarding input I,
where S estimates if prediction P is correct (1) or not (0). In this paper,
for clarity, the task model refers to the deep model that produces predictions
P on the main task; confidence estimation for P is performed by a separate
confidence estimation model, which we refer to as the confidence estimator, as
shown in Fig. 1. Many previous works [5,32,26,47] have proposed to train such
a confidence estimator to conduct confidence estimation more reliably.
Fig. 1. Illustration of confidence estimation.
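To make this setup concrete, the minimal sketch below (in PyTorch; the model definitions, sizes, and the 0.5 threshold are illustrative assumptions, not the configuration used in this paper) shows the two components of Fig. 1: a frozen task model that produces predictions P on inputs I, and a separate confidence estimator that maps the task model's output to a confidence estimate S.

```python
import torch

# Minimal sketch of the pipeline in Fig. 1 (illustrative; not the authors' code).
# The task model is kept frozen; only the separate confidence estimator is trained.
task_model = torch.nn.Sequential(                      # hypothetical main-task classifier
    torch.nn.Flatten(), torch.nn.Linear(28 * 28, 10))
confidence_estimator = torch.nn.Sequential(            # separate confidence estimator
    torch.nn.Linear(10, 64), torch.nn.ReLU(),
    torch.nn.Linear(64, 1), torch.nn.Sigmoid())

task_model.eval()
for p in task_model.parameters():                      # freeze the task model
    p.requires_grad_(False)

x = torch.randn(4, 1, 28, 28)                          # a batch of inputs I
with torch.no_grad():
    logits = task_model(x)
P = logits.argmax(dim=1)                               # task model predictions P
score = confidence_estimator(logits).squeeze(1)        # confidence score in (0, 1)
S = (score > 0.5).long()                               # confidence estimate S ∈ {0, 1}
```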
Some recent works [32,26] have noted that a reliable confidence estimator should
perform well under label imbalance. This is because confidence estimators use the
correctness of task model predictions (C) as labels, which are often imbalanced.
As shown in Fig. 1, correctness labels C are produced by checking for consistency
between predictions P and ground truths G, where C = 1 if P is correct and C = 0
otherwise. Thus, since many task models have achieved good performance on computer
vision tasks (e.g., Small ConvNet [19] achieves >99% for MNIST [24] and VGG-16
[41,31] achieves >93% for CIFAR-10 [21] in image classification), there are often
many more correct predictions (where C = 1) than incorrect ones (where C = 0),
which leads to label imbalance for the confidence estimation task. If this label
imbalance is not accounted for during training, the confidence estimator is likely
to be overly confident [32,26] for incorrect predictions (where C = 0), which is
undesirable.
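As a worked illustration of how the correctness labels C arise and why they are imbalanced (the specific predictions below are made up; the >99% accuracy figure is the MNIST number quoted above):

```python
import torch

# Correctness labels C come from comparing predictions P with ground truths G.
P = torch.tensor([3, 1, 4, 1, 5, 9, 2, 6, 5, 3])   # task model predictions (made up)
G = torch.tensor([3, 1, 4, 1, 5, 9, 2, 6, 5, 5])   # ground-truth labels (made up)
C = (P == G).long()                                # C = 1 if correct, C = 0 otherwise
print(C.float().mean().item())                     # 0.9 here; for a >99%-accurate task
                                                   # model the C = 1 : C = 0 ratio exceeds 99 : 1
```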
On the other hand, some other works [35,43] have suggested that the ability
to handle out-of-distribution data inputs (I) is important for confidence estimation.
Out-of-distribution data occurs due to distribution shifts in the data – such
data distribution shifts can occur within the same dataset [33], but are generally
more severe between different datasets, e.g., between the training data from existing
datasets and testing data received during deployment under real-world conditions.
If the confidence estimator does not learn to handle out-of-distribution
inputs, it will tend to perform badly whenever an out-of-distribution input sample
I is fed into the task model, which affects its utility in practical applications.
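As a generic illustration (not the evaluation protocol of this paper), such an input distribution shift can be simulated by corrupting in-distribution images before feeding them to the same frozen task model; the noise function and tensor shapes below are illustrative assumptions.

```python
import torch

def gaussian_noise(x: torch.Tensor, sigma: float = 0.3) -> torch.Tensor:
    """Simulate a simple synthetic distribution shift by adding Gaussian noise."""
    return (x + sigma * torch.randn_like(x)).clamp(0.0, 1.0)

x_in = torch.rand(4, 3, 32, 32)   # in-distribution inputs, as seen during training
x_ood = gaussian_noise(x_in)      # shifted inputs of the kind met at deployment
# A confidence estimator fitted only to x_in-like inputs will often mis-score
# task model predictions made on x_ood-like inputs.
```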
In this paper, we aim to improve the reliability of our confidence estimator
in terms of both the above-mentioned qualities; we improve its ability to tackle
label imbalance of C, as well as to handle various out-of-distribution inputs I.
Specifically, we observe that these qualities actually share a common point –
they are acquired when the confidence estimator learns to generalize to diverse
distributions. If a confidence estimator learns knowledge that can generalize to
diverse distributions, it will be able to tackle diverse correctness label (C)
distributions, which include distributions where C = 0 is more common, and can
thus better tackle the imbalanced label problem; it will also be able to tackle
diverse input (I) distributions, which improves performance on out-of-distribution
data. Based on this novel perspective, we propose to improve upon both of these
qualities simultaneously through a unified framework that allows the confidence
estimator to learn to generalize, and perform well on distributions that might be
different from the distributions (of both C and I) seen during training. In order
to achieve this, we incorporate meta-learning into our framework.
Meta-learning, also known as “learning to learn”, allows us to train a model
that can generalize well to different distributions. Specifically, in some meta-
learning works [9,28,13,2,16,46], a virtual testing set is used to mimic the
testing conditions during training, so that even though training is mainly done
on a virtual training set consisting of training data, performance on the testing
scenario is improved. In our work, we construct our virtual testing sets such that
they simulate various distributions that are different from the virtual training
set, which will push our model to learn distribution-generalizable knowledge to
perform well on diverse distributions, instead of learning distribution-specific
knowledge that only performs well on the training distribution. In particular, for
our confidence estimator to learn distribution-generalizable knowledge and tackle
diverse distributions of C and I, we intentionally construct virtual training and
testing sets that simulate the different distribution shifts of C and I, and use
them for meta-learning.
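The sketch below shows one common way such a virtual training and testing scheme can be realized, using a MAML-style bi-level update on a toy logistic-regression confidence estimator; `sample_virtual_sets`, the loss weighting, and the learning rates are illustrative placeholders rather than the exact construction and objective proposed in this paper.

```python
import torch

def forward(w, b, feats):
    """Toy confidence estimator: logistic regression on task-model features."""
    return torch.sigmoid(feats @ w + b)

def bce(scores, C):
    return torch.nn.functional.binary_cross_entropy(scores, C.float())

def meta_step(w, b, sample_virtual_sets, inner_lr=0.1, outer_lr=0.01, lam=1.0):
    # sample_virtual_sets() is assumed to return a virtual training set and a
    # virtual testing set whose distributions of C and I intentionally differ.
    (f_tr, C_tr), (f_te, C_te) = sample_virtual_sets()

    # Virtual training: one gradient step on the virtual training set.
    loss_tr = bce(forward(w, b, f_tr), C_tr)
    gw, gb = torch.autograd.grad(loss_tr, (w, b), create_graph=True)
    w_adapt, b_adapt = w - inner_lr * gw, b - inner_lr * gb

    # Virtual testing: evaluate the adapted estimator on the differently
    # distributed virtual testing set.
    loss_te = bce(forward(w_adapt, b_adapt, f_te), C_te)

    # Meta-update: feedback from both sets pushes the estimator toward
    # distribution-generalizable rather than distribution-specific knowledge.
    total = loss_tr + lam * loss_te
    gw2, gb2 = torch.autograd.grad(total, (w, b))
    with torch.no_grad():
        w -= outer_lr * gw2
        b -= outer_lr * gb2
    return total.item()

# Usage sketch: w = torch.zeros(feat_dim, requires_grad=True),
# b = torch.zeros(1, requires_grad=True), then call meta_step(...) repeatedly.
```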
The contributions of our work are summarized as follows. 1) We propose a
novel framework, which incorporates meta-learning to learn a confidence estimator
that produces confidence estimates more reliably. 2) By carefully constructing
virtual training and testing sets that simulate the training and various testing
scenarios, our framework can learn to generalize well to different correctness
label distributions and input distributions. 3) We apply our framework upon
state-of-the-art confidence estimation methods [5,47] across various computer
vision tasks, including image classification and monocular depth estimation, and
achieve consistent performance enhancement throughout.
2 Related Work
Confidence Estimation. Being an important task that helps determine whether
a deep predictor’s predictions can be trusted, confidence estimation has been
studied extensively across various computer vision tasks [14,11,19,5,34,36,
32,44,4,35,26,43,47]. At the beginning, Hendrycks and Gimpel [14] proposed
Maximum Class Probability utilizing the classifier softmax distribution, Gal and
Ghahramani [11] proposed MCDropout from the perspective of uncertainty
estimation, and Jiang et al. [19] proposed Trust Score to calculate the agreement
between the classifier and a modified nearest-neighbor classifier in the testing
set. More recently, the idea of a separate confidence estimator was introduced by
several works [5,47]. Specifically, these works proposed to fix the task model,
and instead conduct confidence estimation via a separate confidence estimator.
Notably, Corbiere et al. [5] proposed a separate confidence estimator called
ConfidNet and a new loss function called True Class Probability. Subsequently, Yu et
al. [47] proposed SLURP, a generic confidence estimator for regression tasks, which
is specially targeted at task models that perform monocular depth estimation.
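For concreteness, the Maximum Class Probability baseline [14] and the True Class Probability quantity that ConfidNet [5] is trained to regress can be written in a few lines; these are the standard definitions, shown as a sketch rather than code from those works.

```python
import torch

def mcp(logits: torch.Tensor) -> torch.Tensor:
    """Maximum Class Probability [14]: the largest softmax probability per sample."""
    return torch.softmax(logits, dim=1).max(dim=1).values

def tcp(logits: torch.Tensor, targets: torch.Tensor) -> torch.Tensor:
    """True Class Probability [5]: softmax probability of the ground-truth class."""
    probs = torch.softmax(logits, dim=1)
    return probs.gather(1, targets.unsqueeze(1)).squeeze(1)
```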
In this paper, we also build a separate confidence estimator, since it has the
benefit of not affecting the main task performance. Different from previous works,
we propose a novel meta-learning framework that simultaneously improves the
performance of the confidence estimator under label imbalance and on out-of-
distribution input data, in a unified manner.
Label Imbalance in Confidence Estimation. Recently, using the correctness
of task model predictions (C) as labels, many existing confidence estimation
methods [14,11,5] have been shown to suffer from the label imbalance problem.
To solve this problem and enable the confidence estimator to perform well
under label imbalance, various methods have been proposed. Luo et al. [32]
proposed a loss function called Steep Slope Loss to separate features w.r.t. correct
and incorrect task model predictions from each other. Afterwards, Li et al. [26]
proposed an extension to True Class Probability [5] that uses a Distributional
Focal Loss to focus more on predictions with higher uncertainty. Unlike previous
methods that design strategies to handle a specific imbalanced distribution of
correct and incorrect labels, we adopt a novel perspective, and tackle the label
imbalance problem through meta-learning, which allows our confidence estimator
to learn distribution-generalizable knowledge to tackle a variety of diverse
label distributions. This is done through construction of virtual testing sets that
simulate various different label distributions.
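The cited losses have their own specific formulations; as a generic point of comparison (and explicitly not the Steep Slope Loss [32] or the Distributional Focal Loss [26]), a common baseline for imbalanced correctness labels is a class-weighted binary cross-entropy that up-weights the rare C = 0 samples:

```python
import torch

def weighted_bce(scores: torch.Tensor, C: torch.Tensor) -> torch.Tensor:
    """Class-weighted BCE for imbalanced correctness labels (generic baseline)."""
    pos_frac = C.float().mean().clamp(1e-6, 1 - 1e-6)   # fraction of C = 1 in the batch
    w = torch.where(C == 1, 1.0 - pos_frac, pos_frac)   # rarer class gets the larger weight
    loss = torch.nn.functional.binary_cross_entropy(
        scores, C.float(), weight=w, reduction='sum')
    return loss / w.sum()
```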
Confidence Estimation on Out-of-distribution Data. As various distribution
shifts exist between the training and testing data in real-world applications,
the handling of out-of-distribution data inputs (I) is important for reliable
confidence estimation. To this end, Mukhoti et al. [35] proposed to replace the cross
entropy loss with the focal loss [29], and utilize its implicit regularization effects
to handle out-of-distribution data. Tomani et al. [43] proposed to handle
out-of-distribution data by applying perturbations to data from the validation
set. However, as these previous methods either emphasize the rare samples
or fine-tune on an additional set of samples, they can still be prone to overfitting
these rare samples or the additional set of samples. Differently, in this work, we
propose to use meta-learning and optimize the model through feedback from
diverse virtual sets with diverse distributions. Thus, we can enable our model to
learn knowledge that is more generalizable to various out-of-distribution data.
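For reference, the focal loss [29] mentioned above has the standard form FL(p_t) = -(1 - p_t)^γ log(p_t); a binary version applied to correctness labels could be sketched as below (a generic sketch of the standard loss, not necessarily the exact variant used in [35], which applies it when training the classifier itself).

```python
import torch

def binary_focal_loss(scores: torch.Tensor, C: torch.Tensor, gamma: float = 2.0) -> torch.Tensor:
    """Standard binary focal loss [29]: down-weights easy, well-classified samples."""
    C = C.float()
    p_t = torch.where(C == 1, scores, 1.0 - scores)   # probability assigned to the true label
    return (-(1.0 - p_t) ** gamma * torch.log(p_t.clamp_min(1e-8))).mean()
```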
Meta-learning. MAML [9], a popular meta-learning method, was originally
designed to learn a good weight initialization that can quickly adapt to new tasks
in testing, which showed promise in few-shot learning. Subsequently, its extension
[28], which requires no model updating on the unseen testing scenarios, has been
applied beyond few-shot learning, to enhance model performance [13,2,16,46].
Differently, we propose a novel framework via meta-learning to perform more
reliable confidence estimation. Through performing meta-learning on carefully
constructed virtual training and virtual testing sets, we simultaneously improve
the reliability of the confidence estimator under label imbalance and on
out-of-distribution data inputs.
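In equation form, a MAML-style instantiation of such a virtual training and testing scheme can be summarized as follows, where α and β are inner and outer learning rates and λ weights the virtual-testing feedback (a generic formulation, not necessarily the exact objective of this paper):

```latex
\theta' = \theta - \alpha \,\nabla_{\theta} \mathcal{L}_{\text{vtrain}}(\theta), \qquad
\theta \leftarrow \theta - \beta \,\nabla_{\theta}
\Big[ \mathcal{L}_{\text{vtrain}}(\theta) + \lambda \,\mathcal{L}_{\text{vtest}}(\theta') \Big].
```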