thus better tackle the imbalanced label problem; it will also be able to tackle di-
verse input (I) distributions, which improves performance on out-of-distribution
data. Based on this novel perspective, we propose to improve upon both of these
qualities simultaneously through a unified framework that allows the confidence
estimator to learn to generalize and perform well on distributions that might be
different from the distributions (of both C and I) seen during training. In order
to achieve this, we incorporate meta-learning into our framework.
Meta-learning, also known as “learning to learn”, allows us to train a model
that can generalize well to different distributions. Specifically, in some meta-
learning works [9,28,13,2,16,46], a virtual testing set is used to mimic the
testing conditions during training: although optimization is performed mainly
on a virtual training set drawn from the training data, performance in the actual
testing scenario improves. In our work, we construct our virtual testing sets such that
they simulate various distributions that are different from the virtual training
set, which will push our model to learn distribution-generalizable knowledge to
perform well on diverse distributions, instead of learning distribution-specific
knowledge that only performs well on the training distribution. In particular, for
our confidence estimator to learn distribution-generalizable knowledge and tackle
diverse distributions of C and I, we intentionally construct virtual training and
testing sets that simulate the different distribution shifts of C and I, and use
them for meta-learning.
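To make this episodic idea concrete, the following minimal sketch (in PyTorch) shows one first-order meta-learning step for a confidence estimator: an inner update on the virtual training set, followed by an evaluation on a virtual testing set that simulates a shifted distribution, whose gradient then updates the estimator. The names conf_net, task_model, virtual_train, and virtual_test, as well as the binary cross-entropy on correctness labels and the first-order update, are illustrative assumptions rather than the exact objective or algorithm of our framework.

import copy
import torch
import torch.nn.functional as F

def meta_step(conf_net, task_model, virtual_train, virtual_test, outer_opt, inner_lr=1e-3):
    # One meta-learning episode for the confidence estimator (illustrative sketch).
    x_tr, y_tr = virtual_train    # virtual training batch
    x_te, y_te = virtual_test     # virtual testing batch (simulated shift of C and I)

    # Correctness labels C: 1 if the fixed task model predicts correctly, else 0.
    with torch.no_grad():
        c_tr = (task_model(x_tr).argmax(dim=1) == y_tr).float()
        c_te = (task_model(x_te).argmax(dim=1) == y_te).float()

    # Inner step: adapt a copy of the estimator on the virtual training set.
    fast_net = copy.deepcopy(conf_net)
    inner_loss = F.binary_cross_entropy_with_logits(fast_net(x_tr).squeeze(-1), c_tr)
    grads = torch.autograd.grad(inner_loss, list(fast_net.parameters()))
    with torch.no_grad():
        for p, g in zip(fast_net.parameters(), grads):
            p -= inner_lr * g

    # Outer step: evaluate the adapted copy on the virtual testing set and apply
    # its gradient to the original estimator (first-order approximation).
    outer_loss = F.binary_cross_entropy_with_logits(fast_net(x_te).squeeze(-1), c_te)
    outer_grads = torch.autograd.grad(outer_loss, list(fast_net.parameters()))
    outer_opt.zero_grad()
    for p, g in zip(conf_net.parameters(), outer_grads):
        p.grad = g.clone()
    outer_opt.step()
    return inner_loss.item(), outer_loss.item()

In practice, such an episode would be repeated over many randomly constructed virtual splits, each simulating a different imbalance of C or shift of I.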
The contributions of our work are summarized as follows. 1) We propose a
novel framework, which incorporates meta-learning to learn a confidence estimator
that produces confidence estimates more reliably. 2) By carefully constructing
virtual training and testing sets that simulate the training and various testing
scenarios, our framework can learn to generalize well to different correctness
label distributions and input distributions. 3) We apply our framework to
state-of-the-art confidence estimation methods [5,47] across various computer
vision tasks, including image classification and monocular depth estimation, and
achieve consistent performance improvements.
2 Related Work
Confidence Estimation. Being an important task that helps determine whether
a deep predictor’s predictions can be trusted, confidence estimation has been
studied extensively across various computer vision tasks [14,11,19,5,34,36,
32,44,4,35,26,43,47]. Early on, Hendrycks and Gimpel [14] proposed Maximum
Class Probability, which uses the classifier's softmax distribution; Gal and
Ghahramani [11] proposed MCDropout from the perspective of uncertainty
estimation; and Jiang et al. [19] proposed Trust Score, which measures the agreement
between the classifier and a modified nearest-neighbor classifier on the testing
set.
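For concreteness, minimal sketches of the first two scores for a PyTorch classifier that returns logits might look as follows; the function names and the choice of mean maximum probability as the MCDropout summary are illustrative assumptions rather than the original formulations.

import torch
import torch.nn.functional as F

@torch.no_grad()
def mcp_confidence(classifier, x):
    # Maximum Class Probability [14]: the largest softmax score per sample.
    return F.softmax(classifier(x), dim=1).max(dim=1).values

@torch.no_grad()
def mc_dropout_confidence(classifier, x, n_samples=20):
    # MCDropout-style confidence [11]: keep only the dropout layers stochastic
    # at test time and average the softmax over several forward passes.
    classifier.eval()
    for m in classifier.modules():
        if isinstance(m, (torch.nn.Dropout, torch.nn.Dropout2d)):
            m.train()
    probs = torch.stack([F.softmax(classifier(x), dim=1) for _ in range(n_samples)])
    classifier.eval()
    # The mean maximum probability is one common summary; predictive entropy
    # or variance across passes are alternatives.
    return probs.mean(dim=0).max(dim=1).values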
More recently, the idea of a separate confidence estimator was introduced by
several works [5,47]. Specifically, these works proposed to fix the task model,
and instead conduct confidence estimation via a separate confidence estimator.
Notably, Corbiere et al. [5] proposed a separate confidence estimator called Con-