OpenAUC: Towards AUC-Oriented
Open-Set Recognition
Zitai Wang1,2  Qianqian Xu3  Zhiyong Yang4
Yuan He5  Xiaochun Cao6,1  Qingming Huang4,3,7,8
1SKLOIS, Institute of Information Engineering, CAS
2School of Cyber Security, University of Chinese Academy of Sciences
3Key Lab. of Intelligent Information Processing, Institute of Computing Tech., CAS
4School of Computer Science and Tech., University of Chinese Academy of Sciences
5Alibaba Group
6School of Cyber Science and Tech., Shenzhen Campus, Sun Yat-sen University
7BDKM, University of Chinese Academy of Sciences
8Peng Cheng Laboratory
wangzitai@iie.ac.cn xuqianqian@ict.ac.cn
yangzhiyong21@ucas.ac.cn heyuan.hy@alibaba-inc.com
caoxiaochun@mail.sysu.edu.cn qmhuang@ucas.ac.cn
Abstract
Traditional machine learning follows a close-set assumption that the training and
test set share the same label space. However, in many practical scenarios, it is inevitable
that some test samples belong to unknown classes (open-set). To fix this issue,
Open-Set Recognition (OSR), whose goal is to make correct predictions on both
close-set samples and open-set samples, has attracted rising attention. In this
direction, the vast majority of literature focuses on the pattern of open-set samples.
However, how to evaluate model performance in this challenging task is still
unsolved. In this paper, a systematic analysis reveals that most existing metrics
are essentially inconsistent with the aforementioned goal of OSR: (1) For metrics
extended from close-set classification, such as Open-set F-score, Youden's index,
and Normalized Accuracy, a poor open-set prediction can still escape a low
performance score as long as the close-set prediction is superior. (2) Novelty detection
AUC, which measures the ranking performance between close-set and open-set
samples, ignores the close-set performance. To fix these issues, we propose a novel
metric named OpenAUC. Compared with existing metrics, OpenAUC enjoys a
concise pairwise formulation that evaluates open-set performance and close-set
performance in a coupling manner. Further analysis shows that OpenAUC is
free from the aforementioned inconsistency properties. Finally, an end-to-end
learning method is proposed to minimize the OpenAUC risk, and the experimental
results on popular benchmark datasets speak to its effectiveness. Project Page:
https://github.com/wang22ti/OpenAUC.
1 Introduction
Traditional classification algorithms have achieved tremendous success under the close-set assumption
that all the test classes are known during the training period. However, in many practical scenarios,
it is inevitable that some test samples belong to none of the known classes. In this case, a close-set
model will classify all the novel samples into known classes, inducing a significant performance
degeneration. To fix this issue, Open-Set Recognition (OSR) has attracted rising attention in recent
years [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14], where the model is required to not only (1) correctly
classify the close-set samples but also (2) discriminate the open-set samples from the close-set ones.
In this complicated setting, how to evaluate model performance becomes a challenging problem.
Existing work has proposed several metrics, which fall into two categories:
The first direction extends traditional classification metrics to the open-set scenario. To this end, one should first extend the close-set confusion matrix with unknown classes, where a threshold decides whether the input sample belongs to the unknown classes. On top of this, open-set F-score [2, 9, 11, 12, 15, 16] summarizes the True Positive (TP), False Positive (FP), and False Negative (FN) performance of known classes. Youden's index [17] takes the sum of the True Positive Rate (TPR) and True Negative Rate (TNR) performance of known classes as the performance measure. Besides, Normalized Accuracy [15] summarizes the close-set accuracy and the open-set accuracy via a convex combination. Although it is intuitive to extend close-set metrics, we point out that these metrics are essentially inconsistent with the goal of OSR. Specifically, for open-set F-score and Youden's index, only the FP/FN performances of known classes evaluate the open-set performance implicitly. As a result, these metrics will encourage classifying close-set samples into the open-set to decrease the FN of known classes. Moreover, Normalized Accuracy encourages selecting the threshold classifying more open-set samples into known classes. In extreme cases, even a close-set model (i.e., all the open-set samples are classified into known classes) can obtain a high performance on these metrics.
The second category regards OSR as a novelty detection problem [18, 19] with multiple known classes. Based on such observation, the Area Under ROC Curve (AUC) [20, 21], which measures the ranking performance between known classes and unknown classes, has become a popular metric [3, 4, 5, 6, 8, 10]. Compared with classification-based metrics, AUC is insensitive to the selection of threshold since it summarizes the True Positive Rate (TPR) performance for all possible thresholds. However, the limitation of AUC is also obvious: the close-set performance is ignored. A natural remedy is to adopt the close-set accuracy as a complementary metric [3]. However, what we expect is a model that can make correct predictions on close-set and open-set simultaneously. This decoupling strategy will induce a challenging multi-objective optimization problem and is also unfavorable to comparing the overall performances of different models. What's more, simply aggregating these two metrics will induce another inconsistency property.
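To make the later comparison concrete, the following is a minimal sketch of this novelty-detection AUC in its equivalent pairwise (Mann-Whitney) form, assuming two illustrative arrays of open-set scores r(x) for close-set and open-set test samples; note that the close-set predictions play no role here, which is exactly the limitation discussed above. This is an illustrative sketch, not the authors' implementation.

```python
import numpy as np

def novelty_detection_auc(r_close, r_open):
    """Empirical AUC of the open-set scores: how often an open-set sample
    is ranked above a close-set one (ties count as 0.5)."""
    r_close, r_open = np.asarray(r_close, float), np.asarray(r_open, float)
    diff = r_open[None, :] - r_close[:, None]        # all (close, open) pairs
    return ((diff > 0) + 0.5 * (diff == 0)).mean()

# Toy usage: three close-set and two open-set samples.
print(novelty_detection_auc([0.1, 0.4, 0.2], [0.8, 0.3]))  # 5/6 of the pairs are ranked correctly
```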
In view of this, a natural question arises:
Does there exist a numeric metric that is consistent with the goal of OSR?
To answer this question, we propose a novel metric named OpenAUC. Specifically, the proposed metric enjoys a concise pairwise formulation, where each pair consists of a close-set sample and an open-set sample. For each pair, only if the close-set sample has been classified into the correct known class, OpenAUC will check whether the open-set sample is ranked higher than the close-set one. In this sense, OpenAUC evaluates the close-set performance and the open-set performance in a coupling manner, which is consistent with the goal of OSR. What's more, benefiting from the ranking operator, OpenAUC overcomes the sensitivity to the threshold, and further analysis shows that maximizing OpenAUC will guarantee a better open-set performance under a mild assumption on the threshold.
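The formal definition and the surrogate losses used for training appear in Sec. 4; the block below is only a minimal sketch consistent with the description above (variable names are illustrative, and ties are ignored for simplicity). Each close-set sample additionally carries a flag telling whether the classifier h predicted its correct known class.

```python
import numpy as np

def open_auc_sketch(r_close, close_correct, r_open):
    """Pairwise sketch of OpenAUC: a (close, open) pair counts only if the
    close-set sample is correctly classified AND the open-set sample gets
    the higher open-set score r(x)."""
    r_close = np.asarray(r_close, float)
    correct = np.asarray(close_correct, float)             # 1 if h(x) = y, else 0
    r_open = np.asarray(r_open, float)
    ranked = (r_open[None, :] > r_close[:, None]).astype(float)
    return (correct[:, None] * ranked).mean()

# A misclassified close-set sample loses all of its pairs, so good close-set
# accuracy and good open-set ranking are both necessary for a high score.
print(open_auc_sketch([0.1, 0.4], [1, 0], [0.8, 0.9]))     # 0.5
```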
Considering these advantages, we further establish an end-to-end learning method to maximize
OpenAUC. Finally, extensive experiments conducted on multiple benchmark datasets validate the
proposed metric and learning method. To sum up, the contribution of this paper is three-fold:
• We make a detailed analysis of existing metrics for OSR. The theoretical results show that existing metrics, including the classification-based ones and AUC, are essentially inconsistent with the goal of OSR due to their own limitations.
• A novel metric, named OpenAUC, is proposed. Benefiting from its concise formulation, further analysis shows that OpenAUC overcomes the limitations of existing metrics and thus is free from the inconsistency properties.
• An end-to-end learning method is proposed to optimize OpenAUC, and the empirical results on multiple benchmark datasets validate its effectiveness.
Table 1: The consistency analysis of existing metrics for OSR.

Metric                      P1 (close)   P2 (open)   P3 (threshold)   P4 (numeric)
Open-set F-score [15]       ✓            ✗           ✗                ✓
Youden's index [17]         ✓            ✗           ✗                ✓
Normalized Accuracy [15]    ✓            ✓           ✗                ✓
AUC [3]                     ✗            ✓           ✓                ✓
The OSCR curve [4]          ✓            ✓           ✓                ✗
OpenAUC (Ours)              ✓            ✓           ✓                ✓
2 Preliminary

Problem definition. In open-set recognition, the training samples $\{z_i = (x_i, y_i)\}_{i=1}^{n}$ are drawn from a product space $\mathcal{Z}_k = \mathcal{X} \times \mathcal{Y}_k$, where $\mathcal{X}$ is the input space, and $\mathcal{Y}_k = \{1, \cdots, C\}$ is the label space of known classes. During the test period, some samples might belong to none of the known classes. For the sake of simplicity, all these samples can be allocated to one super unknown class. In other words, the open-set samples are drawn from a product space $\mathcal{Z}_u = \mathcal{X} \times \mathcal{Y}_u$, where $\mathcal{Y}_u = \{C + 1\}$ is the label space of unknown classes. To make predictions, OSR first requires a rejector $R = g_1 \circ r$ to judge whether an input sample comes from the open-set, where $r: \mathcal{X} \to \mathbb{R}$ is the open-set score function, and $g_1: \mathbb{R} \to \{0, 1\}$ is the open-set decision function. To be specific, an input $x$ will be classified as an open-set sample if $r(x)$ is greater than a given threshold $t \in \mathbb{R}$. For the samples with $R(x) = 0$, a classifier $h = g_2 \circ f$ is further required to make predictions on known classes, where $f: \mathcal{X} \to \mathbb{R}^C$ is the close-set score function and $g_2: \mathbb{R}^C \to \mathcal{Y}_k$ is the close-set decision function (a minimal code sketch of this two-stage pipeline is given after the property list below). In view of this, a proper metric for OSR should enjoy the following properties:
(P1) For close-set samples, the metric not only evaluates whether the open-set score function $r$ outputs low open-set scores but also requires that the classifier $h$ make correct predictions.

(P2) For open-set samples, the metric should check whether the open-set score function $r$ outputs high open-set scores.

(P3) The metric should be insensitive to the threshold $t$, because different ratios of open-set samples will induce different optimal thresholds, but such a ratio is unavailable during the training period.

(P4) The metric should be a single numeric value to favor comparing the overall performances of different models.
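As referenced in the problem definition above, here is a minimal sketch of the two-stage prediction pipeline under these notations. The scores f(x) and r(x) are assumed to be produced by some backbone model, and all names are illustrative.

```python
import numpy as np

def osr_predict(f_scores, r_scores, t):
    """Two-stage OSR prediction: reject when r(x) > t, otherwise argmax of f(x).

    f_scores: (n, C) close-set scores f(x); r_scores: (n,) open-set scores r(x).
    Returns labels in {0, ..., C-1} for accepted samples and C for rejected
    samples (the super unknown class, written C+1 in the 1-indexed notation above).
    """
    reject = r_scores > t                    # R(x) = g1(r(x)) = 1[r(x) > t]
    close_pred = f_scores.argmax(axis=1)     # h(x) = g2(f(x)) = argmax_i f_i(x)
    return np.where(reject, f_scores.shape[1], close_pred)

# Toy usage with C = 3 known classes and two test samples.
f = np.array([[2.0, 0.1, 0.3], [0.2, 0.1, 0.4]])
r = np.array([0.2, 0.9])
print(osr_predict(f, r, t=0.5))              # [0 3]: the second sample is rejected
```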
Roadmap. Next, we first present a detailed analysis of existing metrics in Sec. 3. The results show that these metrics are essentially inconsistent with the aforementioned properties, as summarized in Tab. 1. Furthermore, a novel metric named OpenAUC and its end-to-end learning method are proposed in Sec. 4 to overcome the inconsistency of existing metrics.
3 Existing metrics for Open-set Recognition
Existing metrics for OSR fall into two categories: the classification-based ones and the novelty-
detection ones. The first category extends existing classification metrics to the open-set scenario,
while the second one regards OSR as a generalized novelty detection problem. We will present a
detailed analysis of these metrics in the rest of this section.
3.1 Open-set F-score and Youden’s Index
To extend classification metrics, one should first extend the confusion matrix with the unknown class. Let $\mathrm{TP}_i, \mathrm{TN}_i, \mathrm{FP}_i, \mathrm{FN}_i$ denote the True Positive (TP), True Negative (TN), False Positive (FP), and False Negative (FN) of the class $i \in \mathcal{Y}_k \cup \mathcal{Y}_u$ under the given threshold $t$, respectively. Note that we omit the classifier $h$ and the rejector $R$ in this notation since there exists no ambiguity.
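These counts can be read off an extended (C+1)-class confusion matrix. Below is a small illustrative sketch, assuming predictions follow the 0-indexed convention of the pipeline sketch above (label C standing for the unknown class C+1).

```python
import numpy as np

def extended_counts(y_true, y_pred, C):
    """Per-class TP/TN/FP/FN over the C known classes plus the unknown class."""
    K = C + 1
    conf = np.zeros((K, K), dtype=int)        # rows: true class, columns: prediction
    for yt, yp in zip(y_true, y_pred):
        conf[yt, yp] += 1
    tp = np.diag(conf)
    fp = conf.sum(axis=0) - tp                # predicted as i, true class differs
    fn = conf.sum(axis=1) - tp                # true class i, predicted elsewhere
    tn = conf.sum() - tp - fp - fn
    return tp, tn, fp, fn                     # index C holds the unknown-class counts
```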
Open-set F-score [15] is a representative classification-based metric for OSR. Compared with its close-set counterpart, this metric evaluates the open-set performance via $\mathrm{FP}_i$ and $\mathrm{FN}_i$, where $i \in \mathcal{Y}_k$.
To be specific, this metric summarizes $\mathrm{TP}_i, \mathrm{FP}_i, \mathrm{FN}_i$ of known classes by the harmonic mean of Precision and TPR (i.e., Recall):
$$\text{F-score} := \frac{2 \times P_k \times \mathrm{TPR}_k}{P_k + \mathrm{TPR}_k}, \qquad (1)$$
where
$$P_k := \frac{1}{C}\sum_{i=1}^{C}\frac{\mathrm{TP}_i}{\mathrm{TP}_i + \mathrm{FP}_i}, \quad \mathrm{TPR}_k := \frac{1}{C}\sum_{i=1}^{C}\frac{\mathrm{TP}_i}{\mathrm{TP}_i + \mathrm{FN}_i} \qquad (2)$$
if one aggregates model performances in a macro manner, and
$$P_k := \frac{\sum_{i=1}^{C}\mathrm{TP}_i}{\sum_{i=1}^{C}(\mathrm{TP}_i + \mathrm{FP}_i)}, \quad \mathrm{TPR}_k := \frac{\sum_{i=1}^{C}\mathrm{TP}_i}{\sum_{i=1}^{C}(\mathrm{TP}_i + \mathrm{FN}_i)} \qquad (3)$$
when model performances are summarized in a micro manner. Compared with open-set F-score, Youden's index additionally considers $\mathrm{TN}_i$, where $i \in \mathcal{Y}_k$ [17, 22]:
$$J := \mathrm{TPR}_k + \mathrm{TNR}_k - 1, \qquad (4)$$
where $\mathrm{TNR}_k$ denotes the TNR of known classes. However, as illustrated in Prop. 1, these two metrics suffer from an inconsistency property. Please refer to Appendix B.1 for the proof.
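As a concrete reading of Eqs. (1)-(4), the following sketch computes the macro and micro open-set F-score and Youden's index from the known-class counts (e.g., the first C entries of the per-class counts, as in the extended_counts sketch above). TNR_k is macro-averaged here by analogy with Eq. (2), and the whole block is illustrative rather than the authors' implementation.

```python
import numpy as np

def open_set_f_score(tp, fp, fn, average="macro"):
    """Open-set F-score of Eq. (1), with P_k and TPR_k from Eq. (2) or Eq. (3)."""
    if average == "macro":                    # Eq. (2)
        p_k, tpr_k = np.mean(tp / (tp + fp)), np.mean(tp / (tp + fn))
    else:                                     # micro aggregation, Eq. (3)
        p_k, tpr_k = tp.sum() / (tp + fp).sum(), tp.sum() / (tp + fn).sum()
    return 2 * p_k * tpr_k / (p_k + tpr_k)

def youden_index(tp, tn, fp, fn):
    """Youden's index of Eq. (4): TPR_k + TNR_k - 1 over the known classes."""
    return np.mean(tp / (tp + fn)) + np.mean(tn / (tn + fp)) - 1

# Note that neither metric looks at the counts of the unknown class C+1,
# which is the root of the inconsistency shown next.
```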
Proposition 1 (Inconsistency Property I). Given a dataset $S$ and a metric $M$ that is invariant to $\mathrm{TP}_{C+1}$, $\mathrm{FN}_{C+1}$ and $\mathrm{FP}_{C+1}$, then for any $(h, R)$ such that $\sum_{i=1}^{C} \mathrm{FP}_i(h, R) \geq \mathrm{TP}_{C+1}(h, R)$, there exists $(\tilde{h}, \tilde{R})$ such that $M(\tilde{h}, \tilde{R}) = M(h, R)$ but $\mathrm{TP}_{C+1}(\tilde{h}, \tilde{R}) = 0$.
Remark 1. If a metric $M$ suffers from the inconsistency property I, then for any $(h, R)$, we can construct $(\tilde{h}, \tilde{R})$ that performs as well as $(h, R)$ on $M$ but actually misclassifies all the open-set samples as known classes, which is inconsistent with (P2).
Remark 2. $\sum_{i=1}^{C} \mathrm{FP}_i(h, R) \geq \mathrm{TP}_{C+1}(h, R)$ is a mild condition. To be specific, when $\mathrm{TP}_{C+1}$ is $O(C)$, it only requires that $\mathrm{FP}_i$ is $O(1)$ for any $i \in \mathcal{Y}_k$. What's more, even if this condition does not hold, we still have $\mathrm{TP}_{C+1}(\tilde{h}, \tilde{R}) < \mathrm{TP}_{C+1}(h, R)$ as long as $\sum_{i=1}^{C} \mathrm{FP}_i(h, R) \neq 0$.
Corollary 1. Open-set F-score and Youden’s index both suffer from the inconsistency property I.
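A toy numeric illustration of Remark 1 and Corollary 1 with hypothetical counts: the construction trades each correctly rejected open-set sample for a known-class false positive of the same predicted class, so every TP_i, FP_i, FN_i (and TN_i) stays fixed while TP_{C+1} drops to zero, and any metric built solely from the known-class counts cannot tell the two models apart.

```python
import numpy as np

# Hypothetical counts for C = 3 known classes under (h, R).
tp = np.array([80, 75, 90])
fp = np.array([10,  5,  5])        # sum(FP_i) = 20
fn = np.array([ 8, 12,  6])
tp_open = 15                       # TP_{C+1}: the condition sum(FP_i) >= TP_{C+1} holds

# (h~, R~): classify the 15 rejected open-set samples into known classes and
# reject 15 matching known-class false positives in exchange; all known-class
# counts are unchanged, yet TP_{C+1}(h~, R~) = 0.
p, r = np.mean(tp / (tp + fp)), np.mean(tp / (tp + fn))
print(2 * p * r / (p + r))         # ~0.914: identical for (h, R) and (h~, R~)
```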
3.2 Normalized Accuracy
Normalized Accuracy (NAcc) [15] summarizes the accuracy performances on close-set and open-set:
$$\mathrm{NAcc} := \lambda_{na}\,\mathrm{AKS} + (1 - \lambda_{na})\,\mathrm{AUS}, \qquad (5)$$
where $\lambda_{na} \in (0, 1)$ is the balance constant, and
$$\mathrm{AKS} := \frac{\sum_{i=1}^{C}[\mathrm{TP}_i + \mathrm{TN}_i]}{\sum_{i=1}^{C}[\mathrm{TP}_i + \mathrm{TN}_i + \mathrm{FP}_i + \mathrm{FN}_i]}, \quad \mathrm{AUS} := \frac{\mathrm{TP}_{C+1}}{\mathrm{TP}_{C+1} + \mathrm{FP}_{C+1}} \qquad (6)$$
are the Accuracy on Known and Unknown Samples (AKS, AUS), respectively. Since the close-set performance is explicitly involved, NAcc avoids the inconsistency property I. Ideally, if $\lambda_{na} = \mathbb{P}[y = C+1]$, NAcc becomes exactly the close-set accuracy. However, it is generally hard to decide the balance constant $\lambda_{na}$ since we have no idea about the ratio of open-set samples in the test set. Besides, as shown in Prop. 2, this metric suffers from another type of inconsistency property. Please refer to Appendix B.2 for the proof.
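A direct transcription of Eqs. (5)-(6) as an illustrative sketch (tp, tn, fp, fn are the known-class arrays as before; tp_unk and fp_unk stand for TP_{C+1} and FP_{C+1}; the balance constant defaults to 0.5 purely for illustration):

```python
import numpy as np

def normalized_accuracy(tp, tn, fp, fn, tp_unk, fp_unk, lam_na=0.5):
    """NAcc of Eq. (5): a convex combination of AKS and AUS from Eq. (6)."""
    aks = (tp + tn).sum() / (tp + tn + fp + fn).sum()   # accuracy on known samples
    aus = tp_unk / (tp_unk + fp_unk)                    # accuracy on unknown samples
    return lam_na * aks + (1 - lam_na) * aus
```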
Proposition 2 (Inconsistency Property II). Given a dataset $S$, for any classifier-rejector pair $(h, R)$ such that $\sum_{i=1}^{C} \mathrm{FN}_i(h, R) \geq \mathrm{TP}_{C+1}(h, R)$ and $\mathrm{TP}_{C+1}(h, R) > \mathrm{FP}_{C+1}(h, R)$, there exists $(\tilde{h}, \tilde{R})$ such that $\mathrm{NAcc}(\tilde{h}, \tilde{R}) > \mathrm{NAcc}(h, R)$ but $\mathrm{TP}_{C+1}(\tilde{h}, \tilde{R}) = 0$.
Remark 3. For any $(h, R)$, we can construct $(\tilde{h}, \tilde{R})$ such that $\mathrm{NAcc}(\tilde{h}, \tilde{R}) > \mathrm{NAcc}(h, R)$ but $(\tilde{h}, \tilde{R})$ actually misclassifies all the open-set samples to known classes. In other words, NAcc encourages selecting a threshold that classifies more open-set samples to known classes, which is inconsistent with (P3).
Remark 4. Similar to the condition in Prop. 1, $\sum_{i=1}^{C} \mathrm{FN}_i(h, R) \geq \mathrm{TP}_{C+1}(h, R)$ is a mild condition. And $\mathrm{TP}_{C+1}(h, R) > \mathrm{FP}_{C+1}(h, R)$ is also mild since it is a basic requirement for open-set models.