UniNL: Aligning Representation Learning with Scoring Function for
OOD Detection via Unified Neighborhood Learning
Yutao Mou1, Pei Wang1, Keqing He2, Yanan Wu1
Jingang Wang2, Wei Wu2, Weiran Xu1
1Beijing University of Posts and Telecommunications, Beijing, China
2Meituan, Beijing, China
{myt,wangpei,yanan.wu,xuweiran}@bupt.edu.cn
{hekeqing,wangjingang,wuwei}@meituan.com
Abstract
Detecting out-of-domain (OOD) intents from
user queries is essential for avoiding wrong
operations in task-oriented dialogue systems.
The key challenge is how to distinguish in-
domain (IND) and OOD intents. Previous
methods ignore the alignment between repre-
sentation learning and scoring function, limit-
ing the OOD detection performance. In this pa-
per, we propose a unified neighborhood learn-
ing framework (UniNL) to detect OOD in-
tents. Specifically, we design a K-nearest
neighbor contrastive learning (KNCL) objec-
tive for representation learning and introduce
a KNN-based scoring function for OOD detec-
tion. We aim to align representation learning
with scoring function. Experiments and analy-
sis on two benchmark datasets show the effec-
tiveness of our method. 1
1 Introduction
Out-of-domain (OOD) intent detection aims to
know when a user query falls outside the range
of pre-defined supported intents, which helps to
avoid performing wrong operations and provide
potential directions of future development in a task-oriented dialogue system (Akasaki and Kaji, 2017; Tulshan and Dhage, 2018; Shum et al., 2018; Lin and Xu, 2019; Xu et al., 2020; Zeng et al., 2021a,b).
Compared with standard intent detection, we neither know the exact number of unknown intents nor have labeled data for them, which makes it challenging to identify OOD samples in task-oriented dialogue.
The first three authors contributed equally. Weiran Xu is the corresponding author.
1 We release our code at https://github.com/Yupei-Wang/UniNL

Previous OOD detection works can generally be classified into two types: supervised (Fei and Liu, 2016; Kim and Kim, 2018; Larson et al., 2019; Zheng et al., 2020) and unsupervised (Bendale and Boult, 2016; Hendrycks and Gimpel, 2017; Shu et al., 2017; Lee et al., 2018; Ren et al., 2019; Lin and Xu, 2019; Xu et al., 2020; Zeng et al., 2021a) OOD detection. The former indicates that
there are extensive labeled OOD samples in the training data. Fei and Liu (2016) and Larson et al. (2019) formulate OOD detection as an (N+1)-class classification problem where the (N+1)-th class represents OOD intents. Further, Zheng et al. (2020) use labeled OOD data to construct an entropy regularization term. However, these methods require numerous labeled OOD samples to achieve strong performance, which is unrealistic. We focus on the unsupervised OOD detection setting, where labeled OOD samples are not available for training. Unsupervised OOD detection first learns discriminative representations using only labeled IND data and then employs a scoring function, such as Maximum Softmax Probability (MSP) (Hendrycks and Gimpel, 2017), Local Outlier Factor (LOF) (Lin and Xu, 2019), or Gaussian Discriminant Analysis (GDA) (Xu et al., 2020), to estimate the confidence score of a test query.
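To make the scoring-function side concrete, here is a minimal sketch of the MSP score; the function name, example logits, and threshold convention are illustrative, not from the paper:

```python
import numpy as np

def msp_score(logits: np.ndarray) -> float:
    """Maximum Softmax Probability (Hendrycks and Gimpel, 2017):
    the confidence score is the largest softmax probability."""
    shifted = logits - np.max(logits)              # for numerical stability
    probs = np.exp(shifted) / np.sum(np.exp(shifted))
    return float(np.max(probs))

# A confident IND prediction yields a high score; a near-uniform
# distribution over the IND classes yields a low score (likely OOD).
confident = msp_score(np.array([6.0, 0.5, 0.2]))
uncertain = msp_score(np.array([1.1, 1.0, 0.9]))
```

A query is flagged as OOD when its score falls below a threshold tuned on validation data.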
All these unsupervised OOD detection methods focus on improving a single aspect, either representation learning or the scoring function, but none of them consider how to align representation learning with the scoring function. For example, Lin and Xu (2019) propose a local outlier factor for OOD detection, which uses the local density around a test query to determine whether it belongs to an OOD intent, but their IND pre-training objective LMCL (Wang et al., 2018) does not learn neighborhood-discriminative representations. Xu et al. (2020) and Zeng et al. (2021a) employ Gaussian discriminant analysis for OOD detection, which assumes that each IND cluster follows a Gaussian distribution, but they use a cross-entropy or supervised contrastive learning (Khosla et al., 2020) objective for representation learning, which cannot guarantee that this assumption holds. The gap between representation learning and the scoring function limits the overall performance of these methods.
arXiv:2210.10722v1 [cs.CL] 19 Oct 2022
To resolve this conflict, in this paper we propose a Unified Neighborhood Learning framework (UniNL) for OOD detection, which aims to align IND pre-training representation objectives with OOD scoring functions. Our intuition is to learn neighborhood knowledge (Breunig et al., 2000a) to detect OOD intents. For IND pre-training, we introduce a K-Nearest Neighbor Contrastive Learning (KNCL) objective to learn neighborhood-discriminative representations. Compared to SCL (Zeng et al., 2021a), which draws all samples of the same class closer, KNCL only pulls together similar samples within a neighborhood. To align with KNCL, we further propose a K-nearest neighbor scoring function, which estimates a test sample's confidence score by computing the average distance between the test sample and its K nearest neighbors. The KNCL objective learns neighborhood-discriminative knowledge, which directly benefits the KNN-based scoring function.
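The KNN-based scoring idea can be sketched as follows; this is a minimal illustration assuming Euclidean distance and mock IND features (the names and data are ours, not the paper's exact formulation):

```python
import numpy as np

def knn_confidence(query: np.ndarray, ind_features: np.ndarray, k: int = 5) -> float:
    """Confidence score as the negative average distance between the query
    and its k nearest IND training features: the closer the query sits to
    an IND neighborhood, the higher its confidence."""
    dists = np.linalg.norm(ind_features - query, axis=1)
    k_nearest = np.sort(dists)[:k]
    return -float(np.mean(k_nearest))

# Mock IND feature cluster around the origin; a query inside the cluster
# scores higher than one far away, so low-scoring queries are flagged as OOD.
rng = np.random.default_rng(0)
ind_features = rng.normal(loc=0.0, scale=0.1, size=(50, 4))
ind_query = np.zeros(4)
ood_query = np.full(4, 5.0)
```

In practice the OOD threshold on this score would be chosen on a validation set.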
Our contributions are three-fold: (1) We pro-
pose a unified neighborhood learning framework
(UniNL) for OOD detection, which aims to match
IND pre-training objectives with OOD scoring
functions. (2) We propose a K-nearest neighbor
contrastive learning (KNCL) objective for IND
pre-training to learn neighborhood discriminative
knowledge, and a KNN-based scoring function to
detect OOD intents. (3) Experiments and analysis
demonstrate the effectiveness of our method for
OOD detection.
2 Approach
Overall Architecture
Figure 1 shows the overall architecture of our proposed UniNL, which includes K-nearest neighbor contrastive learning (KNCL) and a KNN-based scoring function. In the training stage, we first train an in-domain intent classifier using our KNCL objective, aiming to learn neighborhood-discriminative representations. Then, in the test stage, we extract the intent feature of a test query and employ our proposed KNN-based scoring function to estimate its confidence score. We aim to align representation learning with the scoring function.
KNN Contrastive Representation Learning
Existing OOD detection methods generally adopt cross-entropy (CE) (Xu et al., 2020) and supervised contrastive learning (SCL) (Zeng et al., 2021a) objectives for representation learning. Both CE and SCL tend to bring all samples of the same class closer while pushing samples of different classes apart; they learn inter-class discriminative features in a global representation space.

Figure 1: Overall architecture of UniNL.

However, we find that for OOD detection we care more about the data distribution within the neighborhood of a given sample. Inspired by Breunig et al. (2000b), we aim to learn neighborhood-discriminative knowledge in the IND pre-training stage to facilitate OOD detection. We propose a K-nearest neighbor contrastive learning (KNCL) objective to learn discriminative features in a local representation space. Given an IND sample
$x_i$, we first obtain its intent representation $z_i$ using a BiLSTM (Hochreiter and Schmidhuber, 1997) or BERT (Devlin et al., 2019) encoder. Next, we perform KNCL as follows:

$$\mathcal{L}_{\mathrm{KNCL}} = -\sum_{i=1}^{N} \frac{1}{|\mathcal{N}_k(i)|} \sum_{j \in \mathcal{N}_k(i)} \mathbb{1}_{i \neq j}\, \mathbb{1}_{y_i = y_j} \log \frac{\exp(z_i \cdot z_j)}{\sum_{k' \in \mathcal{N}_k(i)} \mathbb{1}_{i \neq k'} \exp(z_i \cdot z_{k'})} \quad (1)$$

where $\mathcal{N}_k(i)$ is the KNN set of $z_i$ in the representation space. KNCL only draws together samples of the same class within the neighborhood.
Specifically, given an anchor, KNCL first finds its
KNN set in a batch, and then selects samples of
the same class as positives, and different classes as
negatives. Similar to Zeng et al. (2021a), we use
an adversarial augmentation strategy to generate
augmented views of the original samples within
a batch. In the implementation, we first pre-train
the intent classifier using KNCL, then finetune the
model using CE, both on the IND data. We leave
the implementation details in the appendix. Sec-
tion 3.3 proves that KNCL learns neighborhood
discriminative knowledge and helps to distinguish
IND from OOD.
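A simplified sketch of the KNCL objective of Eq. (1) on a toy batch follows. This is our own numpy illustration: it omits the adversarial augmentation step, uses hypothetical names, and writes the loss with an explicit negative sign so that minimizing it raises the similarity of same-class neighbors relative to the rest of the neighbor set.

```python
import numpy as np

def kncl_loss(z: np.ndarray, y: np.ndarray, k: int = 3) -> float:
    """K-nearest neighbor contrastive loss (Eq. 1, simplified): for each
    anchor, the positives are same-class samples inside its k-nearest-
    neighbor set; the partition function runs over the whole set."""
    n = len(z)
    total = 0.0
    for i in range(n):
        dists = np.linalg.norm(z - z[i], axis=1)
        dists[i] = np.inf                    # exclude the anchor itself
        nbrs = np.argsort(dists)[:k]         # the KNN set N_k(i)
        sims = np.exp(z[nbrs] @ z[i])        # exp(z_i . z_j) per neighbor
        denom = sims.sum()
        for j, idx in enumerate(nbrs):
            if y[idx] == y[i]:               # same-class neighbors are positives
                total -= np.log(sims[j] / denom) / len(nbrs)
    return total / n

# Toy batch: two well-separated 2-D clusters, one per class, so every
# anchor's neighbors share its label and all terms contribute.
z = np.array([[1.0, 0.0], [0.9, 0.1], [1.1, -0.1], [1.0, 0.1],
              [-1.0, 0.0], [-0.9, -0.1], [-1.1, 0.1], [-1.0, -0.1]])
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])
loss = kncl_loss(z, y, k=3)
```

In training this per-batch loss would be minimized by gradient descent on the encoder producing `z`.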
KNN-based Score Function
To align with the
KNCL representation learning objective, we pro-