
UniNL: Aligning Representation Learning with Scoring Function for
OOD Detection via Unified Neighborhood Learning
Yutao Mou1∗, Pei Wang1∗, Keqing He2∗, Yanan Wu1
Jingang Wang2, Wei Wu2, Weiran Xu1∗
1Beijing University of Posts and Telecommunications, Beijing, China
2Meituan, Beijing, China
{myt,wangpei,yanan.wu,xuweiran}@bupt.edu.cn
{hekeqing,wangjingang,wuwei}@meituan.com
Abstract
Detecting out-of-domain (OOD) intents from
user queries is essential for avoiding wrong
operations in task-oriented dialogue systems.
The key challenge is how to distinguish in-
domain (IND) and OOD intents. Previous
methods ignore the alignment between repre-
sentation learning and scoring function, limit-
ing the OOD detection performance. In this pa-
per, we propose a unified neighborhood learn-
ing framework (UniNL) to detect OOD in-
tents. Specifically, we design a K-nearest
neighbor contrastive learning (KNCL) objec-
tive for representation learning and introduce
a KNN-based scoring function for OOD detec-
tion. We aim to align representation learning
with the scoring function. Experiments and analysis on two benchmark datasets show the effectiveness of our method.1

∗The first three authors contributed equally. Weiran Xu is the corresponding author.
1We release our code at https://github.com/Yupei-Wang/UniNL.
1 Introduction
Out-of-domain (OOD) intent detection aims to
know when a user query falls outside the range
of pre-defined supported intents, which helps to
avoid performing wrong operations and provide
potential directions of future development in a task-
oriented dialogue system (Akasaki and Kaji, 2017; Tulshan and Dhage, 2018; Shum et al., 2018; Lin and Xu, 2019; Xu et al., 2020; Zeng et al., 2021a,b).
Compared with normal intent detection tasks, we do not know the exact number of unknown intents and lack labeled data for them, which makes it challenging to identify OOD samples in task-oriented dialog.
Previous OOD detection works can be generally classified into two types: supervised (Fei and Liu, 2016; Kim and Kim, 2018; Larson et al., 2019; Zheng et al., 2020) and unsupervised (Bendale and Boult, 2016; Hendrycks and Gimpel, 2017; Shu et al., 2017; Lee et al., 2018; Ren et al., 2019; Lin and Xu, 2019; Xu et al., 2020; Zeng et al., 2021a) OOD detection. The former assumes that extensive labeled OOD samples are available in the training data: Fei and Liu (2016) and Larson et al. (2019) formulate an (N+1)-class classification problem where the (N+1)-th class represents OOD intents, and Zheng et al. (2020) further use labeled OOD data to generate an entropy regularization term. However, these methods require numerous labeled OOD intents to achieve superior performance, which is unrealistic. We focus on the unsupervised OOD detection setting, where labeled OOD samples are not available for training. Unsupervised OOD detection first learns discriminative representations using only labeled IND data and then employs a scoring function, such as Maximum Softmax Probability (MSP) (Hendrycks and Gimpel, 2017), Local Outlier Factor (LOF) (Lin and Xu, 2019), or Gaussian Discriminant Analysis (GDA) (Xu et al., 2020), to estimate the confidence score of a test query.
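To make the two-step pipeline concrete, here is a minimal sketch of the MSP scoring step, assuming a classifier over the N IND intents has already been trained; the threshold value and tensor shapes are illustrative and would in practice be tuned on IND validation data.

```python
import torch
import torch.nn.functional as F

def msp_score(logits: torch.Tensor) -> torch.Tensor:
    """Maximum Softmax Probability (MSP): the largest softmax probability
    over the N in-domain classes, used as a confidence score."""
    return F.softmax(logits, dim=-1).max(dim=-1).values

def detect_ood(logits: torch.Tensor, threshold: float = 0.5) -> torch.Tensor:
    """Flag queries whose confidence falls below the threshold as OOD.
    The threshold is typically chosen on IND validation data."""
    return msp_score(logits) < threshold
```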
All these unsupervised OOD detection methods focus on improving a single aspect, either representation learning or the scoring function, but none of them consider how to align the two. For example, Lin and Xu (2019) propose a local outlier factor (LOF) for OOD detection, which uses the local density around a test query to decide whether it belongs to an OOD intent, but their IND pre-training objective LMCL (Wang et al., 2018) cannot learn neighborhood-discriminative representations. Xu et al. (2020) and Zeng et al. (2021a) employ Gaussian discriminant analysis for OOD detection, which assumes that each IND cluster follows a Gaussian distribution, but they use a cross-entropy or supervised contrastive learning (Khosla et al., 2020) objective for representation learning, which cannot guarantee that this assumption is satisfied. The gap between representation learning and the scoring function limits the overall performance of these methods.
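For contrast with the density- and distribution-based scores above, a nearest-neighbor score of the kind mentioned in the abstract can be sketched as follows; this is a minimal illustration assuming L2-normalized encoder features and a stored bank of IND training features, with the choice of k and the cosine-distance convention picked for exposition rather than taken from the paper.

```python
import torch
import torch.nn.functional as F

def knn_score(query_feats: torch.Tensor, ind_bank: torch.Tensor, k: int = 10) -> torch.Tensor:
    """KNN-based confidence score.

    query_feats: (Q, D) encoder features of test queries.
    ind_bank:    (N, D) encoder features of labeled IND training utterances.
    Returns the negative cosine distance to each query's k-th nearest IND
    neighbor; lower (more negative) scores indicate likely OOD queries.
    """
    q = F.normalize(query_feats, dim=-1)
    bank = F.normalize(ind_bank, dim=-1)
    dist = 1.0 - q @ bank.t()                                # (Q, N) cosine distances
    kth = dist.topk(k, dim=-1, largest=False).values[:, -1]  # distance to k-th nearest neighbor
    return -kth
```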