trainable parameters.
Following the introduction of the transformer (Vaswani et al., 2017a), an influx of LLM architectures has continually advanced state-of-the-art (SOTA) performance on many natural language processing (NLP) tasks (Otter et al., 2021). These models are usually pre-trained on a general self-supervised learning task, after which they are fine-tuned for a specific task. Fine-tuning such a model can be computationally prohibitive due to the immense number of trainable parameters. Furthermore, Kaplan et al. (2020) found that the most important factor for LLM performance is likely model size, indicating that the development of even larger models is probable. Inspired by in-context prompting, Li and Liang (2021) proposed prefix-tuning as a parameter-efficient alternative to fine-tuning for natural language generation (NLG): the LLM's parameters are frozen, and trainable prefix tokens are prepended to the input sequence. Prefix-tuning has since been adapted to natural language understanding (NLU) and performs comparably to full fine-tuning across scales and tasks (Liu et al., 2022).
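As an illustrative sketch (not code from the cited works), the listing below shows the input-level form of this idea in PyTorch with HuggingFace Transformers: the BERT backbone is frozen and a small matrix of prefix embeddings is prepended to the token embeddings, so only the prefix receives gradient updates. Full prefix-tuning (Li and Liang, 2021; Liu et al., 2022) additionally injects prefix key/value vectors into every attention layer; the class name PrefixBert and the prefix length are illustrative choices.

```python
import torch
import torch.nn as nn
from transformers import BertModel


class PrefixBert(nn.Module):
    """Sketch of prefix-style tuning: freeze BERT and learn only a short
    sequence of prefix embeddings prepended to the input tokens."""

    def __init__(self, prefix_len: int = 16, model_name: str = "bert-base-uncased"):
        super().__init__()
        self.bert = BertModel.from_pretrained(model_name)
        for p in self.bert.parameters():  # freeze the entire backbone
            p.requires_grad = False
        hidden = self.bert.config.hidden_size
        # the prefix is the only trainable tensor in this sketch
        self.prefix = nn.Parameter(torch.randn(prefix_len, hidden) * 0.02)

    def forward(self, input_ids, attention_mask):
        batch = input_ids.size(0)
        # word embeddings only; BERT adds position/type embeddings internally
        tok_emb = self.bert.embeddings.word_embeddings(input_ids)
        prefix = self.prefix.unsqueeze(0).expand(batch, -1, -1)
        inputs_embeds = torch.cat([prefix, tok_emb], dim=1)
        # extend the attention mask to cover the prepended prefix positions
        prefix_mask = torch.ones(batch, self.prefix.size(0),
                                 dtype=attention_mask.dtype,
                                 device=attention_mask.device)
        mask = torch.cat([prefix_mask, attention_mask], dim=1)
        return self.bert(inputs_embeds=inputs_embeds, attention_mask=mask)
```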
We achieve SOTA results by augmenting the pre-training architecture of ADB open intent classification (Zhang et al., 2021a) with prefix-tuning. Combining prefix-tuning with fine-tuning only the last transformer layer was motivated by Kumar et al. (2022), who found that fine-tuning the entire model can distort pre-trained features. We find that, in isolation, both prefix-tuning and fine-tuning the last layer under-perform fine-tuning all of BERT, but that when trained in tandem they exceed full fine-tuning.
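The fragment below is a hypothetical illustration of this combination, reusing the PrefixBert sketch above rather than the exact training setup of this work: only the final encoder layer is unfrozen, so it is updated jointly with the prefix while all earlier layers remain fixed.

```python
# Hypothetical combination of the two strategies, reusing PrefixBert above:
# unfreeze only the final BERT encoder layer so it trains jointly with the
# prefix, while all earlier layers stay frozen.
model = PrefixBert(prefix_len=16)
for p in model.bert.encoder.layer[-1].parameters():
    p.requires_grad = True

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"trainable parameters: {trainable:,} of {total:,}")
```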
The rest of this paper is structured as follows: Section 2 summarizes prior work in both intent classification and parameter-efficient tuning (PET). Our methodology and model architecture are defined in Section 3. In Sections 4 and 5, respectively, we describe our experimental setup and the corresponding results, along with several ablations. We finish with a conclusion and a brief discussion of limitations and ethics.
2 Related Works
2.1 Financial Virtual Agents
The effectiveness of VAs has led to their adoption in the financial domain. Galitsky and Ilvovsky (2019) demonstrated an exemplary session with a financial VA in which the user queried for investment advice. CalFE leverages commercial chatbot frameworks to train a finance-specific VA (Khan and Rabbani, 2020). Ng et al. (2020) evaluated the impact of a financial VA's social presence on usage intention. All of these works require extracting intent from user utterances.
2.2 Intent Detection
Intent classification is a well-established NLU task, but most research limits the problem to known classes (Zhang et al., 2019; E et al., 2019; Qin et al., 2019; Zhang et al., 2021b). While having prior knowledge of all expected intents is ideal, this is rarely possible in a production environment, especially for new dialogue systems. More realistically, a subset of intents is anticipated and new intents are discovered after deployment. Brychcín and Král (2017) recognized the challenge of identifying intents prior to training and proposed an unsupervised method to group intents, but in doing so likely ignored information available in the already identified intents. Xia et al. (2018) employed zero-shot learning to identify emerging intents but used an LSTM, which is hindered by non-parallelized learning and difficulty propagating long-range dependencies. The same issue is present in DeepUnk, a BiLSTM-based intent classification method using margin loss (Lin and Xu, 2019). Zhan et al. (2021) shared our open intent classification problem formulation but synthetically generated out-of-domain samples for training, which may not be as realistic as a fine-grained open class representation.
Our work directly extends the ADB approach to establishing an open class representation (Zhang et al., 2021a). The novelty of our adaptation lies in leveraging prefix-tuning in combination with partial fine-tuning to improve the pre-training of known intent representations without drastically increasing the number of trainable parameters. In parallel with our work, Zhang et al. (2022) extended their ADB approach to learn distance-aware intent representations, which resulted in performance comparable to our modification of their original approach. However, our tuning method is model-agnostic and can easily be incorporated with their distance-aware representation learning, likely improving the SOTA further.
2.3 Parameter Efficient Tuning
The desire for PET quickly emerged following
the introduction of LLMs. Adapter modules in-