Learning Better Intent Representations for Financial Open Intent Classification
Xianzhi Li1,2, Will Aitken1,2, Xiaodan Zhu1,2 and Stephen W. Thomas3
1Department of Electrical and Computer Engineering, Queen’s University
2Ingenuity Labs Research Institute, Queen’s University
3Smith School of Business, Queen’s University
{21xl17, will.aitken, xiaodan.zhu, stephen.thomas}@queensu.ca
Abstract
With the recent surge of NLP technologies in the financial domain, banks and other financial entities have adopted virtual agents (VAs) to assist customers. A challenging problem for VAs in this domain is determining a user's reason or intent for contacting the VA, especially when the intent was unseen or open during the VA's training. One method for handling open intents is adaptive decision boundary (ADB) post-processing, which learns tight decision boundaries from intent representations to separate known and open intents. We propose incorporating two methods for supervised pre-training of intent representations: prefix-tuning and fine-tuning just the last layer of a large language model (LLM). With this proposal, our accuracy is 1.63%-2.07% higher than the prior state-of-the-art ADB method for open intent classification on the banking77 benchmark, among others. Notably, we supplement the original ADB model with only 0.1% additional trainable parameters. Ablation studies also show that our method yields better results than fine-tuning the entire model. We hypothesize that our findings could stimulate a new optimal method of downstream tuning that combines parameter efficient tuning modules with fine-tuning a subset of the base model's layers.
1 Introduction
As the popularity of virtual agent (VA) dialogue systems increases and their application in the finance domain is explored, the problem of intent classification demands greater attention. Several recent finance-specific VAs leverage technical advancements to respond to natural language queries (Galitsky and Ilvovsky, 2019; Khan and Rabbani, 2020). Determining the user's intent ensures that the VA can appropriately tailor its responses and/or perform relevant actions. Initial works in intent classification limited the task to classifying utterances as one of N known intents and achieved high accuracy (Weld et al., 2021). However, as depicted in Table 1, real-world applications often encounter intents unseen in the training data that can be considered open in the current context. Accounting for the open class establishes an (N + 1)-class classification task (Shu et al., 2017), where the open class is used as the label for any unidentified intent.

Utterance | Label
When will I get my card? | Card Arrival
What exchange rates do you offer? | Exchange Rate
My card hasn't arrived yet. | Card Arrival
Is it a good time to exchange? | Exchange Rate
... | ...
Is it possible to get a refund? | Open
Why has my withdrawal not posted? | Open

Table 1: Example user utterances and associated intent labels from the banking77 dataset (Casanueva et al., 2020). In this example, only the Card Arrival and Exchange Rate intents were known during training, so refund- and withdrawal-related requests are Open intents in this context.
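To make the (N + 1)-class setup concrete, a minimal sketch of such a label space is shown below. The intent names and the helper function are illustrative (simplified from the banking77 label strings), not code from the ADB implementation.

# Minimal sketch of an (N + 1)-class label space: N known intents plus
# one reserved index for the open class. Intent names are illustrative.
KNOWN_INTENTS = ["card_arrival", "exchange_rate"]        # the N known classes
LABEL2ID = {name: idx for idx, name in enumerate(KNOWN_INTENTS)}
OPEN_ID = len(KNOWN_INTENTS)                             # index N = "open"

def label_to_id(label: str) -> int:
    """Map a known intent to its index; anything unrecognized becomes open."""
    return LABEL2ID.get(label, OPEN_ID)

print(label_to_id("card_arrival"))   # 0
print(label_to_id("get_refund"))     # 2 -> open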
An optimal classifier for this problem must balance correctly labelling known-class utterances while avoiding mistakenly classifying open utterances as one of the known classes. Zhang et al. (2021a) address this problem by proposing a novel loss function to learn an adaptive decision boundary (ADB) for each known intent. At inference, samples that do not fall within any ADB are classified as open. Compact intent representations are required as input for the ADB post-processing learning step, and in the case of Zhang et al. (2021a) the representations are learnt by fine-tuning the last layer of BERT (Devlin et al., 2019). Since most intent classification methods require post-processing on intent representations, our work focuses on deriving richer representations by leveraging large language models (LLMs) in an efficacious manner while still minimizing trainable parameters.
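As a rough illustration of this inference rule, the sketch below labels an utterance as open when its representation falls outside every class boundary. The tensor names, shapes, and the Euclidean distance are our assumptions for illustration rather than the exact ADB implementation of Zhang et al. (2021a).

import torch

def adb_predict(z, centroids, radii, open_id):
    # z:         (d,) intent representation of a single utterance
    # centroids: (N, d) per-class centroids from the pre-trained representations
    # radii:     (N,) learned adaptive decision-boundary radii, one per known class
    dists = torch.norm(z - centroids, dim=1)   # distance to every known-class centroid
    k = int(torch.argmin(dists))               # nearest known class
    # Inside the nearest class's boundary -> that class; otherwise -> open.
    return k if dists[k] <= radii[k] else open_id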
Following the introduction of the transformer (Vaswani et al., 2017a), an influx of LLM architectures has continually progressed state-of-the-art (SOTA) performance on many natural language processing (NLP) tasks (Otter et al., 2021). Usually these models are pre-trained on a general self-supervised learning task, after which they are fine-tuned for a specific task. Fine-tuning such a model can be computationally prohibitive due to the immense number of trainable parameters. Furthermore, Kaplan et al. (2020) found that the most important factor for LLM performance is likely model size, indicating that the development of even larger models is probable. Inspired by in-context prompting, Li and Liang (2021) proposed prefix-tuning as a parameter efficient alternative to fine-tuning for natural language generation (NLG). The LLM's parameters are frozen and trainable prefix tokens are prepended to the input sequence. Prefix-tuning has since been adapted to natural language understanding (NLU) and performs comparably to full fine-tuning across scales and tasks (Liu et al., 2022).
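The sketch below illustrates the general idea with the simplest input-level variant: the backbone is frozen and only a small matrix of prepended virtual-token embeddings is trained. Prefix-tuning proper (and its NLU adaptation) instead injects trainable key-value prefixes into every attention layer; the class below is an assumed, simplified illustration built on HuggingFace's BertModel, not our exact training code.

import torch
import torch.nn as nn
from transformers import BertModel

class PrefixBertEncoder(nn.Module):
    """Frozen BERT with trainable soft-prefix embeddings prepended to the input."""

    def __init__(self, model_name="bert-base-uncased", prefix_len=16):
        super().__init__()
        self.bert = BertModel.from_pretrained(model_name)
        for p in self.bert.parameters():        # freeze the backbone LLM
            p.requires_grad = False
        hidden = self.bert.config.hidden_size
        self.prefix = nn.Parameter(torch.randn(prefix_len, hidden) * 0.02)

    def forward(self, input_ids, attention_mask):
        tokens = self.bert.get_input_embeddings()(input_ids)          # (B, T, H)
        prefix = self.prefix.unsqueeze(0).expand(tokens.size(0), -1, -1)
        embeds = torch.cat([prefix, tokens], dim=1)                   # prepend prefix
        prefix_mask = torch.ones(
            tokens.size(0), self.prefix.size(0),
            dtype=attention_mask.dtype, device=attention_mask.device)
        mask = torch.cat([prefix_mask, attention_mask], dim=1)
        out = self.bert(inputs_embeds=embeds, attention_mask=mask)
        # Representation of the first real token (the original [CLS] position).
        return out.last_hidden_state[:, self.prefix.size(0), :]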
We achieve SOTA results by augmenting the pre-training architecture of ADB open intent classification (Zhang et al., 2021a) with prefix-tuning. The combination of prefix-tuning with fine-tuning only the last transformer layer was motivated by Kumar et al. (2022), who discovered that fine-tuning the entire model can distort pre-trained features. We find that, on their own, both prefix-tuning and fine-tuning the last layer under-perform fine-tuning all of BERT, but that the two trained in tandem exceed full fine-tuning.
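Continuing the sketch above, this tandem setup can be approximated by keeping the soft prefix trainable and additionally unfreezing only the final transformer layer. The parameter-name pattern below assumes HuggingFace's bert-base-uncased (12 encoder layers) and is a sketch of the idea, not our exact training script.

def unfreeze_prefix_and_last_layer(model: PrefixBertEncoder):
    # Freeze everything first, then re-enable gradients for (a) the soft prefix
    # and (b) the last of BERT's 12 encoder layers.
    for p in model.parameters():
        p.requires_grad = False
    model.prefix.requires_grad = True
    for name, p in model.bert.named_parameters():
        if name.startswith("encoder.layer.11."):
            p.requires_grad = True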
The rest of this paper is structured as follows: Section 2 summarizes prior works in both intent classification and parameter efficient tuning (PET). Our methodology and model architecture are defined in Section 3. In Sections 4 and 5, respectively, we describe our experimental setup and report the corresponding results, along with several ablations. We finish with a conclusion and a brief discussion regarding limitations and ethics.
2 Related Works
2.1 Financial Virtual Agents
The effectiveness of VAs has led to their adoption in the financial domain. Galitsky and Ilvovsky (2019) demonstrated an exemplary session with a financial VA in which the user queried for investment advice. CalFE leverages commercial chatbot frameworks to train a finance-specific VA (Khan and Rabbani, 2020). Ng et al. (2020) evaluate the impact of a VA's social presence on usage intention in financial VAs. All of these works require extracting intent from user utterances.
2.2 Intent Detection
Intent classification is a well-established NLU task, but most research limits the problem to known classes (Zhang et al., 2019; E et al., 2019; Qin et al., 2019; Zhang et al., 2021b). While having prior knowledge of all expected intents is ideal, this is rarely possible in a production environment, especially for new dialogue systems. More realistically, a subset of intents is anticipated and new intents are discovered after deployment. Brychcín and Král (2017) recognized the challenge of identifying intents prior to training and proposed an unsupervised method to group intents, but in doing so likely ignored information available in the already identified intents. Xia et al. (2018) employed zero-shot learning to identify emerging intents but used an LSTM, which is hindered by non-parallelized learning and challenges in propagating long-range dependencies. The same issue is present in DeepUnk, a BiLSTM-based intent classification method using margin loss (Lin and Xu, 2019). Zhan et al. (2021) shared our open intent classification problem formulation but synthetically generated out-of-domain samples for training, which may not be as realistic as a fine-grained open class representation.
Our work directly extends the ADB approach to establishing an open class representation (Zhang et al., 2021a). The novelty of our adaptation is in leveraging prefix-tuning in combination with partial fine-tuning to improve the pre-training of known intent representations without drastically increasing the number of trainable parameters. In parallel with our work, Zhang et al. (2022) extended their ADB approach to learn distance-aware intent representations. Doing so resulted in performance comparable to our modification of their original approach. However, our tuning method is model-agnostic and can easily be incorporated with their distance-aware representation learning, likely improving the SOTA further.
2.3 Parameter Efficient Tuning
The desire for PET quickly emerged following the introduction of LLMs. Adapter modules in-