PENTATRON: PErsonalized coNText-Aware Transformer for
Retrieval-based cOnversational uNderstanding
Niranjan Uma Naresh  Ziyan Jiang  Ankit
Sungjin Lee  Jie Hao  Xing Fan  Chenlei Guo
Amazon
{niumanar, ziyjiang, ankitvys, sungjinl, jieha, fanxing, guochenl}@amazon.com
Abstract
Conversational understanding is an integral part of modern intelligent devices. In a large fraction of the global traffic from people using smart digital assistants, frictions in dialogues may be attributed to incorrect understanding of the entities in a user’s query, due to factors including ambiguous mentions, mispronunciation, background noise, and faulty on-device signal processing. Such errors are compounded by two common deficiencies of intelligent devices, namely (1) the device not being tailored to individual users, and (2) the device responses being unaware of the context in the conversation session. Viewing this problem through the lens of retrieval-based search engines, we build and evaluate a scalable entity correction system, PENTATRON. The system leverages a parametric transformer-based language model to learn patterns from in-session user-device interactions, coupled with a non-parametric personalized entity index to compute the correct query, which aids downstream components in reasoning about the best response. In addition to establishing baselines and demonstrating the value of personalized and context-aware systems, we use multitasking to learn the domain of the correct entity. We also investigate the utility of language model prompts. Through extensive experiments, we show a significant upward movement of the key metric (Exact Match) by up to 500.97% (relative to the baseline).
1 Introduction
Intelligent devices are ubiquitous in modern computing. The scientific modules that drive these devices involve conversational understanding, ambient computing, natural language reasoning, and self-learning (Thoppilan et al., 2022; Sarikaya, 2022; Pinhanez et al., 2021; Liu et al., 2021). A
user’s interaction with a device, however, is susceptible to errors arising from a myriad of sources, including incorrect pronunciation, inaccuracies in the subject mentions in a sentence, environmental noise, and hardware and software errors (Kim et al., 2020). Correct interpretation of user queries, and especially of entities, is central to delivering the best user experience.

* Equal contribution.

Two important factors that contribute
strongly to high-precision entity recognition are
(1) personalization, i.e., learning users’ unique patterns, and (2) contextualization, i.e., deriving cues
from the information in a user-device interaction
session. In this paper, we design and evaluate an
entity correction system, PENTATRON, with both
personalization and contextualization baked into
its architecture.
Figure 1: (Above) One multi-turn dialogue session with a defective source query containing the erroneous entity ‘wallace’, and its successful rephrase with the correct entity ‘wallows’. (Below) Concatenation of queries and responses using special tokens to form a single sequence as encoder input.
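The lower panel of Figure 1 can be sketched as a simple concatenation step. Note that the specific token names below ([CLS], [SEP], [USR], [SYS]) are illustrative assumptions; the paper does not name its special tokens here.

```python
def build_encoder_input(turns):
    """Concatenate alternating user queries and device responses from one
    session into a single sequence, marking each turn with a special token.

    `turns` is a list of (speaker, text) pairs in session order, where
    speaker is "user" or "device". Token names are assumptions.
    """
    parts = ["[CLS]"]
    for speaker, text in turns:
        marker = "[USR]" if speaker == "user" else "[SYS]"
        parts.append(f"{marker} {text}")
    parts.append("[SEP]")
    return " ".join(parts)

session = [
    ("user", "play wallace"),
    ("device", "playing music by wallace"),
    ("user", "play are you bored yet by the wallows"),
]
encoded = build_encoder_input(session)
```

The resulting single string is what the transformer encoder would consume, keeping the full session context in one sequence.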
arXiv:2210.12308v1 [cs.LG] 22 Oct 2022
1.1 Motivation
In Figure 1, we illustrate a real-world case of why personalization and contextualization are so important, especially given the specificity of highly entity-centric domains such as music. In this case, masking the very last device response, we observe that valuable information is scattered across the user’s requests in the session; yet the device delivers a sub-par experience, responding defectively multiple times before finally getting the user’s intent right.
1.2 Notation and Preliminaries
Definition 1. Let integer $\gamma$ satisfy $1 \leq \gamma < \infty$. A natural language (NL) hypothesis is a mapping $h : Q \to D \times I \times [E]^{\gamma}$, where $Q$ refers to the query space, $D$ refers to the domain space, $I$ refers to the intent space, and $E$ refers to the entity space. The entity space, $E := E_T \times E_V$, may further be decomposed into the entity type space $E_T$ and the entity value space $E_V$. All spaces are defined over Unicode strings.
As an example, given a query string $q =$ “play the real slim shady”, the corresponding NL hypothesis is $h(q) =$ (Music, PlayMusicIntent, [(SongName, the real slim shady)]), where the domain is Music, the intent is PlayMusicIntent, and the entity value is the real slim shady with the SongName entity type.
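As a minimal illustration of Definition 1, an NL hypothesis can be represented as a plain tuple of domain, intent, and (entity type, entity value) pairs. The lookup table below is a hypothetical toy stand-in for a real NL understanding model, included only to make the types concrete.

```python
def h(q: str):
    """Toy NL hypothesis function: maps a query string to
    (domain, intent, [(entity_type, entity_value), ...]).
    The hard-coded table is an illustrative stand-in, not a real model.
    """
    toy_nlu = {
        "play the real slim shady": (
            "Music",
            "PlayMusicIntent",
            [("SongName", "the real slim shady")],
        ),
    }
    return toy_nlu[q]

domain, intent, entities = h("play the real slim shady")
```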
Definition 2. Building on Definition 1, our system, PENTATRON, may be formalized as a mapping $\Phi : C \times Q \to E_V$, where $C$ is the user space (anonymized using a hash function, for privacy, in practice).
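The non-parametric side of $\Phi$ can be sketched as a personalized index keyed by a hashed (anonymized) user identifier, mapping to that user’s historical entity values. The choice of SHA-256 and the flat-list index layout below are assumptions for illustration; the paper does not specify them.

```python
import hashlib

def anonymize(user_id: str) -> str:
    # One-way hash so raw user identifiers never appear in the index
    # (SHA-256 is an assumed choice of hash function).
    return hashlib.sha256(user_id.encode("utf-8")).hexdigest()

# Personalized entity index: hashed user -> entity values
# aggregated from that user's historical interactions.
index = {}

def add_entities(user_id, entities):
    index.setdefault(anonymize(user_id), []).extend(entities)

def lookup(user_id):
    return index.get(anonymize(user_id), [])

add_entities("alice", ["the wallows", "the real slim shady"])
```

Because the index is keyed only by hashes, the raw identifier never needs to be stored alongside the entity lists.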
In a nutshell, given an input query $q$ (with or without dialogue context), our system essentially solves the optimization problem
$$\min_{\theta} \; \mathbb{E}_{(c,q,e) \sim \mathcal{D}} \left[ \ell(\Phi_{\theta}(c, q), e) \right] \tag{1}$$
where $\mathcal{D}$ is supported on $C \times Q \times E_V$ and $\ell$ is a suitable loss function.
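In practice, the expectation in Eq. (1) is approximated by an empirical average over training triples $(c, q, e)$. The sketch below uses a 0/1 loss, which corresponds directly to optimizing Exact Match; the actual training loss used by the system is not specified here, so this is purely illustrative.

```python
def zero_one_loss(predicted_entity, gold_entity):
    # 0/1 loss: an assumed stand-in for the unspecified training loss.
    return 0.0 if predicted_entity == gold_entity else 1.0

def empirical_risk(model, data):
    """Empirical version of Eq. (1).

    data: list of (user, query, gold_entity) triples;
    model(user, query) returns a predicted entity value.
    """
    losses = [zero_one_loss(model(c, q), e) for c, q, e in data]
    return sum(losses) / len(losses)

# Trivial stand-in model that always predicts "wallows".
constant_model = lambda c, q: "wallows"
data = [("u1", "play wallace", "wallows"), ("u1", "play hello", "hello")]
risk = empirical_risk(constant_model, data)
```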
1.3 Our Contributions and Preview of
Results
On the system design front, we build a retrieval-based pipeline. Our model backbone is inspired by attention-based (Vaswani et al., 2017) transformer encoders (Devlin et al., 2018). We achieve personalization via a non-parametric index, which is essentially a key-value look-up table whose keys represent users and whose values represent the entity lists derived from historical data aggregation. With respect to experimental results, we conduct extensive studies on seven different versions of PENTATRON, involving ablations with prompts, multi-tasking, and non-contextual training data, and show consistent improvements in Exact Match (EM) of up to 500.97% (relative to the baseline), as captured by the preview of results in Figure 2.

Figure 2: Preview of the system performance, which shows consistent, significant improvement in going from a purely personalized system (N) to a fully contextual personalized system (CC). Further details are available in Table 1.
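For concreteness, Exact Match simply checks whether the predicted string equals the reference, and a relative improvement such as the 500.97% quoted above is the usual (new − old)/old percentage. A minimal sketch:

```python
def exact_match_rate(predictions, references):
    # Fraction of predictions that exactly equal their reference.
    matches = sum(p == r for p, r in zip(predictions, references))
    return matches / len(references)

def relative_improvement(baseline_em, new_em):
    # Relative change of the metric versus the baseline, in percent.
    return 100.0 * (new_em - baseline_em) / baseline_em

em = exact_match_rate(["wallows", "adele"], ["wallows", "drake"])
```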
2 Background and Related Work
2.1 Query Rewriting
Query Rewriting (QR) in dialogue systems aims to reduce friction by reformulating the automatic speech recognition component’s interpretation of users’ queries. Initial efforts (Dehghani et al., 2017; Su et al., 2019) treat QR as a text generation problem.
Some recent studies (Chen et al., 2020; Yuan et al., 2021; Fan et al., 2021; Cho et al., 2021) are based on neural retrieval systems. In retrieval-based systems, the rewrite candidate pool is aggregated from users’ habitual or historical queries so that rewrite quality can be tightly controlled. Compared to generation-based systems, retrieval-based systems may sacrifice flexibility and diversity of the rewrites, but in exchange provide more stability, which is more important in a runtime production setup.
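A minimal sketch of the retrieval-based setup described above: rewrites are selected from a pool of the user’s historical queries, so quality is controlled by construction. The character-bigram Jaccard similarity below is an illustrative stand-in for the learned neural scorer, not the system’s actual ranking model.

```python
def bigrams(s):
    # Set of character bigrams of a string.
    return {s[i:i + 2] for i in range(len(s) - 1)}

def similarity(a, b):
    # Jaccard overlap of character bigrams (toy stand-in scorer).
    ba, bb = bigrams(a), bigrams(b)
    return len(ba & bb) / len(ba | bb) if ba | bb else 0.0

def retrieve_rewrite(defective_query, candidate_pool):
    # Pick the highest-scoring rewrite from the controlled pool.
    return max(candidate_pool, key=lambda c: similarity(defective_query, c))

pool = ["play the wallows", "play jazz radio", "turn off the lights"]
best = retrieve_rewrite("play wallace", pool)
```

Because every candidate comes from the pool, the worst case is a well-formed historical query rather than an unconstrained generation.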
Personalization and Contextualization are two popular directions for QR systems. A personalized system, such as that of Cho et al. (2021), incorporates diverse affinities and personal preferences to provide an individually tailored user experience in a single unified system. Contextualization attempts to utilize multi-turn queries rather than leveraging only single-turn information. Some pre-