Transformer-Based Conditioned Variational Autoencoder for Dialogue
Generation
Huihui Yang
Zhejiang University
yanghh0@zju.edu.cn
Abstract
In human dialogue, any one query usually elicits numerous appropriate responses. A Transformer-based dialogue model tends to produce frequently occurring sentences in the corpus since it is a one-to-one mapping function. CVAE is a technique for reducing generic replies. In this paper, we create a dialogue model (CVAE-T) based on the Transformer with a CVAE structure. We use a pre-trained MLM model to rewrite some key n-grams in responses to obtain a series of negative examples, and introduce a regularization term during training to explicitly guide the latent variable in learning the semantic differences between each pair of positive and negative examples. Experiments suggest that the method we design is capable of producing more informative replies.
1 Introduction
The training data used to train dialogue models contains a great deal of unknown background information, making dialogue a one-to-many problem in which different people can come up with different but reasonable answers to the same question. Generative diversity is a crucial characteristic for building dialogue systems. Zhao et al. (2017) use CVAE for dialogue modeling and demonstrate that the sentences produced by the CVAE model are more diverse than those produced by a conventional sequence-to-sequence model.
For the CVAE model, the approximate posterior carries little useful information at the beginning of training, and the model tends to fit the distribution directly without reference to the latent variable, which is known as the KL-vanishing problem (Bowman et al., 2016). To alleviate this problem, some researchers introduce dialogue intent labels (Zhao et al., 2017) or sentence function labels (interrogative, declarative and imperative) (Ke et al., 2018) as additional information to supervise the posterior network learning. However, this method has several drawbacks: 1) It is expensive to annotate labels and challenging to expand to large-scale datasets. 2) It only focuses on the attributes of one aspect of sentences, and the limited number of tags can hardly cover all the attributes of that aspect. 3) The tags themselves do not carry semantic information, which is not conducive to model learning. We observe that some key words or phrases in a sentence can serve as representations of high-level sentence attributes, removing the need for additional tags. We locate the key n-grams in each response using a keyword extraction algorithm and replace each of them with the special token [MASK]. These masked positions are rewritten by a pre-trained MLM model to generate a series of negative sentences semantically distinct from the original sentence. A regularization term is used to constrain the prior and posterior distributions during training, helping the latent variable to perceive the difference between positive and negative examples.
Dialogue models should handle long dependencies well because conversation datasets usually contain multiple rounds of utterances, and as the conversation goes on, the dialogue history accumulates into a very long sequence. Transformer-based models (Zhang et al., 2020; Roller et al., 2020) have shown strong generative power when trained on large-scale conversational corpora. Owing to its self-attention mechanism and excellent parallelism, the Transformer is well suited to processing long sequences. Its hierarchical structure also enables the decoder to incorporate the latent variable in a more flexible manner. We choose the Transformer as the encoder-decoder framework and explore how the CVAE structure can be better integrated with it for dialogue generation.
The contributions of this paper can be summarized as follows:
• We design a Transformer-based conditioned variational autoencoder for dialogue generation, named CVAE-T.
• We utilize a simple and effective method to prompt the latent variable to learn a more meaningful distribution.
• Experiments illustrate that CVAE-T achieves significant improvements in diversity and informativeness for dialogue generation.
2 Model
In this section, we first define our task and then
describe in detail the components of our designed
model. The overall architecture of our model is
illustrated in Figure 2.
2.1 Task Definition
In our definition of the dyadic conversation task, there are three elements: the dialogue context c, the response y and the latent variable z. The dialogue context c is the concatenation of the conversation history h and the current query x, which can be denoted as c = [h; x]. The latent variable z is utilized to model the probability distribution of the different potential factors influencing conversation generation. The dialogue generation task can be expressed as the following conditional probability:
p(y, z | c) = p(z | c) · p(y | c, z)    (1)
p(z | c) represents the sampling process of the latent variable z and is approximated by a neural network, called the prior network, with parameters θ. p(y | c, z) represents the decoding process and is approximated by the decoder network in the encoder-decoder framework.
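As a concrete illustration of Eq. (1), the sketch below shows one common way to realize the prior network and the sampling of z, assuming a Gaussian latent variable and the usual reparameterization trick; the module and variable names are illustrative, not the paper's implementation.

```python
# A minimal sketch of the factorization p(y, z|c) = p(z|c) * p(y|c, z),
# assuming a Gaussian latent variable; names and sizes are hypothetical.
import torch
import torch.nn as nn

class PriorNetwork(nn.Module):
    def __init__(self, d_model=512, d_latent=64):
        super().__init__()
        # Projects the encoded context c to the mean and log-variance of p(z|c).
        self.proj = nn.Linear(d_model, 2 * d_latent)

    def forward(self, context_repr):
        # context_repr: (batch, d_model) summary of the dialogue context c
        mu, logvar = self.proj(context_repr).chunk(2, dim=-1)
        # Reparameterization: z = mu + sigma * eps, with eps ~ N(0, I).
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
        return z, mu, logvar

prior = PriorNetwork()
context_repr = torch.randn(4, 512)        # toy encoder summary of c
z, mu, logvar = prior(context_repr)       # sample z from p(z|c)
# decoder_logits = decoder(context_tokens, z)   # decoding step p(y|c, z)
```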
2.2 Input Representation
Besides the word embedding and position embedding used in the original Transformer, we also employ two other kinds of embedding, turn embedding and role embedding, similar to PLATO (Bao et al., 2020), to represent a token. Each utterance x in a dialogue context c is assigned a turn id, decreasing sequentially from the maximum number of turns to 1, and the turn id of y is always set to 0. As there are two speakers in each dialogue episode for our task, we set two role ids: 0 for the person who speaks first and 1 for the other. The final input embedding of a token is the sum of the corresponding word, position, turn and role embeddings, as shown in Figure 3.
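A minimal sketch of this input representation is given below, assuming hypothetical vocabulary and dimension sizes; the four embedding tables are simply summed token-wise as described above.

```python
# Sketch of the four-way input embedding (word + position + turn + role);
# all sizes are placeholder values, not the paper's configuration.
import torch
import torch.nn as nn

class DialogueInputEmbedding(nn.Module):
    def __init__(self, vocab_size=30000, d_model=512, max_len=512,
                 max_turns=16, num_roles=2):
        super().__init__()
        self.word = nn.Embedding(vocab_size, d_model)
        self.position = nn.Embedding(max_len, d_model)
        self.turn = nn.Embedding(max_turns, d_model)
        self.role = nn.Embedding(num_roles, d_model)

    def forward(self, token_ids, turn_ids, role_ids):
        # Position ids count 0..L-1 over the flattened input sequence.
        positions = torch.arange(token_ids.size(1), device=token_ids.device)
        positions = positions.unsqueeze(0).expand_as(token_ids)
        return (self.word(token_ids) + self.position(positions)
                + self.turn(turn_ids) + self.role(role_ids))

# Example: a two-turn context followed by the response (turn id 0).
emb = DialogueInputEmbedding()
token_ids = torch.randint(0, 30000, (1, 10))
turn_ids = torch.tensor([[2, 2, 2, 1, 1, 1, 1, 0, 0, 0]])   # decreasing to 0
role_ids = torch.tensor([[0, 0, 0, 1, 1, 1, 1, 0, 0, 0]])   # two speakers
x = emb(token_ids, turn_ids, role_ids)                       # (1, 10, 512)
```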
Figure 1: Negative sentences generation process.
2.3 Negative Sentences Generation
We use the YAKE (Yet Another Keyword Extractor) algorithm (Campos et al., 2018), which is an unsupervised method, to extract the keywords of each response. YAKE computes a set of five features to capture the unique characteristics of each term: (1) Casing, (2) Word Position, (3) Word Frequency, (4) Word Relatedness to Context, and (5) Word DifSentence. Inspired by the generation of pseudo-QE data in the machine translation quality estimation task, we select BERT (Devlin et al., 2019) to rewrite the extracted keywords. If m key n-grams are found in a response, we substitute each of these n-grams with the token [MASK] to create m sentences. We then feed these m sentences into BERT and sample the model output at each masked position to infill all masked sentences. The negative sentence generation process is shown in Figure 1.
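The sketch below illustrates this rewriting step with a pre-trained BERT checkpoint from HuggingFace Transformers; the keyword list is a hypothetical stand-in for YAKE's output, and sampling (rather than taking the argmax) at the masked positions mirrors the sampling described above.

```python
# A minimal sketch of negative-sentence generation via masked-LM rewriting,
# assuming a HuggingFace BERT checkpoint; keyword spans are assumed to come
# from an external extractor such as YAKE (the `keywords` list is illustrative).
import torch
from transformers import BertTokenizer, BertForMaskedLM

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")
model.eval()

def rewrite_keyword(response: str, keyword: str) -> str:
    """Mask one key n-gram in the response and resample it with BERT."""
    kw_len = len(tokenizer.tokenize(keyword))
    masked = response.replace(keyword, " ".join([tokenizer.mask_token] * kw_len), 1)
    inputs = tokenizer(masked, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    ids = inputs["input_ids"][0].clone()
    mask_positions = (ids == tokenizer.mask_token_id).nonzero(as_tuple=True)[0]
    for pos in mask_positions:
        probs = torch.softmax(logits[0, pos], dim=-1)
        ids[pos] = int(torch.multinomial(probs, num_samples=1).item())  # sample
    return tokenizer.decode(ids, skip_special_tokens=True)

# One negative sentence per extracted keyword (m keywords -> m sentences).
response = "i love playing the guitar on weekends"
keywords = ["playing the guitar"]            # hypothetical YAKE output
negatives = [rewrite_keyword(response, kw) for kw in keywords]
print(negatives)
```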
2.4 Encoder
The encoder block in our model follows the original Transformer (Vaswani et al., 2017), which includes two key subcomponents: multi-head attention and a feed-forward network. Each subcomponent is accompanied by a residual connection and layer normalization. During training, the encoder needs to be called three times to encode three different types of input data: (1) the dialogue context c, (2) the concatenation of the dialogue context c and the response y, and (3) the concatenation of the dialogue context c and a negative response, which is chosen randomly from the list of negative sentences of y.
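Assuming a standard PyTorch Transformer encoder operating on pre-embedded inputs, the three encoding passes could look like the following sketch; tensor shapes and variable names are illustrative, not the paper's implementation.

```python
# Sketch of the three encoding passes used during training, assuming a stock
# nn.TransformerEncoder and already-embedded inputs (see the embedding sketch).
import torch
import torch.nn as nn

layer = nn.TransformerEncoderLayer(d_model=512, nhead=8, batch_first=True)
encoder = nn.TransformerEncoder(layer, num_layers=6)

c_emb = torch.randn(1, 32, 512)       # embedded dialogue context c
y_emb = torch.randn(1, 12, 512)       # embedded response y
y_neg_emb = torch.randn(1, 12, 512)   # embedded negative response

h_c = encoder(c_emb)                                       # (1) context only
h_cy = encoder(torch.cat([c_emb, y_emb], dim=1))           # (2) context + response
h_cy_neg = encoder(torch.cat([c_emb, y_neg_emb], dim=1))   # (3) context + negative response
```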