Conference’17, July 2017, Washington, DC, USA Xiaohan Xu, Xuying Meng, and Yequan Wang
face. In general, a well-designed ESC system is crucial for many
applications, e.g. customer service chats, mental health support,
etc. [21]. Compared to the well-researched emotional and empathetic
conversation [19, 24, 31], ESC focuses on reducing users’
emotional stress using various emotional support strategies, such as
Question, Providing Suggestions, etc.
Recently, several works have been proposed to explore the ESC
task. BlenderBot-Joint [21] generates a strategy token as a prompt
to guide the desired response. MISC [36] uses an off-the-shelf
generative commonsense model, called COMET [3], to infer the user’s
mental status, where COMET can be seen as an external commonsense
knowledge base. MISC then additionally encodes the inferred mental
status and fuses multiple strategies into one response so that it
can respond skillfully. GLHG [?] also utilizes COMET to generate the
local intention of the seeker in each dialogue round, but considers
the hierarchical relationship between the seeker’s global situation
(summarizing the condition of the seeker) and the local intention.
Although these models are effective, the commonsense knowledge in
COMET needs to be carefully integrated into them to realize their
full potential, and the external knowledge base requires a great
deal of effort to develop. Further, these models may not be
applicable when the knowledge base is updated or the application
domain changes. Therefore, in this article, we focus on exploiting
the existing knowledge in the dataset and the characteristics of the
ESC task, without any external knowledge.
Due to the characteristics of ESC, all existing works still suffer
from two key issues. First, all of them are limited to the scope of
the current conversation and ignore the abundant prior knowledge in
global historical conversations. Second, they fail to model the
one-to-many mapping relationship of strategy, i.e. not just one
but multiple strategies could be valid for a single context. These
issues make it challenging to generate high-quality and diverse
responses. We next explain these two issues separately.
Generally, when we attempt to solve a help-seeker’s problems, we
are adept at drawing on related prior knowledge as reference, e.g.
psychologists would consult many prior classical cases relevant
to the current case [25]. In ESC, even without external knowledge,
there is much prior knowledge to rely on, such as (1) exemplary
responses to similar cases and (2) the general order of support
strategies. This prior knowledge has great reference value for
exploring the seeker’s problem and deciding the target support
strategy.
An explanatory example in Figure 1 illustrates how prior knowledge
guides and benefits emotional support conversation. (1) The
retrieved context-related responses from historical conversations,
called exemplars, can serve as prior knowledge of response. On
the one hand, some exemplars, e.g. “I think if you talk to ...”, guide
the supporter to place more emphasis on the key problem “losing job”,
and thus help the supporter to focus on and explore the seeker’s
problem. On the other hand, some exemplars, e.g. “Maybe you can
find ...”, provide a hint to accurately express the target strategy
Providing Suggestions in the sentence pattern starting with “Maybe
you”. (2) In addition to prior knowledge of response, the transition
probability of strategy calculated on the training set can act as
prior knowledge to help decide the current strategy. This is because
the support strategies in ESC follow a three-stage procedure
(Exploration, Comforting and Action) [11]. Figure 1(c) shows the
transition probabilities from the strategy Self-disclosure. It
illustrates that after sharing similar difficulties they have faced,
supporters tend to use Providing Suggestions to give advice based on
their experience.
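A transition probability of this kind can be estimated by counting adjacent strategy pairs in the training dialogues and normalizing the counts for each preceding strategy. The following is a minimal sketch under stated assumptions, not the authors' implementation: the function name `strategy_transition_probs`, the input format, and the toy strategy sequences are hypothetical, and no smoothing is applied.

```python
from collections import Counter, defaultdict


def strategy_transition_probs(dialogues):
    """Estimate first-order transition probabilities between strategies.

    `dialogues` is a list of strategy sequences, one per conversation,
    in the order the supporter used them (hypothetical input format).
    """
    counts = defaultdict(Counter)
    for seq in dialogues:
        # Count each adjacent (previous, next) strategy pair.
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1

    probs = {}
    for prev, nxt_counts in counts.items():
        total = sum(nxt_counts.values())
        # Normalize each row of counts so that it sums to 1.
        probs[prev] = {s: c / total for s, c in nxt_counts.items()}
    return probs


# Toy data for illustration only; not taken from ESConv.
toy_dialogues = [
    ["Question", "Self-disclosure", "Providing Suggestions"],
    ["Question", "Self-disclosure", "Providing Suggestions"],
    ["Self-disclosure", "Affirmation and Reassurance"],
]
probs = strategy_transition_probs(toy_dialogues)
print(probs["Self-disclosure"])
```

In this toy data, Self-disclosure is followed by Providing Suggestions in two of three cases, so that transition receives the largest share of the row.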
Additionally, it is well known that dialogue systems have a
one-to-many problem of generation, i.e. given a single context there
exist multiple valid responses [43]. In ESC, the supporter is
required to take reasonable strategies, so there is also a
one-to-many problem of support strategy. As shown in Figure 1(b),
after the seeker states his problem, the supporter can also employ
valid strategies other than the frequently used strategy Providing
Suggestions. Using the strategy Question to take a deeper look at
the user’s problem, or Affirmation and Reassurance to comfort the
user, is also a decent choice. Moreover, adopting various strategies
is beneficial to diverse responses. In a nutshell, incorporating
prior knowledge and modeling the one-to-many mapping relationship
of strategy are critical to providing emotional support in the ESC
task.
To take into account these two significant characteristics of ESC,
we propose a novel model called Prior Knowledge Enhanced emotional
support conversation with latent variable model (PoKE).
The proposed model not only fully taps the potential of prior
knowledge in terms of exemplars and strategy sequence, but also
models the one-to-many mapping relationship of strategy. First, we
construct prior knowledge of exemplars and strategy sequence before
training. Specifically, we use a fine-tuned dense passage retrieval
(DPR) [12] model to retrieve a set of responses semantically related
to the input context, and build a first-order Markov transition
matrix of strategy sequence from the training set. To model the
one-to-many mapping relationship of strategy, we introduce a
conditional variational autoencoder (CVAE) [34] to predict diverse
probability distributions of strategy conditioned on the current
conversation and the prior knowledge of strategy sequence.
Furthermore, we assign exemplars different attention weights
according to the distribution of strategy to emphasize the more
relevant exemplars. Lastly, we apply the technique of memory schema
to effectively incorporate the encoded prior knowledge and latent
variable into the decoder for generation.
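The retrieval step can be pictured as scoring every historical response against the encoded context and keeping the highest-scoring ones as exemplars. The sketch below is illustrative only and is not the authors' pipeline: it assumes the embeddings have already been produced by some encoder, uses inner-product scoring as in dense passage retrieval, and the helper `retrieve_exemplars`, the toy vectors, and the value of `k` are hypothetical.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def retrieve_exemplars(context_vec, response_vecs, k=2):
    """Return the indices of the k responses with the highest
    inner-product score against the context embedding."""
    scored = [(dot(context_vec, vec), idx)
              for idx, vec in enumerate(response_vecs)]
    # Sort by descending score; ties are broken by the lower index.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [idx for _, idx in scored[:k]]


# Toy embeddings for illustration only (not real encoder outputs).
context = [1.0, 0.0, 1.0]
responses = [
    [0.9, 0.1, 0.8],  # close to the context
    [0.0, 1.0, 0.0],  # unrelated to the context
    [0.5, 0.0, 0.6],  # somewhat related
]
top = retrieve_exemplars(context, responses, k=2)
print(top)
```

The exhaustive scan is adequate for a toy example; for a large pool of historical responses, an approximate nearest-neighbor index would typically take its place.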
The key contributions are summarized as follows: (1) We
explore the emotional support conversation task under the setting
of no external knowledge base and propose a novel model, PoKE.
PoKE can promote emotional support conversation by effectively
modeling the prior knowledge in terms of exemplars and strategy
sequence, and the one-to-many mapping relationship of strategy.
(2) We utilize strategy distribution to denoise the exemplars and
apply a memory schema to effectively incorporate encoded
information into the decoder. (3) Experiments on the benchmark
dataset of the ESC task (i.e., ESConv) demonstrate that our method
is superior to existing baselines on both automatic evaluation and
human evaluation. Compared with the model using external knowledge,
PoKE still achieves a slight improvement in some metrics. (4)
Importantly, we reveal that abundant prior knowledge is conducive to
high-quality emotional support, and a well-learned latent variable
is critical to the diversity of generations.
2 RELATED WORK
In this section, we first detail some existing methods
for emotional support conversation. Then, because we utilize
retrieved exemplars to guide generation and take a latent variable to