ness and I am responsible to cheer you up. Thus,
explicit self-other awareness plays a pivotal role
in disentangling the feelings and views of the self
and the other. This constitutes a crucial perspective
of empathy and contributes substantially to generating
more empathetic responses, especially when the other
is in a negative emotional state.
To this end, we propose to generate Empathetic
response with explicit Self-Other Awareness (EmpSOA).
Inspired by the conceptual framework of information
flow involved in human empathy (Decety and Lamm, 2006),
we make such
processes computable and abstract three stages
in EmpSOA, named Self-Other Differentiation
(SOD), Self-Other Modulation (SOM) and Self-
Other Generation (SOG). Specifically, in SOD, we
construct two heterogeneous graphs with four types
of nodes to maintain the self-awareness representation
and the other-awareness representation, respectively.
Among them, commonsense knowledge from COMET
(Bosselut et al., 2019) is leveraged to manifest the
fine-grained emotional and cognitive states of the
self and the other. Further,
we dynamically control the contributions of the
self-other awareness representations in SOM and
inject them into the process of empathetic response
generation in SOG. Experimental results of both
automatic and manual evaluations on the benchmark
dataset demonstrate the superiority of EmpSOA in
generating more empathetic responses.
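To make the idea of dynamically controlling the two contributions concrete, the following is a minimal sketch of one possible modulation step. It is not the SOM module itself: the scalar sigmoid gate, the function name `modulate`, and the parameters `w_self`, `w_other` and `bias` are all assumptions introduced here for illustration, and the actual model may combine the representations differently.

```python
import math
from typing import List


def sigmoid(x: float) -> float:
    """Logistic function mapping any real score into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def modulate(
    h_self: List[float],
    h_other: List[float],
    w_self: List[float],
    w_other: List[float],
    bias: float = 0.0,
) -> List[float]:
    """Blend a self-awareness and an other-awareness vector.

    Hypothetical gating scheme (an assumption, not the paper's formula):
    a single scalar gate is computed from both vectors, then used as a
    convex mixing weight between them.
    """
    if not (len(h_self) == len(h_other) == len(w_self) == len(w_other)):
        raise ValueError("all vectors must share the same dimension")

    # Linear score over both representations, squashed into (0, 1).
    score = bias
    score += sum(w * h for w, h in zip(w_self, h_self))
    score += sum(w * h for w, h in zip(w_other, h_other))
    gate = sigmoid(score)

    # gate weights the self vector; (1 - gate) weights the other vector.
    return [gate * s + (1.0 - gate) * o for s, o in zip(h_self, h_other)]
```

With all weights and the bias set to zero the gate is 0.5, so the result is the element-wise mean of the two vectors. A large positive bias pushes the result toward `h_self`, and a large negative bias pushes it toward `h_other`. In a trained model the weights would be learned, so the balance would depend on the dialogue context.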
The main contributions of this work are summa-
rized as follows:
• We propose to generate empathetic responses via
  explicit self-other awareness, which constitutes a
  critical perspective of empathy.
• We devise a novel model, EmpSOA, to clearly
  maintain, modulate and inject self-other awareness
  information into the process of empathetic response
  generation.
• Results of extensive experiments on the benchmark
  dataset demonstrate the effectiveness of EmpSOA in
  identifying the exact emotion of the other and
  generating more empathetic responses.
2 Related Work
2.1 Empathetic Response Generation
Endowing dialogue systems with empathy has attracted
increasing attention recently. We divide previous
attempts at empathetic response generation into two
categories according to whether they incorporate both
the affective and cognitive aspects of empathy. On the
one hand, most existing works (Alam et al., 2018;
Rashkin et al., 2019; Lin et al., 2019; Majumder et al.,
2020; Li et al., 2020, 2022; Wang et al., 2022) consider
only the affective aspect of empathy, aiming to
understand the emotional state of the other and
converge emotionally. On the other hand, Sabour et al.
(2022) propose to comprehensively understand both the
emotional feelings and the cognitive situations of the
other by incorporating commonsense knowledge.
However, all previous methods perceive the emotional
or cognitive states of the other through
other-awareness alone. They neglect to explicitly
incorporate self-awareness, which is needed to produce
an appropriate empathetic response that reflects the
self's own views.
2.2 Emotional Dialogue Generation
Emotion has been shown to be a key factor in
achieving more engaging dialogue systems. Previous
works explore two ways of incorporating emotion into
dialogue generation. On the one hand, generation-based
methods (Zhou et al., 2018; Zhou and Wang, 2018; Shen
and Feng, 2020) generate emotional responses given a
specified emotion label. On the other hand,
retrieval-based methods (Qiu et al., 2020; Lu et al.,
2021) aim to obtain emotional responses from candidates
retrieved from a response repository. However,
expressing a specified emotion in responses is merely
a basic goal of emotional dialogue systems; it lacks
the understanding of the user's feelings and situations
that empathetic response generation requires.
3 Methodology
3.1 Task Definition
First, we define the task of empathetic response
generation. Formally, let $D = [X_1, X_2, \cdots, X_N]$
denote a dialogue history with $N$ utterances between
the user (the other) and the system (the self), where
the $i$-th utterance $X_i = [w^i_1, w^i_2, \cdots, w^i_m]$
is a sequence of $m$ words. Besides, each conversation
is provided with an emotion label $e$, drawn from the
32 available emotions, that signals the emotional tone
on which the other is grounded. The goal is to generate
the next utterance $Y$ from the standpoint of the self
that is coherent with the dialogue history $D$ and
empathetic to the other's situation and feelings.
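The notation above can be mirrored in a small data structure. The sketch below is only an illustration of the task inputs, not code from the paper: the class names `Utterance` and `EmpatheticDialogue`, the helpers `make_dialogue` and `awaiting_self_response`, whitespace tokenization, and the assumption that the other speaks first with strictly alternating turns are all hypothetical choices made here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Utterance:
    """One utterance X_i: a speaker role plus its word sequence."""
    speaker: str       # "other" (the user) or "self" (the system)
    words: List[str]   # w^i_1, ..., w^i_m


@dataclass
class EmpatheticDialogue:
    """Dialogue history D = [X_1, ..., X_N] with its emotion label e."""
    history: List[Utterance]
    emotion: str       # one of the 32 emotion labels in the dataset


def make_dialogue(turns: List[str], emotion: str) -> EmpatheticDialogue:
    """Build D from raw strings.

    Assumes (for illustration only) that the other speaks first, that
    speakers alternate, and that words are separated by whitespace.
    """
    history = [
        Utterance("other" if i % 2 == 0 else "self", text.split())
        for i, text in enumerate(turns)
    ]
    return EmpatheticDialogue(history, emotion)


def awaiting_self_response(dialogue: EmpatheticDialogue) -> bool:
    """True when the last utterance is the other's, so Y is due from the self."""
    return bool(dialogue.history) and dialogue.history[-1].speaker == "other"
```

For example, `make_dialogue(["I failed my exam today", "Oh no , what happened ?", "I did not study enough"], "sad")` yields a history of three utterances whose last speaker is the other, so a model would be asked to generate the next utterance $Y$ as the self. The label `"sad"` is used here purely as an illustrative value.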