
them in successive dialogues. We formulate a new
task of memory management in long-term conversations
and construct its corresponding dataset¹,
by extending an existing Korean open-domain dialogue
dataset (Bae et al., 2022) to multiple sessions
with changing user information. In each session of
our dataset, while the user and the bot have a con-
versation, information about the user is identified
from the dialogue. Then, in successive sessions,
the bot keeps in memory only the information valid
at that point and utilizes the resulting memory in
dialogue.
In addition, we propose a long-term dialogue
system including a novel memory management
mechanism. In this system, information about the
interlocutors revealed in the previous conversation
is abstractively summarized and stored in memory.
Specifically, the memory management mechanism
decides which information to keep in memory. For
this purpose, we define four pairwise operations
(PASS, REPLACE, APPEND, and DELETE) to
find and eliminate the information that can cause
confusion or redundancy in later conversations.
For example, if the previous memory sentence is
“Haven’t got COVID tested yet” and the new in-
coming summary is “Just got positive results from
COVID test”, the two sentences are contradictory,
so the former is replaced in memory by the latter.
Through this process, only valid information remains
in the new memory. Then, in subsequent sessions,
relevant information from this memory is retrieved
and provided as an additional condition for
generating chatbot responses.
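To make the mechanism concrete, below is a minimal Python (3.9+) sketch of the update loop, assuming one operation is predicted per (old memory sentence, new summary sentence) pair. Here classify_operation is a hypothetical stand-in for the fine-tuned pairwise classifier, and the resolution rule is one plausible reading of the four operations, not the paper's exact implementation.

```python
from enum import Enum

class Op(Enum):
    PASS = "pass"        # new sentence is redundant; keep memory unchanged
    REPLACE = "replace"  # new sentence contradicts or updates the old one
    APPEND = "append"    # sentences are unrelated; both may be kept
    DELETE = "delete"    # old sentence is invalidated and should be dropped

def classify_operation(old: str, new: str) -> Op:
    """Hypothetical stand-in for the fine-tuned pairwise classifier."""
    raise NotImplementedError

def update_memory(memory: list[str], summaries: list[str]) -> list[str]:
    """Keep only the information that is still valid after new summaries."""
    updated = list(memory)
    for new_sent in summaries:
        keep_new = True  # append new_sent unless it replaces or duplicates
        survivors = []
        for old_sent in updated:
            op = classify_operation(old_sent, new_sent)
            if op is Op.REPLACE:
                # the new fact supersedes the old one, e.g. "Just got
                # positive results" replacing "Haven't got COVID tested yet"
                survivors.append(new_sent)
                keep_new = False
            elif op is Op.DELETE:
                continue  # drop the invalidated old fact
            elif op is Op.PASS:
                survivors.append(old_sent)  # redundant: keep old, skip new
                keep_new = False
            else:  # Op.APPEND: unrelated facts, keep the old one
                survivors.append(old_sent)
        if keep_new:
            survivors.append(new_sent)  # genuinely new information
        updated = survivors
    return updated
```

In subsequent sessions, a retriever over this sentence-level memory would then select the entries relevant to the current context and prepend them as conditions for response generation.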
With extensive experiments and ablations, we
show that the proposed memory management mech-
anism becomes more advantageous in terms of
memorability as the sessions proceed, leading
to better engagingness and humanness in multi-
session dialogues.
Our contributions are as follows:
1. We take a step towards long-term conversations
with dynamic memory that must be kept
up-to-date.

2. We propose a novel memory management
mechanism in the form of unstructured text
that achieves better results in automatic and
human evaluation over baselines.

3. We release the first Korean long-term dialogue
dataset for further research on memory management
in dialogues.

¹The dataset is available at https://github.com/naver-ai/carecall-memory
2 Related Work
Personalized Dialogue System
Building human-like open-domain chatbots is one of the
seminal research topics in the field of natural
language processing. Zhang et al. (2020) have
provided a strong backbone generator model
for dialogue systems, while Adiwardana et al.
(2020), Roller et al. (2021) and Thoppilan et al.
(2022) have paved the way for the development
of more human-like, natural-sounding chatbots.
The applications of open-domain chatbots have
also widely expanded, including role-specified
(Bae et al., 2022) and personalized (Zhang et al.,
2018) dialogue systems. In particular, personalized
dialogue systems have typically been studied
either by utilizing a predefined, explicitly stated
user profile (Zhang et al., 2018) or by directly
extracting the user profile from dialogue history
(Xu et al., 2022a,b). While the latter approach is
preferred in recent research (Zhong et al., 2022),
long-term management of the obtained information
is yet to be studied.
Long-term Memory in Conversation
Because
it is inefficient to use the entire dialogue history as
long-term memory, techniques for obtaining and
managing information from dialogue history have
been studied. Representing latent features as neural
memory (Weston et al., 2015; Tran et al., 2016;
Munkhdalai et al., 2019) was a traditional approach.
The slot-value format in dialogue state tracking
(Heck et al., 2020; Hosseini-Asl et al., 2020;
Kim et al., 2020) and the graph format of Hsiao et al.
(2020) have been the two major approaches to handling
the memorized information in a structured
way. Kim et al. (2020) suggested update operations
on fixed-sized slot-value pairs for dialogue states.
Wu et al. (2020) extracted user attributes from dia-
logues in triples. However, such approaches have
not been demonstrated in a multi-session setting.
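For contrast with the unstructured-text memory studied in this work, the sketch below illustrates the structured alternatives mentioned above; all slot names, relations, and values are invented for illustration and are not drawn from the cited works.

```python
# Slot-value memory, as in dialogue state tracking (Kim et al., 2020):
# a fixed schema whose values are overwritten as the dialogue proceeds.
slot_value_memory = {
    "user-health-status": "awaiting COVID test result",  # hypothetical slot
    "user-occupation": "nurse",
}

# (subject, relation, object) triples, as in Wu et al. (2020):
triple_memory = [
    ("user", "has_symptom", "cough"),  # hypothetical relation
    ("user", "works_as", "nurse"),
]

# Unstructured text memory, as studied in this paper: free-form sentences
# that are summarized, updated, and retrieved directly.
text_memory = [
    "Haven't got COVID tested yet",
    "Works as a nurse",
]
```

Structured formats require a schema fixed in advance, whereas free-form sentences can hold information the schema designer did not anticipate, which motivates the unstructured-text direction discussed next.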
Leveraging the advancement of pre-trained language
models (Devlin et al., 2019; Raffel et al.,
2020; Brown et al., 2020; Kim et al., 2021), recent
studies attempt to use the unstructured form
of text as memory, which is expected to be advantageous
in terms of generalizability and interpretability.
Ma et al. (2021) and Xu et al. (2022b)
selectively stored dialogue history with relevant