Keep Me Updated! Memory Management in Long-term Conversations

Sanghwan Bae1,2  Donghyun Kwak1,2  Soyoung Kang1,2  Min Young Lee1,2
Sungdong Kim2,3  Yuin Jeong1  Hyeri Kim1  Sang-Woo Lee1,2,3
Woomyoung Park1,2  Nako Sung1
1NAVER CLOVA  2NAVER AI Lab  3KAIST AI
Abstract

Remembering important information from the past and continuing to talk about it in the present are crucial in long-term conversations. However, previous literature does not deal with cases where the memorized information is outdated, which may cause confusion in later conversations. To address this issue, we present a novel task and a corresponding dataset of memory management in long-term conversations, in which bots keep track of and bring up the latest information about users while conversing through multiple sessions. In order to support more precise and interpretable memory, we represent memory as unstructured text descriptions of key information and propose a new mechanism of memory management that selectively eliminates invalidated or redundant information. Experimental results show that our approach outperforms the baselines that leave the stored memory unchanged in terms of engagingness and humanness, with a larger performance gap especially in the later sessions.
1 Introduction
In human interactions, memory is an important mechanism that helps us hold conversations, develop rapport, and maintain long-term relationships (Alea and Bluck, 2003; Nelson, 2003; Brewer et al., 2017). To this end, recent studies (Wu et al., 2020; Xu et al., 2022a,b) on open-domain dialogues have proposed methods to remember and utilize persona information (Zhang et al., 2018) of the interlocutors obtained from previous conversations. Specifically, they summarize the persona information in an extractive or abstractive way and give it as a condition for generating responses in subsequent conversations. They show that this feature leads to better consistency and engagingness of the chatbot systems.
Figure 1: An example of a long-term dialogue. There is information obtained from an early session that is no longer true in a later session, e.g. "Got a sore throat". This information should be removed from the memory of later sessions in order to correctly follow up with the interlocutor.

Despite such progress, an aspect overlooked by previous studies is that memorized information can be invalidated by newly gathered information. They simply accumulate and maintain the stored information in memory; once stored, such information has no possibility of getting updated in the future. Memory in real-life conversations, however, can change over time, either in a short period of time (e.g. health status, plans for the weekend, or a recently watched movie) or over a relatively longer period of time (e.g. age, job, or hobby). Such memory needs to be kept track of by asking about its status again in subsequent conversations, as exemplified in Figure 1. Therefore, updating previous memory with new relevant information and keeping it up-to-date are important features of human-like long-term conversations.
In this work, we study methods of memorizing and updating dynamic information and utilizing it in successive dialogues. We formulate a new task of memory management in long-term conversations and construct its corresponding dataset (available at https://github.com/naver-ai/carecall-memory) by extending an existing Korean open-domain dialogue dataset (Bae et al., 2022) to multiple sessions with changing user information. In each session of our dataset, while the user and the bot have a conversation, information about the user is identified from the dialogue. Then, in successive sessions, the bot keeps in memory only the information valid at that point and utilizes the resulting memory in dialogue.
In addition, we propose a long-term dialogue system including a novel memory management mechanism. In this system, information about the interlocutors revealed in the previous conversation is abstractively summarized and stored in memory. Specifically, the memory management mechanism decides which information to keep in memory. For this purpose, we define four pairwise operations (PASS, REPLACE, APPEND, and DELETE) to find and eliminate information that can cause confusion or redundancy in later conversations. For example, if a previous memory sentence is "Haven't got COVID tested yet" and the new incoming summary is "Just got positive results from COVID test", the two sentences are contradictory, so the former needs to be replaced in memory by the latter. Through this process, only valid information remains in the new memory. Then, in subsequent sessions, relevant information from this memory is retrieved and given as an additional condition for generating chatbot responses.
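To make the pairwise update concrete, the sketch below combines an old memory M with new summary sentences S into the next memory M′ by applying an operation to every (memory sentence, summary sentence) pair. It is a minimal illustration only: the `classify_pair` function stands in for the learned memory operator described later, and the exact resolution policy encoded here is our own assumption, not the paper's implementation.

```python
from typing import Callable, List

# Hypothetical pairwise classifier: returns one of "PASS", "REPLACE",
# "APPEND", or "DELETE" for a (memory sentence, summary sentence) pair.
# In the paper this role is played by a learned memory operator model.
Classifier = Callable[[str, str], str]

def update_memory(memory: List[str], summary: List[str],
                  classify_pair: Classifier) -> List[str]:
    """Combine old memory M and new summary S into the next memory M'.

    Assumed (illustrative) policy: summary sentences carry new information and
    are kept by default; a memory sentence is dropped when some summary
    sentence supersedes it (REPLACE) or invalidates it (DELETE); DELETE also
    drops the summary sentence itself, since it only signals invalidation.
    """
    keep_old = [True] * len(memory)
    keep_new = [True] * len(summary)

    for i, m in enumerate(memory):
        for j, s in enumerate(summary):
            op = classify_pair(m, s)
            if op == "REPLACE":          # s supersedes m
                keep_old[i] = False
            elif op == "DELETE":         # m is no longer valid; s need not be stored
                keep_old[i] = False
                keep_new[j] = False
            # "PASS" (unrelated) and "APPEND" (related, both kept) leave the flags as they are.

    return ([m for m, k in zip(memory, keep_old) if k]
            + [s for s, k in zip(summary, keep_new) if k])

# Toy rule-based classifier standing in for the learned operator.
def toy_classifier(m: str, s: str) -> str:
    return "REPLACE" if ("COVID" in m and "COVID" in s) else "PASS"

print(update_memory(["Haven't got COVID tested yet", "Has a dog"],
                    ["Just got positive results from COVID test"],
                    toy_classifier))
# -> ['Has a dog', 'Just got positive results from COVID test']
```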
With extensive experiments and ablations, we show that the proposed memory management mechanism becomes more advantageous in terms of memorability as the sessions proceed, leading to better engagingness and humanness in multi-session dialogues.
Our contributions are as follows:

1. We take a step towards long-term conversations with dynamic memory that must be kept up-to-date.

2. We propose a novel memory management mechanism operating on memory in the form of unstructured text, which achieves better results than baselines in automatic and human evaluation.

3. We release the first Korean long-term dialogue dataset for further research on memory management in dialogues.
2 Related Work
Personalized Dialogue System  Building human-like open-domain chatbots is one of the seminal research topics in the field of natural language processing. Zhang et al. (2020) provided a strong backbone generator model for dialogue systems, while Adiwardana et al. (2020), Roller et al. (2021), and Thoppilan et al. (2022) have paved the way for the development of more human-like, natural-sounding chatbots. The applications of open-domain chatbots have also widely expanded, including role-specified (Bae et al., 2022) and personalized (Zhang et al., 2018) dialogue systems. In particular, personalized dialogue systems have typically been studied either by utilizing a predefined, explicitly stated user profile (Zhang et al., 2018) or by directly extracting the user profile from dialogue history (Xu et al., 2022a,b). While the latter approach is preferred in recent research (Zhong et al., 2022), long-term management of the obtained information is yet to be studied.
Long-term Memory in Conversation  Because it is inefficient to use the entire dialogue history as long-term memory, techniques for obtaining and managing information from dialogue history have been studied. Representing latent features as neural memory (Weston et al., 2015; Tran et al., 2016; Munkhdalai et al., 2019) used to be a traditional method. The slot-value format in dialogue state tracking (Heck et al., 2020; Hosseini-Asl et al., 2020; Kim et al., 2020) and the graph format in Hsiao et al. (2020) have been the two major approaches to handling the memorized information in a structured way. Kim et al. (2020) suggested update operations on fixed-sized slot-value pairs for dialogue states. Wu et al. (2020) extracted user attributes from dialogues as triples. However, such approaches have not been demonstrated in a multi-session setting.

Leveraging the advancement of pre-trained language models (Devlin et al., 2019; Raffel et al., 2020; Brown et al., 2020; Kim et al., 2021), recent studies attempt to use the unstructured form of text as memory, which is expected to be advantageous in terms of generalizability and interpretability. Ma et al. (2021) and Xu et al. (2022b) selectively stored dialogue history with relevant information, while Zhong et al. (2022) employed refiners to extract fine-grained information from dialogue history. Xu et al. (2022a) summarized the dialogue history to avoid overflow and redundancy. Nevertheless, these works rarely consider that the obtained information may change and become outdated. Specifically, MSC (Xu et al., 2022a) does not reflect changes of information; in other words, information in MSC remains fixed once it is stored. DuLeMon (Xu et al., 2022b) is not formatted in a multi-session manner, making it impossible to track memory changes across multiple sessions.
3 Task and Dataset
This section describes the task of long-term conversations with dynamic memory changes and the process of constructing a new dataset to conduct research on this task.
3.1 Task Definition
An episode consists of multiple consecutive dialogue sessions with a specific user. The dialogue context of the current session at time step $t$ is $D_t = \{c_1, u_1, c_2, u_2, \cdots, c_t, u_t\}$, where $c$ and $u$ represent the chatbot's and user's utterances, respectively. Natural language memory sentences $M = \{m_1, m_2, \cdots, m_n\}$ contain user information abstracted from the previous sessions of the same episode. Then, given the dialogue context $D_t$ and memory $M$, we are interested in predicting the chatbot's response $c_{t+1}$. At the end of each session, the entire session $D$ is summarized into several sentences of user information, denoted as $S = \{s_1, s_2, \cdots, s_k\}$. Memory sentences $M'$ for the next session are constructed by combining $M$ and $S$.
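To make the notation concrete, here is a minimal sketch of the episode structures in Python; the class and field names are our own choices for illustration and are not part of the released dataset format.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class Session:
    # Alternating (chatbot utterance c_i, user utterance u_i) pairs forming D_t.
    turns: List[Tuple[str, str]] = field(default_factory=list)
    # S: summary sentences of user information written at the end of the session.
    summary: List[str] = field(default_factory=list)

@dataclass
class Episode:
    # Consecutive dialogue sessions with the same user.
    sessions: List[Session] = field(default_factory=list)
    # M: memory sentences carried into the current session (empty for the
    # first session); after each session, M and S are combined into M'
    # for the next session.
    memory: List[str] = field(default_factory=list)
```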
3.2 Dataset Construction
To study this task, we build a new dataset based on the CareCall dataset (Bae et al., 2022), available at https://github.com/naver-ai/carecall-corpus, which consists of single sessions of open-domain dialogues between bots and users. We choose this dataset because the sessions contain various topics that are likely to change in a short period of time, such as the user's health, sleep, and diet, as well as topics that change over a relatively longer period of time, such as family, pets, and frequently visited places. We extend this single-session dataset to a multi-session setting, following a procedure similar to that of MSC (Xu et al., 2022a). Our resulting dataset contains more persona updates than other datasets (Xu et al., 2022a,b) (see Section C.1 in the Appendix for more details).

Table 1: Statistics of our CareCall_mem dataset. Distinct-1/2 (Li et al., 2016) is the number of distinct uni- or bi-grams divided by the total number of words.

Sessions: 7,665
  Session 1: 2,812
  Session 2: 2,798
  Session 3: 743
  Session 4: 674
  Session 5: 638
Turns: 160,191
Avg. turns per session: 20.90
Avg. words per turn: 4.93
Unique words for all turns: 59,434
Distinct-1/2 for all turns: 0.0753 / 0.2891
Avg. memory sentences per session |M|: 3.41
Avg. summary sentences per session |S|: 2.88
Avg. words per summary sentence: 4.70
Distinct-1/2 for all summary sentences: 0.1425 / 0.3926
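As a quick reference for the Distinct-1/2 numbers in Table 1, the following is a minimal sketch of the metric as defined in the caption (distinct n-grams divided by the total number of words). The whitespace tokenization used here is an assumption and may not match the paper's exact preprocessing.

```python
from typing import List

def distinct_n(utterances: List[str], n: int) -> float:
    """Distinct-n per the Table 1 caption: the number of distinct n-grams
    divided by the total number of words. Tokenization is naive whitespace
    splitting, which is an assumption."""
    ngrams = set()
    total_words = 0
    for utt in utterances:
        tokens = utt.split()
        total_words += len(tokens)
        for i in range(len(tokens) - n + 1):
            ngrams.add(tuple(tokens[i:i + n]))
    return len(ngrams) / total_words if total_words else 0.0

turns = ["Got a sore throat", "How are you feeling today", "Feeling much better now"]
print(distinct_n(turns, 1), distinct_n(turns, 2))
```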
3.2.1 Preliminary Step: Dialogue and Summary

To efficiently collect the dataset, we train preliminary models for dialogue summarization and memory grounded dialogue to first automatically generate the data, which a group of annotators then revise. This procedure has been shown to be more effective in recent studies (Sun et al., 2021; Bae et al., 2022; Liu et al., 2022; Zheng et al., 2022). Throughout the process, we leverage large-scale language models (LMs) at each step, with HyperCLOVA 6.9B as the backbone LM.
Dialogue Summary  We randomly sample 600 dialogue sessions with more than 15 turns from the CareCall dataset. We ask annotators to summarize each session into several sentences to build S that may be useful for continuing the next conversation. Using these summaries, we fine-tune LMs to generate summaries given dialogues, i.e. $P(S|D)$. The models then generate summaries of unseen dialogues randomly sampled from the CareCall dataset. Finally, annotators edit the generated summaries by filling in missing information or correcting erroneous sentences. Since there is no memory sentence for the first session, i.e. $M = \emptyset$, the memory for the second session $M'$ is equal to $S$.
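As an illustration of how such (D, S) pairs might be serialized for fine-tuning a left-to-right LM on P(S|D), here is a minimal sketch. The prompt template and separators are our own assumptions, not the paper's actual HyperCLOVA training format.

```python
# Serialize a (dialogue D, summary S) pair into training text for a
# left-to-right LM that models P(S|D). The template below is illustrative.
from typing import List, Tuple

def format_example(dialogue: List[Tuple[str, str]], summary: List[str]) -> str:
    lines = []
    for bot_utt, user_utt in dialogue:
        lines.append(f"Bot: {bot_utt}")
        lines.append(f"User: {user_utt}")
    lines.append("Summary:")            # the model is trained to continue from here
    lines.extend(summary)
    return "\n".join(lines)

example = format_example(
    [("How have you been feeling?", "I got a sore throat."),
     ("Are you taking any medicine?", "Yes, I'm taking cold medicine.")],
    ["Got a sore throat", "Is taking cold medicine"],
)
print(example)
```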
Figure 2: The overview of the proposed system. (1) The memory grounded response generation model (Section 4.1), conditioned on memory sentences M, converses with the human user. (2) At the end of the session, the dialogue summarizer (Section 4.2) summarizes user information from the session history into several sentences S. (3) The memory operator (Section 4.3) predicts an operation for every (m_i, s_j) pair to select the information to keep, which constitutes the next memory M′.
Memory Grounded Dialogue  To build the second session of each episode, annotators write dialogue sessions grounded on the 600 human-written summaries from the previous step. Likewise, we fine-tune LMs to generate entire dialogue sessions given the previous memory, i.e. $P(D|M)$. Then, the fine-tuned models generate memory grounded dialogues from the unseen dialogue summaries of the previous paragraph. Lastly, human annotators revise the generated dialogues, i.e. correcting wrong responses (misuse of memory, not sensible, or out of bounds of CareCall's role as described in Bae et al. (2022)).
3.2.2 Interactive Step: Multi-Session Dialogue

From the preliminary step, we obtain the data to build a chatbot that can conduct interactive conversations utilizing the memorized information. To construct a multi-session dialogue system, we train the dialogue summarizer and the memory grounded response generator described in Section 4 on the previously collected (D, S) and (M, D) pairs, respectively.

Then, crowdworkers converse with the resulting system for 5 sessions per episode, starting from the first session. The interval between sessions is assumed to be 1 to 2 weeks. At the end of each session, the summarizer generates S from the current session. Both the generated responses and summaries are edited by annotators to correct errors. Lastly, we ask annotators to select which sentences in M and S should remain in the new memory M′ for the next session. We provide details of quality control in Appendix A and an example episode in Figure 4 in the Appendix. We name this dataset CareCall_mem; the statistics of the dataset are given in Table 1, which includes all the collected data described in Sections 3.2.1-3.2.2.
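The following sketch shows the per-episode loop implied by this collection procedure: respond within a session, summarize at the end, then carry an updated memory into the next session. The function names `generate_response`, `summarize_session`, and `update_memory` are placeholders for the models described in Section 4, and the human-revision steps performed by annotators are omitted from this sketch.

```python
from typing import Callable, List, Tuple

# Placeholder model interfaces (assumptions): in the paper these correspond to
# the memory grounded response generator, the dialogue summarizer, and the
# memory operator, respectively.
GenerateFn = Callable[[List[Tuple[str, str]], List[str]], str]
SummarizeFn = Callable[[List[Tuple[str, str]]], List[str]]
UpdateFn = Callable[[List[str], List[str]], List[str]]

def run_episode(user_reply: Callable[[str], str],
                generate_response: GenerateFn,
                summarize_session: SummarizeFn,
                update_memory: UpdateFn,
                num_sessions: int = 5,
                turns_per_session: int = 10) -> List[str]:
    """Simulate one episode: converse over several sessions, carrying memory forward."""
    memory: List[str] = []                       # M is empty before the first session
    for _ in range(num_sessions):
        history: List[Tuple[str, str]] = []      # D_t for the current session
        for _ in range(turns_per_session):
            bot_utt = generate_response(history, memory)
            user_utt = user_reply(bot_utt)
            history.append((bot_utt, user_utt))
        summary = summarize_session(history)     # S from the finished session
        memory = update_memory(memory, summary)  # M' built from M and S
    return memory
```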
4 Models
We propose a long-term dialogue system with a memory management mechanism. The system consists of three parts: memory grounded response generation, dialogue summarization, and memory update. The overall architecture is shown in Figure 2.
4.1 Memory Grounded Response Generation
Response Generation  We consider the response generation model conditioned on memory sentences. Given the memory $M$ and the dialogue history $D_t = \{c_1, u_1, c_2, u_2, \cdots, c_t, u_t\}$ at time step $t$, the conditional probability of the next target response $c_{t+1} = \{w_1, w_2, \cdots, w_{|c_{t+1}|}\}$ can be written as the product of a sequence of conditional probabilities:

$$p(c_{t+1} \mid D_t, M) = \prod_{i} p_\theta(w_i \mid D_t, M, w_{<i}), \tag{1}$$

where $w_i$ is the $i$-th token of the sequence and $\theta$ denotes the trainable parameters of the model. We use HyperCLOVA 6.9B as the response generation model.
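As a rough illustration of Eq. (1), the sketch below conditions a generic Hugging Face causal LM on the memory sentences and the dialogue history by concatenating them into a single prompt. The prompt template, separators, and the stand-in checkpoint are our own assumptions; the paper fine-tunes HyperCLOVA 6.9B, which is not what is loaded here.

```python
# Minimal sketch of memory-conditioned response generation (Eq. 1) with a
# generic causal LM from Hugging Face transformers. The prompt format and the
# model checkpoint are illustrative assumptions, not the paper's setup.
from transformers import AutoModelForCausalLM, AutoTokenizer

def generate_response(memory, history, model, tokenizer, max_new_tokens=48):
    # Serialize memory sentences M and dialogue history D_t into one prompt.
    lines = ["Memory:"] + [f"- {m}" for m in memory] + ["Dialogue:"]
    for bot_utt, user_utt in history:
        lines += [f"Bot: {bot_utt}", f"User: {user_utt}"]
    lines.append("Bot:")                       # the model continues with c_{t+1}
    prompt = "\n".join(lines)

    inputs = tokenizer(prompt, return_tensors="pt")
    output = model.generate(**inputs, max_new_tokens=max_new_tokens,
                            do_sample=True, top_p=0.9)
    # Keep only the newly generated tokens (the response c_{t+1}).
    new_tokens = output[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True).strip()

if __name__ == "__main__":
    name = "gpt2"                              # stand-in checkpoint for illustration
    tok = AutoTokenizer.from_pretrained(name)
    lm = AutoModelForCausalLM.from_pretrained(name)
    print(generate_response(["Just got positive results from COVID test"],
                            [("How are you feeling today?", "A bit tired.")],
                            lm, tok))
```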