
Team Flow at DRC2022: Pipeline System for Travel Destination
Recommendation Task in Spoken Dialogue
Ryu Hirai1∗, Atsumoto Ohashi1∗, Ao Guo1, Hideki Shiroma1, Xulin Zhou1,
Yukihiko Tone1, Shinya Iizuka2 and Ryuichiro Higashinaka1
Abstract— The Dialogue Robot Competition 2022 (DRC2022) was held to improve the interactive capabilities of dialogue systems, e.g., their ability to adapt to different customers. As one of the participating teams, we built a dialogue system with a pipeline structure consisting of four modules. The natural language understanding (NLU) and natural language generation (NLG) modules were GPT-2-based models, and the dialogue state tracking (DST) and policy modules were designed on the basis of hand-crafted rules. After the preliminary round of the competition, we found that the limited performance of our system was mainly caused by the low variation in the training examples for the NLU module and by failed recommendations resulting from the rule-based policy.
I. INTRODUCTION
With the popularization of human-machine dialogue, dialogue systems are expected to achieve objectives in various situations, e.g., responding appropriately to different customers in a customer service task. To improve the interactive capabilities of dialogue systems, the Dialogue Robot Competition 2022 (DRC2022) [1] was held as a follow-up to the previous competition [2]. Each team was required to develop a dialogue system embedded in a humanoid robot to handle the “travel destination recommendation task.” In this task, the robot plays the role of a counter salesperson whose goal is to satisfy the customer by helping him/her choose one of two tourist attractions.
This paper reports the work of the team “Flow” in
DRC2022. Our dialogue system was built with a pipeline
composed of four modules: natural language understanding
(NLU), dialogue state tracking (DST), policy, and natural
language generation (NLG). Configuring the system as a pipeline (1) makes it easy to tune the functionality of each module and (2) allows us, in the future, to introduce a method such as [3] that integrates all modules and optimizes the dialogue performance of the entire system. We built the NLU and NLG modules by fine-tuning GPT-2 [4], a popular large-scale language model, on data we collected through crowdsourcing for the travel destination recommendation task. We further designed the DST and policy modules with hand-crafted rules.
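To make the NLU design concrete, the following minimal sketch (in Python, using the Hugging Face Transformers library) shows one way to treat dialogue-act prediction as text generation with a fine-tuned GPT-2; the checkpoint name, prompt format, and function name are illustrative assumptions, not our actual implementation.

# Minimal sketch: dialogue-act prediction as text generation with GPT-2.
# The "gpt2" checkpoint and the prompt format are placeholders for illustration.
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

def predict_dialogue_act(utterance: str) -> str:
    prompt = f"Utterance: {utterance}\nDialogue act:"
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=32,
                             pad_token_id=tokenizer.eos_token_id)
    decoded = tokenizer.decode(outputs[0], skip_special_tokens=True)
    return decoded[len(prompt):].strip()

In practice, the model would first be fine-tuned on utterance-DA pairs so that it learns to emit the DA string following the prompt.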
After the preliminary round of the competition, we exam-
ined the evaluation results and dialogue histories. We found
two main reasons for the limited performance: (1) the NLU module was not able to identify the customer's dialogue acts properly due to the low variation in the training examples for GPT-2, and (2)
1 Graduate School of Informatics, Nagoya University, Japan
hirai.ryu.k6@s.mail.nagoya-u.ac.jp
2 School of Informatics, Nagoya University, Japan
∗ Equal contribution.
Fig. 1. Diagram of the pipeline structure of our spoken dialogue system. At each turn, the customer's speech recognition result obtained by automatic speech recognition (ASR) is processed by the NLU, DST, policy, and NLG modules to generate the system's response text, which is finally converted to speech by text-to-speech (TTS) to respond to the customer.
the rules of the policy resulted in a recommendation strategy that ignored the customer's preferences.
II. IMPLEMENTATION
Fig. 1 shows the pipeline structure of the spoken dialogue
system our team implemented. At each turn, the customer’s
speech input to the robot is converted into text by the
automatic speech recognition (ASR) module, and the utterance text is input to the NLU module. The NLU predicts the customer's dialogue act (DA), which is a semantic representation of the customer's utterance. The DST module then updates the dialogue state on the basis of the customer's DA. The dialogue state consists of information such as the history of DAs, the customer profile, and the belief state, which is a set of the customer's preferences regarding travel. The policy module decides the next action to be taken by the system, in the form of a system DA, on the basis of the dialogue state and
information on tourist attractions from the database. The
NLG module then converts the system’s DA into a system
utterance. Finally, the text-to-speech (TTS) module responds
to the customer by converting the text response to speech.
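As a rough illustration of this turn-level flow, the sketch below chains the four modules between the ASR output and the TTS input; all class and method names are hypothetical placeholders rather than the actual interfaces of our system.

# Illustrative sketch of one dialogue turn through the pipeline
# (hypothetical interfaces; not the actual implementation).
from dataclasses import dataclass, field

@dataclass
class DialogueState:
    da_history: list = field(default_factory=list)      # past customer/system DAs
    customer_profile: dict = field(default_factory=dict)
    belief_state: dict = field(default_factory=dict)     # customer's travel preferences

def one_turn(asr_text, state, nlu, dst, policy, nlg, db):
    customer_da = nlu.predict_da(asr_text)      # GPT-2-based NLU
    state = dst.update(state, customer_da)      # rule-based DST
    system_da = policy.next_action(state, db)   # rule-based policy with attraction database
    return nlg.generate(system_da)              # GPT-2-based NLG; the text then goes to TTS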
ASR and TTS were implemented using the Google Speech Recognition system and the Amazon Polly API, respectively, both of which were provided by the competition organizers. The
robot’s expression control and motion control were based
on the expression and motion rules defined for each system
dialogue act.
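Because the robot's expressions and motions are driven by rules keyed on the system DA, this control can be pictured as a simple lookup table; the DA names and motion labels in the sketch below are invented for illustration only.

# Hypothetical mapping from system dialogue acts to expression/motion commands.
EXPRESSION_MOTION_RULES = {
    "greet":     {"expression": "smile",   "motion": "bow"},
    "recommend": {"expression": "smile",   "motion": "gesture_toward_display"},
    "confirm":   {"expression": "neutral", "motion": "nod"},
}

def control_robot(system_da):
    # Fall back to a neutral pose for DAs without a dedicated rule.
    return EXPRESSION_MOTION_RULES.get(system_da, {"expression": "neutral", "motion": "idle"})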
In the following sections, we first describe the DA ontology for the travel destination recommendation task and then the NLU, DST, policy, and NLG modules. Next, we describe the robot's facial expression control and motion control. Finally, we report and discuss the evaluation results.