Schema Encoding for Transferable Dialogue State Tracking

Hyunmin Jeon
Computer Science and Engineering
POSTECH, Pohang, South Korea
jhm9507@postech.ac.kr
Gary Geunbae Lee
Computer Science and Engineering
Graduate School of Artificial Intelligence
POSTECH, Pohang, South Korea
gblee@postech.ac.kr
Abstract
Dialogue state tracking (DST) is an essential sub-task for task-oriented dialogue systems. Recent work has focused on deep neural models for DST. However, neural models require a large dataset for training. Furthermore, applying them to another domain requires a new dataset because neural models are generally trained to imitate the given dataset. In this paper, we propose Schema Encoding for Transferable Dialogue State Tracking (SET-DST), a neural DST method for effective transfer to new domains. Transferable DST could assist the development of dialogue systems even with little data on target domains. We use a schema encoder not just to imitate the dataset but to comprehend its schema. We aim to transfer the model to new domains by encoding new schemas and using them for DST in multi-domain settings. As a result, SET-DST improved the joint accuracy by 1.46 points on MultiWOZ 2.1.
1 Introduction
The objective of task-oriented dialogue systems is to help users achieve their goals through conversations. Dialogue state tracking (DST) is an essential sub-task that enables the systems to serve this purpose. Users convey the details of their goals to the systems during the conversations, e.g., what kind of food they want the restaurant to serve and at what price level they want to book the hotel. Thus, the systems should accurately capture these details from utterances. The systems should also communicate with other systems by using APIs to achieve users' goals, e.g., to search for restaurants and to reserve hotels. The goal of DST is not only to classify the users' intents but also to fill the details into predefined templates that are used to call APIs.
Recent work has used deep neural networks for DST with supervised learning. These models have improved the accuracy of DST; however, they require a large dataset for training. Furthermore, they need a new dataset to be trained on another domain. Unfortunately, a large dataset for training a DST model is not easy to develop in the real world. The motivation of supervised learning is to make deep neural networks imitate humans, but they actually imitate the given datasets rather than humans. A person who has performed hotel reservation work could easily perform restaurant reservation work if some guidelines were provided, but neural models may have to be trained on a new dataset for the restaurant domain. The difference between humans and neural models is that humans can learn how to read guidelines and apply them to their work. This is why transfer learning is important for training neural models on new domains.
In this paper, we propose Schema Encoding for Transferable Dialogue State Tracking (SET-DST), a neural DST method that transfers to new domains by using dataset schemas as guidelines for DST. The motivation of this study is that humans can learn not only how to do their work, but also how to apply guidelines to that work. We aim to make a neural model learn how to apply schema guidelines to DST, beyond merely filling predefined slots by imitating the dataset in multi-domain settings. The schema includes metadata of the dataset, e.g., which domains the dataset covers and which slots have to be filled to achieve goals. SET-DST has a schema encoder to represent the dataset schema, and it uses the schema representation to understand utterances and to fill slots. Transfer learning has become increasingly important because developing new datasets is costly; it makes it possible to pre-train neural models on large-scale datasets and then effectively fine-tune them on small-scale downstream tasks. We used SGD (Rastogi et al., 2020) as the large-scale dataset and evaluated SET-DST on MultiWOZ 2.1 (Eric et al., 2020), a standard benchmark dataset for DST, as the downstream task.
[Figure 1 omitted: a diagram of the schema encoder and the GPT-2 state generator, in two panels: (a) schema encoding for active slot and intent classification, and (b) dialogue state generation.]

Figure 1: Overview of SET-DST. The schema encoder takes the dataset schema and generates slot vectors and intent vectors. The state generator takes the previous dialogue state $D_{t-1}$ and the dialogue history $H_t$ to calculate active scores of slots and intents; $F$ is a score function that determines whether a slot or intent is activated at turn $t$. The state generator then additionally takes the activated slots and intents to generate the current dialogue state $D_t$. $S_t$ denotes the activated slots and $I_t$ denotes the activated intents.
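To make step (a) concrete, the snippet below is a toy, runnable sketch of how activation scores could be computed. The exact form of $F$ is defined later in the paper, not here; the sigmoid-over-dot-product scorer and all shapes are illustrative assumptions.

```python
import torch

# Toy sketch of step (a) in Figure 1. The context vector stands in for the
# state generator's encoding of (D_{t-1}, H_t); the sigmoid-over-dot-product
# form of the score function F is an assumption for illustration only.
h = 768                             # hidden size
context = torch.randn(h)            # encoding of previous state and history
slot_vectors = torch.randn(4, h)    # V_S from the schema encoder (4 slots)
intent_vectors = torch.randn(2, h)  # V_I from the schema encoder (2 intents)

slot_scores = torch.sigmoid(slot_vectors @ context)      # one score per slot
intent_scores = torch.sigmoid(intent_vectors @ context)  # one score per intent
active_slots = slot_scores > 0.5     # slots treated as activated at turn t
active_intents = intent_scores > 0.5
```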
SET-DST achieved state-of-the-art accuracy on the downstream DST task. We further confirmed that SET-DST worked well even when the downstream dataset was small. These results demonstrate that transfer learning with schema encoding improves both the performance of neural DST models and the efficiency of few-shot learning for DST.
2 Related Work
Traditional DST models extract semantics by using natural language understanding (NLU) modules to generate dialogue states (Williams, 2014; Wang and Lemon, 2013). The limitation of these models is that they rely on features extracted by humans. Recent work has focused on building end-to-end DST models without hand-crafted features. Zhong et al. (2018) use global modules to share parameters between different slots. Nouri and Hosseini-Asl (2018) improve latency by removing inefficient recurrent layers. Transferable DST models that can be adapted to new domains by removing the dependency on the domain ontology have been proposed (Ren et al., 2018; Wu et al., 2019). Zhou and Small (2019) attempt to solve DST as a question answering task using a knowledge graph.
More recently, large-scale pre-trained language models such as BERT (Devlin et al., 2019) and GPT-2 (Radford et al., 2019) have been used for DST. Pre-trained BERT acts as an NLU module to understand utterances (Lee et al., 2019; Zhang et al., 2020a; Kim et al., 2020; Heck et al., 2020). GPT-2 makes it possible to solve DST as a conditional language modeling task (Hosseini-Asl et al., 2020; Peng et al., 2021).
Rastogi et al. (2020) propose a baseline method that defines the schema of a dataset and uses it for training and inference. A drawback of this method is its high computation cost, because it uses the domain ontology and accesses all values to estimate the dialogue state. DST models that use schema graphs to encode the relations between slots and values have been proposed (Chen et al., 2020; Zhu et al., 2020). However, they focus on encoding the relations between slots and values of the given domains, not on adaptation to new domains.
In this paper, we focus on making the model learn how to understand the schema and how to apply it to estimate the dialogue state, not just on encoding in-domain relations.
3 Schema Encoding for Transferable Dialogue State Tracking
In this section, we describe the architecture of SET-DST and how to optimize it. Figure 1 shows the overview of our method. The model consists of the schema encoder and the state generator. SET-DST generates the dialogue state in two steps: (a) schema encoding and classification, and (b) dialogue state generation. In this paper, we define some terms as follows.
Schema: Metadata of the dataset, e.g., what domains, services, slots, and intents the dataset covers. A dataset has a schema that describes it.
Domain: The topic that a conversation is about, e.g., restaurant, hotel, and attraction. A conversation can span multiple domains.
Service_name: Restaurants_1
Description: A leading provider for restaurant search and reservations

Slot_name: restaurant_name
Description: Name of the restaurant
Slot_name: price_range
Description: Price range for the restaurant
...

Intent_name: ReserveRestaurant
Description: Reserve a table at a restaurant
Intent_name: FindRestaurants
Description: Find a restaurant of a particular cuisine in a city

Figure 2: Example of the schema for a restaurant search and reservation service, including slots and intents.
Service: What the system provides to users. A service is similar to a domain, but at the application level. For example, the restaurant domain can have two different services: (1) a service for searching and reserving restaurants and (2) a service focused on searching and comparing restaurants. In the real world, a service corresponds to an application.
Action: An abstract action that users take to achieve their goals during conversations, e.g., informing the system of their requirements or requesting information from the system. Appendix B describes the details of the user actions covered in this paper.
Slot: A detail of the user goals, e.g., the type of food or the price range of the hotel. Slots are predefined based on the domains or services that the system should cover, and they are filled by DST. The schema includes the information about the slots.
Value: A value that carries the actual meaning for the corresponding slot, e.g., cheap or expensive for the price range of the hotel. The systems should match slot-value pairs from conversations.
Intent: A sub-goal toward the final goal of the user. A goal consists of one or more intents, and an intent is achieved over one or more conversation turns. In the real world, an intent corresponds to an API; for example, searching for restaurants or booking hotels should be performed through the APIs of external systems. Furthermore, the dialogue system should predict the slot-value pairs that correspond to the arguments used to call the APIs.
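To ground this terminology, here is a hedged Python sketch of a schema and a dialogue state. The field names loosely follow the SGD-style schema in Figure 2; they are illustrative, not the paper's exact data format.

```python
# Illustrative only: field names loosely follow the SGD-style schema in
# Figure 2 and are not the paper's exact data format.
schema = {
    "service_name": "Restaurants_1",  # a service within the restaurant domain
    "description": "A leading provider for restaurant search and reservations",
    "slots": [
        {"name": "restaurant_name", "description": "Name of the restaurant"},
        {"name": "price_range", "description": "Price range for the restaurant"},
    ],
    "intents": [
        {"name": "ReserveRestaurant",
         "description": "Reserve a table at a restaurant"},
        {"name": "FindRestaurants",
         "description": "Find a restaurant of a particular cuisine in a city"},
    ],
}

# A dialogue state as filled by DST: the slot-value pairs become the
# arguments of the API call that realizes the active intent.
dialogue_state = {
    "active_intent": "ReserveRestaurant",
    "slot_values": {"restaurant_name": "Gourmet Burger", "price_range": "cheap"},
}
```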
3.1 Schema Encoding
We use pre-trained BERT¹ for the schema encoder. Figure 2 shows an example of the schema for the Restaurants_1 service, a service to search for and reserve restaurants. Services, slots, and intents each consist of a name and a short description. The name and description of the service in the schema are fed into BERT to generate the service vector $v_R$ as

$$o_R = \text{BERT}([\text{CLS}]\, n_R : d_R\, [\text{SEP}]), \qquad v_R = W_R \cdot o_R^{[\text{CLS}]} \in \mathbb{R}^h, \tag{1}$$
where $n_R$ is the service name, $d_R$ is the service description, and $h$ is the hidden size. $o_R^{[\text{CLS}]}$ is the output at the $[\text{CLS}]$ token, and $W_R \in \mathbb{R}^{h \times h}$ is a fully connected (FC) layer. $[\text{CLS}]$ and $[\text{SEP}]$ are special tokens that mark the start and end of the sentence, respectively. The service in Figure 2 can be represented as "[CLS] Restaurants_1 : A leading provider for restaurant search and reservations [SEP]" to be fed into BERT.
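As a hedged illustration (not the authors' released code), Eq. (1) maps onto the HuggingFace transformers API roughly as follows; the bert-base-uncased checkpoint is an assumption.

```python
import torch
from transformers import BertModel, BertTokenizer

# Minimal sketch of Eq. (1); the bert-base-uncased checkpoint is assumed.
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
bert = BertModel.from_pretrained("bert-base-uncased")
h = bert.config.hidden_size
W_R = torch.nn.Linear(h, h, bias=False)  # the FC layer W_R

name = "Restaurants_1"
desc = "A leading provider for restaurant search and reservations"
# The tokenizer prepends [CLS] and appends [SEP] automatically, yielding
# "[CLS] Restaurants_1 : A leading provider ... [SEP]".
inputs = tokenizer(f"{name} : {desc}", return_tensors="pt")
o_R = bert(**inputs).last_hidden_state[:, 0]  # output at the [CLS] position
v_R = W_R(o_R)                                # service vector, shape (1, h)
```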
The slots and intents in the schema are also fed into BERT to generate slot vectors $V_S = \{v_S^1, \cdots, v_S^{N_S}\} \in \mathbb{R}^{N_S \times h}$ and intent vectors $V_I = \{v_I^1, \cdots, v_I^{N_I}\} \in \mathbb{R}^{N_I \times h}$, respectively, as follows:

$$o_S^j = \text{BERT}([\text{CLS}]\, n_S^j : d_S^j\, [\text{SEP}]), \qquad v_S^j = W_S \cdot o_S^{j,[\text{CLS}]} \in \mathbb{R}^h, \quad j \in [1, N_S], \tag{2}$$

$$o_I^k = \text{BERT}([\text{CLS}]\, n_I^k : d_I^k\, [\text{SEP}]), \qquad v_I^k = W_I \cdot o_I^{k,[\text{CLS}]} \in \mathbb{R}^h, \quad k \in [1, N_I]. \tag{3}$$
$N_S$ and $N_I$ denote the number of slots and intents for the service, respectively. $n_S^j$ is the $j$-th slot name, and $d_S^j$ is the $j$-th slot description. $o_S^{j,[\text{CLS}]}$ is the output at the $[\text{CLS}]$ token for the $j$-th slot, and $W_S \in \mathbb{R}^{h \times h}$ is an FC layer. Similarly, $n_I^k$ is the $k$-th intent name, and $d_I^k$ is the $k$-th intent description. $o_I^{k,[\text{CLS}]}$ is the output at the $[\text{CLS}]$ token for the $k$-th intent, and $W_I \in \mathbb{R}^{h \times h}$ is an FC layer. The schema encoder takes $v_R$, $V_S$, and $V_I$ to update the slot vectors $V_S$ and intent vectors $V_I$.
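As a complementary hedged sketch, Eqs. (2) and (3) can be batched over all slots and intents of a service. This reuses `tokenizer`, `bert`, and `h` from the Eq. (1) sketch above; the dict layout of the schema entries is illustrative.

```python
import torch

def encode_entries(entries, projection):
    """Encode "name : description" strings and project their [CLS] outputs."""
    texts = [f"{e['name']} : {e['description']}" for e in entries]
    batch = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)
    cls_outputs = bert(**batch).last_hidden_state[:, 0]  # shape (N, h)
    return projection(cls_outputs)                        # shape (N, h)

W_S = torch.nn.Linear(h, h, bias=False)  # FC layer W_S for slots
W_I = torch.nn.Linear(h, h, bias=False)  # FC layer W_I for intents

slots = [
    {"name": "restaurant_name", "description": "Name of the restaurant"},
    {"name": "price_range", "description": "Price range for the restaurant"},
]
intents = [
    {"name": "ReserveRestaurant",
     "description": "Reserve a table at a restaurant"},
    {"name": "FindRestaurants",
     "description": "Find a restaurant of a particular cuisine in a city"},
]
V_S = encode_entries(slots, W_S)    # slot vectors, shape (N_S, h)
V_I = encode_entries(intents, W_I)  # intent vectors, shape (N_I, h)
```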
¹The pre-trained models are available at https://github.com/huggingface/transformers.