Augmenting Multi-Turn Text-to-SQL Datasets with Self-Play
Qi Liu1, Zihuiwen Ye2, Tao Yu1, Phil Blunsom2, Linfeng Song3
1The University of Hong Kong, 2University of Oxford, 3Tencent AI Lab, Bellevue, WA, USA
{liuqi, tyu}@cs.hku.hk; {zihuiwen.ye, phil.blunsom}@cs.ox.ac.uk; lfsong@tencent.com
*Equal contribution.
1Our code is available at: https://github.com/leuchine/self_play_picard
Abstract
The task of context-dependent text-to-SQL aims to convert multi-turn user utterances to formal SQL queries. This is a challenging task due to both the scarcity of training data from which to learn complex contextual dependencies and the need to generalize to unseen databases. In this paper we explore augmenting the training datasets using self-play, which leverages contextual information to synthesize new interactions to adapt the model to new databases. We first design a SQL-to-text model conditioned on a sampled goal query, which represents a user's intent, that then converses with a text-to-SQL semantic parser to generate new interactions. We then filter the synthesized interactions and retrain the models with the augmented data. We find that self-play improves the accuracy of a strong baseline on SParC and CoSQL, two widely used cross-domain text-to-SQL datasets. Our analysis shows that self-play simulates various conversational thematic relations, enhances cross-domain generalization and improves beam search.1
1 Introduction
Multi-turn text-to-SQL translation is a powerful semantic parsing paradigm that converts natural language user utterances into executable SQL queries in a conversational environment. Compared to regular text-to-SQL tasks such as Spider (Yu et al., 2018b) and GeoQuery (Zelle and Mooney, 1996), conversational text-to-SQL requires interpreting coreference and omission phenomena that frequently appear in human conversations. To be effective, text-to-SQL models must uncover complex contextual dependencies while grounding user utterances in task-specific database schemas.
Numerous architectures and pretraining methods have been proposed for tackling context-dependent text-to-SQL (Suhr et al., 2018; Zhang et al., 2019; Hui et al., 2021; Scholak et al., 2021; Yu et al., 2021; Xie et al., 2022). However, the size of the datasets used has been limited due to the high cost of annotating multi-turn dialogue and SQL pairs, which often requires trained experts. Existing multi-turn text-to-SQL datasets, such as SParC (Yu et al., 2019b) and CoSQL (Yu et al., 2019a), require text-to-SQL parsers to generalize to unseen databases at test time, but doing so is difficult with limited training context.
In this paper we propose the use of self-play to augment multi-turn text-to-SQL datasets in order to achieve more robust generalization. Self-play simulates interactions between multiple artificial agents in order to generate a training signal in addition to supervised data. It has been successfully applied in a wide range of tasks, e.g. board games (Silver et al., 2016, 2018) and multiplayer battle games (Vinyals et al., 2019; Berner et al., 2019). It has also been applied in dialogue simulations, during which a dialogue model converses with a user simulator to generate synthetic dialogues (Schatzmann et al., 2006; Gür et al., 2018; Tseng et al., 2021). In our work, we extend self-play to semantic parsing.
Although self-play has been adopted in task-oriented dialogue, the need to pre-define a domain-specific ontology of slot-value pairs (e.g. the slot value "price=expensive" for a restaurant booking) (Henderson et al., 2014; Wen et al., 2016; Budzianowski et al., 2018) prevents self-play from simulating interactions in a new domain. Adding a new domain for task-oriented dialogue is difficult and labor-intensive. On the other hand, text-to-SQL tasks (Yu et al., 2018b, 2019b,a) use a domain-independent formalism, i.e. SQL queries. We demonstrate that self-play is well-suited to simulating interactions in a new domain given a database schema, improving cross-domain generalization.
Figure 1: Multi-turn text-to-SQL with self-play. We transform an interaction from SParC on the left to seq2seq formats (top: text-to-SQL, bottom: SQL-to-text). User utterances, SQL queries, databases, and user goals are concatenated with a " | " symbol and shown in green, blue, yellow, and purple respectively. We use self-play to generate synthetic interactions. The synthetic interactions are filtered and used to retrain the text-to-SQL and SQL-to-text models. (The top panel shows the text-to-SQL input "current utterance | database | previous utterances" with the next SQL query as output; the bottom panel shows the SQL-to-text input "user goal | previous utterances | last SQL query | database" with the next user utterance as output.)

We use PICARD (Scholak et al., 2021) as the
base of our text-to-SQL model. When generating a new interaction, we first sample a SQL query with the method of Zhong et al. (2021) as the goal query and condition the SQL-to-text model on this sampled SQL. The text-to-SQL model converses with the SQL-to-text model to simulate a new interaction. We filter out the interactions that are not grounded to the sampled goals and employ self-training (Yarowsky, 1995; Zoph et al., 2020) to retrain the text-to-SQL model and the SQL-to-text model. We conduct extensive experiments on SParC and CoSQL. Our main findings are:
• Self-play helps the text-to-SQL model learn various conversational thematic relations (§5.3) and improves cross-domain generalization (§5.1).
• Self-play improves performance on the majority of SQL types. Models after self-play perform particularly well on queries of medium difficulty (§5.1).
• Self-play improves beam search. Models after self-play are less sensitive to the beam size and can perform well even with small beam sizes (§5.2).
2 Preliminary
In this section, we formally define the multi-turn text-to-SQL task and introduce the PICARD (Scholak et al., 2021) model, which we use as our baseline. PICARD obtains state-of-the-art results on several text-to-SQL tasks.
2.1 Task Definition
In context-dependent text-to-SQL tasks, we are given interactions between a user and a system. Each interaction spans multiple turns. The user ends the interaction when the query returns the required information from the database. Formally, at each turn $t$ (where $1 \le t \le T$), multi-turn text-to-SQL produces a valid and executable SQL query $Q_t$ given a database $D$, a current user utterance $U_t$, and a dialogue context $C_t$ (which is usually the previous user utterances $U_{<t}$):
$$p(Q_t \mid U_t, C_t, D). \tag{1}$$
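To make the formulation concrete, the following minimal Python sketch shows the per-turn prediction interface implied by Eq. 1; the class and method names (e.g. parser.generate) are hypothetical and not taken from the released code.

from dataclasses import dataclass, field
from typing import List


@dataclass
class Turn:
    utterance: str      # U_t: the user utterance at this turn
    sql: str = ""       # Q_t: the gold or predicted SQL query for this turn


@dataclass
class Interaction:
    schema: str                                      # D: serialized database schema
    turns: List[Turn] = field(default_factory=list)

    def context(self, t: int) -> List[str]:
        # C_t: the previous user utterances U_{<t}
        return [turn.utterance for turn in self.turns[:t]]


def predict_sql(parser, interaction: Interaction, t: int) -> str:
    # Model the conditional p(Q_t | U_t, C_t, D) of Eq. 1 with a seq2seq parser.
    u_t = interaction.turns[t].utterance
    c_t = interaction.context(t)
    return parser.generate(utterance=u_t, context=c_t, schema=interaction.schema)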
2.2 Baseline: PICARD
We use PICARD (Scholak et al., 2021) as our baseline conditional model for Equation 1. PICARD serializes the database schema $D$ into a sequence following Lin et al. (2020). An example of the input and output format is shown in Figure 1. PICARD finetunes T5 (Raffel et al., 2019), a sequence-to-sequence transformer, with input and output sequences. PICARD proposes an incremental parsing method for constrained decoding during beam search. Specifically, it rejects inadmissible tokens at each beam search step subject to parsing rules that encode lexical and grammatical constraints. Only the beam hypotheses that pass all the constraint checks are kept. PICARD also leverages SQL schema information, such as the column names of each table, to impose checks on the validity of the generated SQL. PICARD greatly reduces the likelihood of decoding invalid SQL queries.
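As an illustration of this setup, the sketch below serializes an input in the "current utterance | database | previous utterances" format of Figure 1 and uses Hugging Face's prefix_allowed_tokens_fn hook as a stand-in for constrained beam search. The is_admissible check is a hypothetical placeholder, "t5-base" stands in for a fine-tuned text-to-SQL checkpoint, and PICARD's actual incremental parser is substantially more complete and efficient than this vocabulary scan.

from transformers import T5ForConditionalGeneration, T5TokenizerFast

tokenizer = T5TokenizerFast.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base")


def serialize_input(utterance: str, schema: str, prev_utterances: list) -> str:
    # "current utterance | database | previous utterances" (Figure 1, top)
    return " | ".join([utterance, schema, *prev_utterances])


def is_admissible(prefix: str, token: str) -> bool:
    # Hypothetical placeholder for PICARD-style lexical/grammatical checks
    # (SQL keywords, table and column names from the schema, quoting, ...).
    return True


def prefix_allowed_tokens_fn(batch_id, input_ids):
    # Reject inadmissible continuations at each beam-search step.
    # (Scanning the full vocabulary here is for illustration only.)
    prefix = tokenizer.decode(input_ids, skip_special_tokens=True)
    allowed = [t for t in range(len(tokenizer))
               if is_admissible(prefix, tokenizer.convert_ids_to_tokens(t))]
    return allowed or list(range(len(tokenizer)))


source = serialize_input("Of these, which is Jetblue Airways?",
                         "flight_2 | airlines : id, airline, ...",
                         ["What are all the airlines?"])
inputs = tokenizer(source, return_tensors="pt")
outputs = model.generate(**inputs, num_beams=4, max_length=128,
                         prefix_allowed_tokens_fn=prefix_allowed_tokens_fn)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))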
3 Method
Here we introduce how we use self-play for data augmentation. We first design a SQL-to-text model (§3.1). Next, we describe how to use self-play to generate synthetic interactions (§3.2). Finally, we explain how we incorporate the generated data for self-training (§3.3).
3.1 The SQL-to-Text Model
We design a user simulator, which is a SQL-to-text model, to converse with the text-to-SQL model to generate synthetic interactions. Specifically, at each turn $t$ we would like the user simulator to produce a meaningful question that would naturally be asked by a human user. In each interaction, a user has a goal to achieve. We explicitly condition the SQL-to-text model on a user goal, $G$, to encourage the user simulator to ask questions that are grounded to this goal. Formally, the SQL-to-text model calculates the following conditional at each turn:
$$p(U_t \mid Q_{t-1}, C_t, G, D), \tag{2}$$
where the context $C_t$ contains the previous user utterances $U_{<t}$. During training, $G$ is the SQL query of the final turn $T$, i.e. $Q_T$. During inference we adopt Zhong et al. (2021) to sample a new goal query as shown in §3.2. We employ the seq2seq approach and parameterize the SQL-to-text model (Eq. 2) with T5. We concatenate the user goal $G$, the last SQL query $Q_{t-1}$, the previous user utterances $U_{<t}$, and the serialized schema $D$ to predict the next user utterance $U_t$. For example, one input would be: "user goal | previous utterances | last SQL query | serialized database". Its target label is the correct user utterance for the next turn. We pad the last utterance with a special stop-of-interaction symbol. In SQL-to-text, there could be multiple reasonable questions to ask for the next turn, i.e. a one-to-many relation. A well-trained SQL-to-text model can generate new questions, thereby increasing the diversity of user dialogue flows in the dataset and improving generalization.
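A small sketch of this serialization is shown below; the exact delimiter handling and the name of the stop-of-interaction token are assumptions for illustration rather than the paper's exact choices.

STOP_SYMBOL = "<stop-of-interaction>"   # assumed token name


def serialize_sql_to_text_input(goal_sql: str, prev_utterances: list,
                                last_sql: str, schema: str) -> str:
    # "user goal | previous utterances | last SQL query | serialized database"
    return " | ".join([goal_sql, " ".join(prev_utterances), last_sql, schema])


def make_target(next_utterance: str, is_last_turn: bool) -> str:
    # The final user utterance is padded with the stop symbol so the simulator
    # learns when to end an interaction during self-play.
    return f"{next_utterance} {STOP_SYMBOL}" if is_last_turn else next_utterance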
Algorithm 2: Self-training.
Input: Gold interactions I, number of iterations k for synthetic data generation, threshold w.
Output: A text-to-SQL model and a SQL-to-text model.
Pretrain a text-to-SQL model p(Q_t | U_t, C_t, D) and a SQL-to-text model p(U_t | Q_{t-1}, C_t, G, D) on I.
I' ← ∅
for i in (1, ..., k) do
    Sample a goal query G.
    Generate a synthetic interaction I_S by self-play between text-to-SQL and SQL-to-text.
    Calculate score(Q_T, G) on I_S.
    if score(Q_T, G) > w then
        Add I_S to I'.
Retrain p(Q_t | U_t, C_t, D) and p(U_t | Q_{t-1}, C_t, G, D) on I ∪ I'.
return the retrained text-to-SQL model and the SQL-to-text model.
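The following Python skeleton mirrors Algorithm 2 above; pretrain, retrain, sample_goal, self_play, and score are placeholders for the components described in this section, not functions from the released code.

def self_training(gold_interactions, k, w,
                  pretrain, retrain, sample_goal, self_play, score):
    # Pretrain the text-to-SQL and SQL-to-text models on the gold interactions I.
    text_to_sql, sql_to_text = pretrain(gold_interactions)

    synthetic = []                                       # I'
    for _ in range(k):
        goal = sample_goal()                             # sampled goal query G
        interaction = self_play(text_to_sql, sql_to_text, goal)
        predicted_final_sql = interaction.turns[-1].sql  # Q_T
        if score(predicted_final_sql, goal) > w:         # keep goal-grounded interactions
            synthetic.append(interaction)

    # Retrain both models on the union of gold and filtered synthetic data.
    text_to_sql, sql_to_text = retrain(gold_interactions + synthetic)
    return text_to_sql, sql_to_text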
3.2 Self-Play
We pretrain both the text-to-SQL and SQL-to-text models on the gold training data by minimizing the negative log likelihood:
$$\mathcal{L} = -\sum_{i=1}^{N} \sum_{j=1}^{V} \log p(y^i_j \mid y^i_1, y^i_2, \ldots, y^i_{j-1}), \tag{3}$$
where $N$ is the number of training examples, $V$ is the sequence length, and each $y^i_j$ is a token in the reference sequence.
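As a concrete reference for Eq. 3, the sketch below computes the summed token-level negative log-likelihood with PyTorch; when fine-tuning T5 with Hugging Face, passing labels to the model computes an equivalent (mean-reduced) cross-entropy loss internally.

import torch
import torch.nn.functional as F


def sequence_nll(logits: torch.Tensor, targets: torch.Tensor, pad_id: int) -> torch.Tensor:
    # logits: (batch, seq_len, vocab_size); targets: (batch, seq_len).
    # Sums -log p(y_j | y_1, ..., y_{j-1}) over all non-padding target tokens.
    return F.cross_entropy(
        logits.reshape(-1, logits.size(-1)),
        targets.reshape(-1),
        ignore_index=pad_id,   # skip padded positions
        reduction="sum",
    )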
With the models pretrained on the gold dialogues, we can generate synthetic interactions using self-play. First, we need to specify a SQL query as the eventual goal $G$ of the interaction. We adopt the query sampling method proposed in Zhong et al. (2021) for synthesizing a goal $G$. Zhong et al. (2021) first build and sample coarse SQL templates from the SQL queries in the training set by replacing the column and value mentions in the queries with typed slots. For example, SELECT T1.id, T2.name is converted to the template SELECT key1, text1. To adapt the models to an unseen environment, they sample an unseen database and fill in the typed slots with columns and values from the sampled database to form a new SQL query. We follow this approach to synthesize goals in new domains for cross-domain generalization. The complete sampling procedure is given in Appendix A.1. We concatenate the sampled goal $G$ with an empty context and the serialized schema as shown in Eq. 2 and feed it into the SQL-to-text model to produce the first user utterance. Then, the text-to-SQL model and SQL-to-text model can continue the interaction with Eq. 1 and Eq. 2 until the SQL-to-text model generates the stop-of-interaction symbol.
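A simplified sketch of this goal-sampling procedure is given below. It abstracts column and value mentions into typed slots via plain string replacement and fills a sampled template with columns and values from an unseen database; the real procedure of Zhong et al. (2021) tracks slot types and query structure more carefully, and propose_fillers is a hypothetical helper.

import random


def templatize(sql: str, slot_map: dict) -> str:
    # slot_map maps column/value mentions to typed slots,
    # e.g. {"T1.id": "key1", "T2.name": "text1"}.
    template = sql
    for mention, slot in slot_map.items():
        template = template.replace(mention, slot)
    return template   # "SELECT T1.id, T2.name" -> "SELECT key1, text1"


def fill_template(template: str, fillers: dict) -> str:
    # fillers maps typed slots back to columns/values drawn from the sampled
    # unseen database, e.g. {"key1": "airlines.uid", "text1": "airlines.airline"}.
    goal = template
    for slot, mention in fillers.items():
        goal = goal.replace(slot, mention)
    return goal


def sample_goal(templates: list, unseen_databases: list, propose_fillers) -> str:
    # Sample a coarse template and an unseen database, then fill the slots
    # to form a new goal SQL query G for self-play.
    template = random.choice(templates)
    database = random.choice(unseen_databases)
    return fill_template(template, propose_fillers(template, database))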