Augmenting Multi-Turn Text-to-SQL Datasets with Self-Play
Qi Liu1, Zihuiwen Ye2, Tao Yu1, Phil Blunsom2, Linfeng Song3
1The University of Hong Kong, 2University of Oxford, 3Tencent AI Lab, Bellevue, WA, USA
{liuqi, tyu}@cs.hku.hk; {zihuiwen.ye, phil.blunsom}@cs.ox.ac.uk; lfsong@tencent.com
*Equal contribution.
1Our code is available at: https://github.com/leuchine/self_play_picard
Abstract
The task of context-dependent text-to-SQL aims to convert multi-turn user utterances to formal SQL queries. This is a challenging task due to both the scarcity of training data from which to learn complex contextual dependencies and the need to generalize to unseen databases. In this paper we explore augmenting the training datasets using self-play, which leverages contextual information to synthesize new interactions to adapt the model to new databases. We first design a SQL-to-text model conditioned on a sampled goal query, which represents a user's intent, that then converses with a text-to-SQL semantic parser to generate new interactions. We then filter the synthesized interactions and retrain the models with the augmented data. We find that self-play improves the accuracy of a strong baseline on SParC and CoSQL, two widely used cross-domain text-to-SQL datasets. Our analysis shows that self-play simulates various conversational thematic relations, enhances cross-domain generalization and improves beam search.1
1 Introduction
Multi-turn text-to-SQL translation is a powerful semantic parsing paradigm that converts natural language user utterances into executable SQL queries in a conversational environment. Compared to regular text-to-SQL tasks such as Spider (Yu et al., 2018b) and GeoQuery (Zelle and Mooney, 1996), conversational text-to-SQL requires interpreting coreference and omission phenomena that frequently appear in human conversations. To be effective, text-to-SQL models must uncover complex contextual dependencies while grounding user utterances in task-specific database schemas.
Numerous architectures and pretraining methods have been proposed for tackling context-dependent text-to-SQL (Suhr et al., 2018; Zhang et al., 2019; Hui et al., 2021; Scholak et al., 2021; Yu et al., 2021; Xie et al., 2022). However, the size of the datasets used has been limited due to the high cost of annotating multi-turn dialogue and SQL pairs, which often requires trained experts. Existing multi-turn text-to-SQL datasets, such as SParC (Yu et al., 2019b) and CoSQL (Yu et al., 2019a), require text-to-SQL parsers to generalize to unseen databases at test time, but doing so is difficult with limited training context.
In this paper we propose the use of self-play to augment multi-turn text-to-SQL datasets in order to achieve more robust generalization. Self-play simulates interactions between multiple artificial agents in order to generate a training signal in addition to supervised data. It has been successfully applied in a wide range of tasks, e.g. board games (Silver et al., 2016, 2018) and multiplayer battle games (Vinyals et al., 2019; Berner et al., 2019). It has also been applied in dialogue simulations, during which a dialogue model converses with a user simulator to generate synthetic dialogues (Schatzmann et al., 2006; Gür et al., 2018; Tseng et al., 2021). In our work, we extend self-play to semantic parsing.
Although self-play has been adopted in task-oriented dialogue, the need to pre-define a domain-specific ontology of slot-value pairs (e.g. the slot value "price=expensive" for a restaurant booking) (Henderson et al., 2014; Wen et al., 2016; Budzianowski et al., 2018) prevents self-play from simulating interactions in a new domain. Adding a new domain for task-oriented dialogue is difficult and labor-intensive. On the other hand, text-to-SQL tasks (Yu et al., 2018b, 2019b,a) use a domain-independent formalism, i.e. SQL queries. We demonstrate that self-play is well-suited to simulating interactions in a new domain given a database schema, improving cross-domain generalization.
Figure 1: Multi-turn text-to-SQL with self-play. We transform an interaction from SParC on the left to seq2seq formats (top: text-to-SQL, bottom: SQL-to-text). User utterances, SQL queries, databases, and user goals are concatenated with a " | " symbol and shown in green, blue, yellow, and purple respectively. We use self-play to generate synthetic interactions. The synthetic interactions are filtered and used to retrain the text-to-SQL and SQL-to-text models. (The top panel shows the text-to-SQL input "current utterance | database | previous utterances" with the next SQL query as output; the bottom panel shows the SQL-to-text input "user goal | previous utterances | last SQL query | database" with the next user utterance as output.)

We use PICARD (Scholak et al., 2021) as the
base of our text-to-SQL model. When generating a new interaction, we first sample a SQL query with the method of Zhong et al. (2021) as the goal query and condition the SQL-to-text model on this sampled SQL. The text-to-SQL model converses with the SQL-to-text model to simulate a new interaction. We filter out the interactions that are not grounded to the sampled goals and employ self-training (Yarowsky, 1995; Zoph et al., 2020) to retrain the text-to-SQL model and the SQL-to-text model. We conduct extensive experiments on SParC and CoSQL. Our main findings are:
• Self-play helps the text-to-SQL model learn various conversational thematic relations (§5.3) and improves cross-domain generalization (§5.1).
• Self-play improves performance on the majority of SQL types. Models after self-play perform particularly well on queries of medium difficulty (§5.1).
• Self-play improves beam search. Models after self-play are less sensitive to the beam size and can perform well even with small beam sizes (§5.2).
2 Preliminary
In this section, we formally define the multi-turn text-to-SQL task and introduce the PICARD (Scholak et al., 2021) model, which we use as our baseline. PICARD obtains state-of-the-art results on several text-to-SQL tasks.
2.1 Task Definition
In context-dependent text-to-SQL tasks, we are given interactions between a user and a system. Each interaction spans multiple turns. The user ends the interaction when the query returns the required information from the database. Formally, at each turn $t$ (where $1 \le t \le T$), multi-turn text-to-SQL produces a valid and executable SQL query $Q_t$ given a database $D$, a current user utterance $U_t$, and a dialogue context $C_t$ (which is usually the previous user utterances $U_{<t}$):
$$p(Q_t \mid U_t, C_t, D). \tag{1}$$
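To make the formulation concrete, the following minimal Python sketch shows the per-turn prediction interface implied by Eq. 1; the class and method names (e.g. parser.generate) are hypothetical and not taken from the released code.

from dataclasses import dataclass, field
from typing import List


@dataclass
class Turn:
    utterance: str      # U_t: the user utterance at this turn
    sql: str = ""       # Q_t: the gold or predicted SQL query for this turn


@dataclass
class Interaction:
    schema: str                                      # D: serialized database schema
    turns: List[Turn] = field(default_factory=list)

    def context(self, t: int) -> List[str]:
        # C_t: the previous user utterances U_{<t}
        return [turn.utterance for turn in self.turns[:t]]


def predict_sql(parser, interaction: Interaction, t: int) -> str:
    # Model the conditional p(Q_t | U_t, C_t, D) of Eq. 1 with a seq2seq parser.
    u_t = interaction.turns[t].utterance
    c_t = interaction.context(t)
    return parser.generate(utterance=u_t, context=c_t, schema=interaction.schema)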
2.2 Baseline: PICARD
We use PICARD (Scholak et al., 2021) as our baseline conditional model for Equation 1. PICARD serializes the database schema $D$ into a sequence following Lin et al. (2020). An example of the input and output format is shown in Figure 1. PICARD finetunes T5 (Raffel et al., 2019), a sequence-to-sequence transformer, with input and output sequences. PICARD proposes an incremental parsing method for constrained decoding during beam search. Specifically, it rejects inadmissible tokens at each beam search step subject to parsing rules that encode lexical and grammatical constraints. Only the beam hypotheses that pass all the constraint checks are kept. PICARD also leverages SQL schema information, such as the column names of each table, to impose checks on the validity of the generated SQL. PICARD greatly reduces the likelihood of decoding invalid SQL queries.
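As an illustration of this setup, the sketch below serializes an input in the "current utterance | database | previous utterances" format of Figure 1 and uses Hugging Face's prefix_allowed_tokens_fn hook as a stand-in for constrained beam search. The is_admissible check is a hypothetical placeholder, "t5-base" stands in for a fine-tuned text-to-SQL checkpoint, and PICARD's actual incremental parser is substantially more complete and efficient than this vocabulary scan.

from transformers import T5ForConditionalGeneration, T5TokenizerFast

tokenizer = T5TokenizerFast.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base")


def serialize_input(utterance: str, schema: str, prev_utterances: list) -> str:
    # "current utterance | database | previous utterances" (Figure 1, top)
    return " | ".join([utterance, schema, *prev_utterances])


def is_admissible(prefix: str, token: str) -> bool:
    # Hypothetical placeholder for PICARD-style lexical/grammatical checks
    # (SQL keywords, table and column names from the schema, quoting, ...).
    return True


def prefix_allowed_tokens_fn(batch_id, input_ids):
    # Reject inadmissible continuations at each beam-search step.
    # (Scanning the full vocabulary here is for illustration only.)
    prefix = tokenizer.decode(input_ids, skip_special_tokens=True)
    allowed = [t for t in range(len(tokenizer))
               if is_admissible(prefix, tokenizer.convert_ids_to_tokens(t))]
    return allowed or list(range(len(tokenizer)))


source = serialize_input("Of these, which is Jetblue Airways?",
                         "flight_2 | airlines : id, airline, ...",
                         ["What are all the airlines?"])
inputs = tokenizer(source, return_tensors="pt")
outputs = model.generate(**inputs, num_beams=4, max_length=128,
                         prefix_allowed_tokens_fn=prefix_allowed_tokens_fn)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))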
3 Method
Here we introduce how we use self-play for data augmentation. We first design a SQL-to-text model (§3.1). Next, we describe how to use self-play to generate synthetic interactions (§3.2). Finally, we explain how we incorporate the generated data for self-training (§3.3).
3.1 The SQL-to-Text Model
We design a user simulator, which is a SQL-to-text model, to converse with the text-to-SQL model to generate synthetic interactions. Specifically, at each turn $t$ we would like the user simulator to produce a meaningful question that would naturally be asked by a human user. In each interaction, a user has a goal to achieve. We explicitly condition the SQL-to-text model on a user goal, $G$, to encourage the user simulator to ask questions that are grounded to this goal. Formally, the SQL-to-text model calculates the following conditional at each turn:
$$p(U_t \mid Q_{t-1}, C_t, G, D), \tag{2}$$
where the context $C_t$ contains the previous user utterances $U_{<t}$. During training, $G$ is the SQL query of the final turn $T$, i.e. $Q_T$. During inference we adopt Zhong et al. (2021) to sample a new goal query as shown in §3.2. We employ the seq2seq approach and parameterize the SQL-to-text model (Eq. 2) with T5. We concatenate the user goal $G$, the last SQL query $Q_{t-1}$, the previous user utterances $U_{<t}$, and the serialized schema $D$ to predict the next user utterance $U_t$. For example, one input would be: "user goal | previous utterances | last SQL query | serialized database". Its target label is the correct user utterance for the next turn. We pad the last utterance with a special stop-of-interaction symbol. In SQL-to-text, there could be multiple reasonable questions to ask for the next turn, i.e. a one-to-many relation. A well-trained SQL-to-text model can generate new questions, thereby increasing the diversity of user dialogue flows in the dataset and improving generalization.
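A small sketch of this serialization is shown below; the exact delimiter handling and the name of the stop-of-interaction token are assumptions for illustration rather than the paper's exact choices.

STOP_SYMBOL = "<stop-of-interaction>"   # assumed token name


def serialize_sql_to_text_input(goal_sql: str, prev_utterances: list,
                                last_sql: str, schema: str) -> str:
    # "user goal | previous utterances | last SQL query | serialized database"
    return " | ".join([goal_sql, " ".join(prev_utterances), last_sql, schema])


def make_target(next_utterance: str, is_last_turn: bool) -> str:
    # The final user utterance is padded with the stop symbol so the simulator
    # learns when to end an interaction during self-play.
    return f"{next_utterance} {STOP_SYMBOL}" if is_last_turn else next_utterance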
Algorithm 2: Self-training.
Input: Gold interactions I, number of iterations k for synthetic data generation, threshold w.
Output: A text-to-SQL model and a SQL-to-text model.
Pretrain a text-to-SQL model p(Q_t | U_t, C_t, D) and a SQL-to-text model p(U_t | Q_{t-1}, C_t, G, D) on I.
I' ← ∅
for i in (1, ..., k) do
    Sample a goal query G.
    Generate a synthetic interaction I_S by self-play between text-to-SQL and SQL-to-text.
    Calculate score(Q_T, G) on I_S.
    if score(Q_T, G) > w then
        Add I_S to I'.
Retrain p(Q_t | U_t, C_t, D) and p(U_t | Q_{t-1}, C_t, G, D) on I ∪ I'.
return the retrained text-to-SQL model and the SQL-to-text model.
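The following Python skeleton mirrors Algorithm 2 above; pretrain, retrain, sample_goal, self_play, and score are placeholders for the components described in this section, not functions from the released code.

def self_training(gold_interactions, k, w,
                  pretrain, retrain, sample_goal, self_play, score):
    # Pretrain the text-to-SQL and SQL-to-text models on the gold interactions I.
    text_to_sql, sql_to_text = pretrain(gold_interactions)

    synthetic = []                                       # I'
    for _ in range(k):
        goal = sample_goal()                             # sampled goal query G
        interaction = self_play(text_to_sql, sql_to_text, goal)
        predicted_final_sql = interaction.turns[-1].sql  # Q_T
        if score(predicted_final_sql, goal) > w:         # keep goal-grounded interactions
            synthetic.append(interaction)

    # Retrain both models on the union of gold and filtered synthetic data.
    text_to_sql, sql_to_text = retrain(gold_interactions + synthetic)
    return text_to_sql, sql_to_text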
3.2 Self-Play
We pretrain both the text-to-SQL and SQL-to-text models on the gold training data by minimizing the negative log likelihood:
$$\mathcal{L} = -\sum_{i=1}^{N} \sum_{j=1}^{V} \log p(y^i_j \mid y^i_1, y^i_2, \ldots, y^i_{j-1}), \tag{3}$$
where $N$ is the number of training examples, $V$ is the sequence length, and each $y^i_j$ is a token in the reference sequence.
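As a concrete reference for Eq. 3, the sketch below computes the summed token-level negative log-likelihood with PyTorch; when fine-tuning T5 with Hugging Face, passing labels to the model computes an equivalent (mean-reduced) cross-entropy loss internally.

import torch
import torch.nn.functional as F


def sequence_nll(logits: torch.Tensor, targets: torch.Tensor, pad_id: int) -> torch.Tensor:
    # logits: (batch, seq_len, vocab_size); targets: (batch, seq_len).
    # Sums -log p(y_j | y_1, ..., y_{j-1}) over all non-padding target tokens.
    return F.cross_entropy(
        logits.reshape(-1, logits.size(-1)),
        targets.reshape(-1),
        ignore_index=pad_id,   # skip padded positions
        reduction="sum",
    )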
With the models pretrained on the gold dialogues, we can generate synthetic interactions using self-play. First, we need to specify a SQL query as the eventual goal $G$ of the interaction. We adopt the query sampling method proposed in Zhong et al. (2021) for synthesizing a goal $G$. Zhong et al. (2021) first build and sample coarse SQL templates from the SQL queries in the training set by replacing the column and value mentions in the queries with typed slots. For example, SELECT T1.id, T2.name is converted to the template SELECT key1, text1. To adapt the models to an unseen environment, they sample an unseen database and fill in the typed slots with columns and values from the sampled database to form a new SQL query. We follow this approach to synthesize goals in new domains for cross-domain generalization. The complete sampling procedure is given in Appendix A.1. We concatenate the sampled goal $G$ with an empty context and the serialized schema as shown in Eq. 2 and feed it into the SQL-to-text model to produce the first user utterance. Then, the text-to-SQL model and SQL-to-text model can continue the interaction with Eq. 1 and Eq. 2 until the SQL-to-text model generates the stop-of-interaction symbol.
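A simplified sketch of this goal-sampling procedure is given below. It abstracts column and value mentions into typed slots via plain string replacement and fills a sampled template with columns and values from an unseen database; the real procedure of Zhong et al. (2021) tracks slot types and query structure more carefully, and propose_fillers is a hypothetical helper.

import random


def templatize(sql: str, slot_map: dict) -> str:
    # slot_map maps column/value mentions to typed slots,
    # e.g. {"T1.id": "key1", "T2.name": "text1"}.
    template = sql
    for mention, slot in slot_map.items():
        template = template.replace(mention, slot)
    return template   # "SELECT T1.id, T2.name" -> "SELECT key1, text1"


def fill_template(template: str, fillers: dict) -> str:
    # fillers maps typed slots back to columns/values drawn from the sampled
    # unseen database, e.g. {"key1": "airlines.uid", "text1": "airlines.airline"}.
    goal = template
    for slot, mention in fillers.items():
        goal = goal.replace(slot, mention)
    return goal


def sample_goal(templates: list, unseen_databases: list, propose_fillers) -> str:
    # Sample a coarse template and an unseen database, then fill the slots
    # to form a new goal SQL query G for self-play.
    template = random.choice(templates)
    database = random.choice(unseen_databases)
    return fill_template(template, propose_fillers(template, database))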