STAR: SQL Guided Pre-Training for Context-dependent Text-to-SQL Parsing

Zefeng Cai1,2,∗, Xiangyu Li1,2,∗, Binyuan Hui3, Min Yang2,†, Bowen Li3, Binhua Li3, Zheng Cao3, Weijie Li3, Fei Huang3, Luo Si3, Yongbin Li3,†

1University of Science and Technology of China
2Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences
3DAMO Academy, Alibaba Group
{zf.cai, xy.li3, min.yang}@siat.ac.cn
{binyuan.hby, binhua.lbh, shuide.lyb}@alibaba-inc.com

∗Equal contribution. †Corresponding authors.
Abstract

In this paper, we propose STAR, a novel SQL-guided pre-training framework for context-dependent text-to-SQL parsing, which leverages contextual information to enrich natural language (NL) utterance and table schema representations for text-to-SQL conversations. Concretely, we propose two novel pre-training objectives that respectively explore the context-dependent interactions of NL utterances and SQL queries within each text-to-SQL conversation: (i) a schema state tracking (SST) objective that tracks and explores the schema states of context-dependent SQL queries by predicting and updating the value of each schema slot during the interaction; (ii) an utterance dependency tracking (UDT) objective that employs weighted contrastive learning to pull together the representations of semantically similar NL utterances and push apart those of semantically dissimilar NL utterances within each conversation. In addition, we construct a high-quality, large-scale context-dependent text-to-SQL conversation corpus to pre-train STAR. Extensive experiments show that STAR achieves new state-of-the-art performance on two downstream benchmarks (SParC and CoSQL), significantly outperforming previous pre-training methods and ranking first on the leaderboard. We believe the release of the constructed corpus, codebase, and pre-trained STAR checkpoints will push forward research in this area. For reproducibility, we release our code and data at https://github.com/AlibabaResearch/DAMO-ConvAI/tree/main/star.
1 Introduction
Figure 1: An example of a cross-domain context-dependent text-to-SQL conversation. Each database schema lists the table/column names of a database, and each schema state is a slot-value pair whose slot is a column/table name (e.g., Degrees.campus) and whose value is a SQL keyword (e.g., SELECT). The "✗" mark indicates that the user intent switches between the Turn 2 and Turn 3 utterances.
Text-to-SQL parsing (Zhong et al., 2017; Yu et al., 2018; Wang et al., 2022; Qin et al., 2022b) aims to translate natural language (NL) questions into executable SQL queries, enabling users who are unfamiliar with SQL to query databases in natural language. Pre-trained language models (PLMs) have proved powerful for text-to-SQL parsing and yield impressive performance, benefiting from the rich linguistic knowledge captured from large-scale corpora. However, as revealed in previous work (Yin et al., 2020; Yu et al., 2021a; Qin et al., 2022a), there is an intrinsic discrepancy between the distributions of tables and plain text, leading to sub-optimal performance for general PLMs such as BERT (Devlin et al., 2019), RoBERTa (Liu et al., 2019), and ELECTRA (Clark et al., 2020). Recently, several studies (Yu et al., 2021a,b; Shi et al., 2021; Deng et al., 2021; Liu et al., 2021a,b) alleviate this limitation by designing tailored tabular language models (TaLMs) for text-to-SQL parsing, which encode NL questions and tables simultaneously.
Despite the remarkable progress of previous TaLMs, they still face technical challenges in the context-dependent setting. First, existing TaLMs merely explore contextual information to enrich utterance representations, without considering the interaction states determined by history SQL queries, which are relevant to the user intent of the current utterance. Yet tracking and using historical SQL information can contribute greatly to modeling the current SQL query, since SQL conveys user intent in a compact and precise manner. As shown in Figure 1, the second SQL query is more likely to select contents from the "Campuses" table because the first SQL query already mentioned that table. Although tracking schema states is essential for keeping track of user requests in context-dependent text-to-SQL parsing, how to model, track, and utilize schema states throughout a conversation has not been explored by previous TaLMs. Second, context-dependent text-to-SQL parsing needs to process context information effectively so as to help the system parse the current NL utterance, since users may omit previously mentioned entities and constraints or substitute for what has already been stated. Taking Figure 1 as an example, the second utterance omits the implicit constraint "campuses in year 2000" mentioned in the first utterance. However, most prior TaLMs primarily model stand-alone NL utterances without considering context-dependent interactions, which results in sub-optimal performance. Although SCoRe (Yu et al., 2021b) models the turn contextual switch by predicting a context-switch label between two consecutive user utterances, it ignores the complex interactions among context utterances and cannot track dependencies between distant utterances. For instance, in Figure 1, SCoRe fails to capture the long-term dependency between the first and fourth utterances because there is a switch between the second and third utterances.
In this paper, we propose STAR, a novel pre-training framework for context-dependent text-to-SQL parsing, which explores the multi-turn interactions of NL utterances and SQL queries within each conversation. First, we propose a schema state tracking (SST) objective that keeps track of SQL queries in the form of schema states, predicting the value (a SQL keyword) of each schema slot of the current SQL query given the schema-state representation of the previously predicted SQL query. By introducing schema states to represent SQL queries, we can better capture the alignment between historical and current SQL queries, especially for long and complex ones. Second, we propose an utterance dependency tracking (UDT) objective to capture the complex semantic dependencies among sequential NL questions, which employs weighted contrastive learning to pull together semantically similar NL utterances and push apart dissimilar ones within each conversation. A key insight is that utterances corresponding to similar SQL queries are more semantically relevant, since SQL is a highly structured indication of user intent. Concretely, we propose two novel similarity functions (SQL semantic similarity and SQL structure similarity) to comprehensively construct appropriate positive and negative NL question pairs.
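To make the weighted contrastive idea concrete, the sketch below weights an InfoNCE-style objective by SQL-derived similarity scores; it is a minimal illustration, not STAR's exact loss. The soft-target formulation, the temperature value, and the single `sql_sim` matrix (standing in for the paper's two SQL-oriented similarity functions) are all assumptions for exposition.

```python
import torch
import torch.nn.functional as F

def weighted_contrastive_loss(utt_emb: torch.Tensor,
                              sql_sim: torch.Tensor,
                              temperature: float = 0.1) -> torch.Tensor:
    """Similarity-weighted InfoNCE over the T utterances of one conversation.

    utt_emb: (T, d) utterance representations.
    sql_sim: (T, T) SQL-derived similarities in [0, 1]; large values mark
             semantically close pairs, small values dissimilar ones.
    """
    z = F.normalize(utt_emb, dim=-1)
    eye = torch.eye(len(z), dtype=torch.bool, device=z.device)
    # Cosine-similarity logits with self-pairs excluded.
    logits = (z @ z.t() / temperature).masked_fill(eye, float("-inf"))
    # Soft targets: each utterance should match the others in proportion
    # to how similar their SQL queries are.
    targets = sql_sim.masked_fill(eye, 0.0)
    targets = targets / targets.sum(dim=-1, keepdim=True).clamp(min=1e-8)
    log_probs = F.log_softmax(logits, dim=-1).masked_fill(eye, 0.0)
    return -(targets * log_probs).sum(dim=-1).mean()

# Toy usage: four turns, random embeddings, hypothetical similarity matrix.
loss = weighted_contrastive_loss(torch.randn(4, 768), torch.rand(4, 4))
```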
We summarize our main contributions as follows. (1) To the best of our knowledge, we are the first to propose a schema state tracking (SST) objective for context-dependent TaLMs, which tracks and updates the schema states of context-dependent SQL queries. (2) We propose an utterance dependency tracking (UDT) objective to capture the complex semantic information of sequential NL questions, which employs weighted contrastive learning with two novel SQL-oriented similarity functions to pull together the representations of semantically similar NL utterances and push apart those of dissimilar NL utterances within each conversation. (3) We construct a high-quality, large-scale context-dependent text-to-SQL conversation corpus to pre-train STAR. Experiments show that STAR achieves new state-of-the-art performance on two downstream benchmarks (SParC and CoSQL) and ranks first on the leaderboard.
2 Task Definition

In this section, we provide the formal task definition for context-dependent text-to-SQL parsing. Let $U = \{u_1, \dots, u_T\}$ denote the utterances in a context-dependent text-to-SQL conversation with $T$ turns, where $u_i$ represents the $i$-th NL question. Each NL question $u_i$ contains $n_i$ tokens, denoted as $u_i = [w_1, \dots, w_{n_i}]$. In addition, there is a corresponding database schema $s$, which consists of $N$ tables $\{T_i\}_{i=1}^{N}$; the total number of columns across all tables in the schema is $m$. We use $s_i$ to denote the name of the $i$-th item in schema $s$. At the current turn $t$, the goal of text-to-SQL parsing is to generate the SQL query $o_t$ given the current utterance $u_t$, the historical utterances $\{u_1, \dots, u_{t-1}\}$, the schema $s$, and the last predicted SQL query $o_{t-1}$. STAR primarily consists of a stack of Transformer layers, which converts a sequence of $L$ input tokens $x = [x_1, \dots, x_L]$ into a sequence of contextualized vector representations $h = [h_1, \dots, h_L]$.
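To ground the notation, the following sketch shows one plausible set of data structures for a conversation and a naive flattening of the inputs available at turn $t$. The class names, the `parser_input` helper, and the [CLS]/[SEP] layout are illustrative assumptions, not a format prescribed by the paper.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Turn:
    utterance: str   # u_i, the i-th NL question
    sql: str         # o_i, the SQL query for that turn

@dataclass
class Conversation:
    turns: List[Turn]    # u_1 ... u_T with their SQL queries
    schema: List[str]    # names s_1 ... s_m of the schema items

def parser_input(conv: Conversation, t: int) -> str:
    """Flatten what is available when predicting o_t: the current utterance
    u_t, the history u_1 ... u_{t-1}, and the schema s. The last SQL query
    o_{t-1} is handled separately, via schema states (Section 3.1)."""
    utterances = " [SEP] ".join(turn.utterance for turn in conv.turns[:t])
    return "[CLS] " + utterances + " [SEP] " + " [SEP] ".join(conv.schema)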
!"#$ %&'&()
STAR
Last Schema State
(a) Utterance Dependency Tracking (b) Schema State Tracking
Current Schema State
pull pull
push
push
push
*;,-%&'&()-??-??
*N,-%&'&()-??-??
*O,-%&'&()-!"#$-/012-3$"45$6-107&0-89-".$-':2:)-;
*+,-%&'&()-".$-/012-3$"45$6-107&0-89-".$-':2:)-;
SQL Similarity
Structure Metric
Semantic Metric
SQL
Weighted Contrastive Learning
Similarity
Calculate
History Question
(u3u2u1)
Current Question
(u4)
"<<6$== >1>& ".$ 107&0-89-':2:)
!"#$ >1>& "<<6$== >1>& ".$ %&'&()-107&0-89-':2:)
?? ??
??
@('%A-B5"3-C=-5C=-".$-D-B5*-C=-35$-E*F!.$=3-3$"45$6-G-D-B5"3-?-D-B5*-?-@%&HA
u4 u3 u2 u1
%45$#"I=3"3$-%J*3
%45$#"I=3"3$-K"JF$
LF$=M*!
%L'
Figure 2: The overview of the proposed STAR framework consisting of two novel pre-training objectives: (a) the
utterance dependency tracking and (b) the schema state tracking. For brevity, we do not show the masked language
modeling objective here.
into a sequence of contextualized vector represen-
tations h= [h1,...,hL].
3 Pre-training Objectives

As illustrated in Figure 2, we propose two novel pre-training objectives, SST (Schema State Tracking) and UDT (Utterance Dependency Tracking), to explore the complex context interactions of NL utterances and SQL queries within each text-to-SQL conversation, respectively. In addition, we employ the MLM (Masked Language Modeling) objective to help learn better contextual representations of the conversations. Next, we introduce the pre-training objectives in detail.
3.1 Schema State Tracking

Using contextual SQL information contributes greatly to modeling the current SQL query. Inspired by dialogue state tracking (Ouyang et al., 2020; Wang et al., 2021a), which keeps track of user intentions in the form of a set of dialogue states (i.e., slot-value pairs) in task-oriented dialogue systems, we propose a schema state tracking (SST) objective that keeps track, in a self-supervised manner, of the schema states (or user requests) of context-dependent SQL queries, aiming to predict the values of the schema slots. Concretely, we track the interaction states of the text-to-SQL conversation in the form of schema states whose slots are the column names of all tables in the schema and whose values are SQL keywords. Taking the SQL query in Figure 3 as an example, the value of the schema slot [cars_data] is the SQL keyword [SELECT].
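As a toy illustration of how schema states can be read off a SQL query, the sketch below maps each schema item to the keyword of the clause in which it first occurs, using a naive string scan. A real implementation would traverse a proper SQL parse; the clause list and substring matching here are simplifying assumptions.

```python
import re
from typing import Dict, List

CLAUSES = ["SELECT", "FROM", "WHERE", "GROUP BY", "HAVING", "ORDER BY", "LIMIT"]

def schema_states(sql: str, schema: List[str]) -> Dict[str, str]:
    """Map every schema item to the keyword of the clause it appears in,
    or NONE if it is absent from the query (naive, substring-based)."""
    states = {item: "NONE" for item in schema}
    # Split the query into alternating keyword / clause-body segments.
    segments = re.split("(" + "|".join(CLAUSES) + ")", sql, flags=re.IGNORECASE)
    keyword = "NONE"
    for segment in segments:
        if segment.upper() in CLAUSES:
            keyword = segment.upper()
            continue
        for item in schema:
            if states[item] == "NONE" and item.lower() in segment.lower():
                states[item] = keyword
    return states

# Prints {'campus': 'SELECT', 'county': 'NONE', 'year': 'WHERE'}
print(schema_states("SELECT campus FROM Campuses WHERE year = 2000",
                    ["campus", "county", "year"]))
```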
Formally, we first convert the last predicted SQL query $o_{t-1}$ into a set of schema states. Since the slots of the schema states are the names of all schema items, the values of the schema states whose slots do not appear in the last SQL query $o_{t-1}$ are set to [NONE], as shown in Figure 3. We represent the SQL query $o_{t-1}$ with $m$ schema states $\{(s_{t-1}^i, v_{t-1}^i)\}_{i=1}^{m}$, where $s_{t-1}^i$ denotes the schema-state slot, $v_{t-1}^i$ denotes the schema-state value of the slot $s_{t-1}^i$, and $m$ represents the number of schema items. At the $t$-th turn, the goal of SST is to predict the value $v_t^i$ of each schema-state slot $s_t^i$ of the $t$-th SQL query given all the history utterances $\{u_1, \dots, u_{t-1}\}$, the current utterance $u_t$, and the schema states $\{(s_{t-1}^i, v_{t-1}^i)\}_{i=1}^{m}$ of the last query $o_{t-1}$. That is, at the $t$-th turn, the input $I_t$ of the SST task is:

$$I_t = \{u_1, \dots, u_t\};\ \{(s_{t-1}^i, v_{t-1}^i)\}_{i=1}^{m} \quad (1)$$

Note that the SQL queries within a conversation share the same schema $s$; thus the schema states of the $t$-th and $(t-1)$-th SQL queries have the same schema-state slots (i.e., $s_{t-1}^i = s_t^i = s^i$).
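A minimal sketch of assembling the SST input of Eq. (1) follows; the "slot = value" serialization and the special tokens are assumed for illustration and need not match the exact sequence layout used to pre-train STAR.

```python
from typing import Dict, List

def sst_input(utterances: List[str], prev_states: Dict[str, str]) -> str:
    """Serialize I_t from Eq. (1): the utterances u_1 ... u_t together with
    the schema states {(s_i, v_i)} of the last SQL query o_{t-1}."""
    context = " [SEP] ".join(utterances)
    states = " [SEP] ".join(f"{slot} = {value}"
                            for slot, value in prev_states.items())
    return f"[CLS] {context} [SEP] {states}"

# Turn 2 of Figure 1: the previous states come from the turn-1 SQL query.
x = sst_input(
    ["Can you show me campuses in year 2000?",
     "Can you also show me county after year 2000?"],
    {"Campuses.campus": "SELECT",
     "Campuses.year": "WHERE",
     "Campuses.county": "NONE"},
)
```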
Since each schema state $c_{t-1}^i = (s_{t-1}^i, v_{t-1}^i)$ contains multiple words, we apply an attentive layer to obtain the representation of $c_{t-1}^i$. Concretely, given the output contextualized representations $h_t^{c_{t-1}^i} = [h_t^l, \dots, h_t^{l+|c_{t-1}^i|-1}]$ (where $l$ is the start index of $c_{t-1}^i$) of each schema state $c_{t-1}^i$, the attentive schema-state representation $c_{t-1}^i$ …
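One standard form of such an attentive layer, sketched below under the assumption of a simple learned scoring vector (the paper's exact parameterization may differ), scores each token of the schema state and returns the attention-weighted sum of the token vectors.

```python
import torch
import torch.nn as nn

class AttentivePooling(nn.Module):
    """Collapse the token vectors of one schema state c = (s, v) into a
    single vector via learned attention weights."""
    def __init__(self, hidden_size: int):
        super().__init__()
        self.scorer = nn.Linear(hidden_size, 1)

    def forward(self, token_states: torch.Tensor) -> torch.Tensor:
        # token_states: (|c|, hidden) = [h_l, ..., h_{l + |c| - 1}]
        weights = torch.softmax(self.scorer(token_states), dim=0)  # (|c|, 1)
        return (weights * token_states).sum(dim=0)                 # (hidden,)

pool = AttentivePooling(hidden_size=768)
c_repr = pool(torch.randn(5, 768))  # e.g., a 5-token schema state
```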