
Towards Generalizable and Robust Text-to-SQL Parsing∗
Chang Gao1, Bowen Li2, Wenxuan Zhang2, Wai Lam1†, Binhua Li2,
Fei Huang2, Luo Si2 and Yongbin Li2†
1The Chinese University of Hong Kong
2DAMO Academy, Alibaba Group
{gaochang,wlam}@se.cuhk.edu.hk, libowen.ne@gmail.com
{saike.zwx,binhua.lbh,shuide.lyb}@alibaba-inc.com
Abstract
Text-to-SQL parsing tackles the problem of
mapping natural language questions to exe-
cutable SQL queries. In practice, text-to-SQL
parsers often encounter various challenging
scenarios, requiring them to be generalizable
and robust. While most existing work addresses
a particular generalization or robustness
challenge, we aim to study these challenges in a
more comprehensive manner. Specifically, we
believe that text-to-SQL parsers should be (1)
generalizable at three levels of generalization,
namely i.i.d., zero-shot, and compositional,
and (2) robust against input perturbations.
To enhance these capabilities of the
parser, we propose a novel TKK framework
consisting of Task decomposition, Knowledge
acquisition, and Knowledge composition to
learn text-to-SQL parsing in stages. By divid-
ing the learning process into multiple stages,
our framework improves the parser’s ability
to acquire general SQL knowledge instead of
capturing spurious patterns, making it more
generalizable and robust. Experimental re-
sults under various generalization and robust-
ness settings show that our framework is ef-
fective in all scenarios and achieves state-
of-the-art performance on the Spider, SParC,
and CoSQL datasets. Code can be found
at https://github.com/AlibabaResearch/
DAMO-ConvAI/tree/main/tkk.
1 Introduction
Text-to-SQL parsing aims to translate natural lan-
guage questions to SQL queries that can be exe-
cuted on databases to produce answers (Lin et al.,
2020), which bridges the gap between expert pro-
grammers and ordinary users who are not proficient
in writing SQL queries. Thus, it has drawn great
attention in recent years (Zhong et al., 2017; Suhr
et al., 2020; Scholak et al., 2021; Hui et al., 2022;
Qin et al., 2022a,b).
∗ Work done when Chang Gao was an intern at Alibaba.
The work described in this paper is substantially supported
by a grant from the Research Grant Council of the Hong
Kong Special Administrative Region, China (Project Code:
14204418).
† Corresponding authors.
Early work in this field (Zelle and Mooney, 1996;
Yaghmazadeh et al., 2017; Iyer et al., 2017) mainly
focuses on i.i.d. generalization. Such work uses only
a single database, and the exact same target SQL
query may appear in both the training and test sets.
However, it is difficult to collect sufficient training
data to cover all the questions users may ask (Gu
et al., 2021), and the predictions for test examples
might be obtained by semantic matching instead
of semantic parsing (Yu et al., 2018b), limiting the
generalization ability of parsers. Subsequent work
further focuses on generalizable text-to-SQL pars-
ing in terms of two aspects: zero-shot generaliza-
tion and compositional generalization. Zero-shot
generalization requires the parser to generalize to
unseen database schemas. Thanks to large-scale
datasets such as Spider (Yu et al., 2018b), SParC
(Yu et al., 2019b), and CoSQL (Yu et al., 2019a),
zero-shot generalization has been the most popu-
lar setting for text-to-SQL parsing in recent years.
Various methods, including graph-based
encoders (Wang et al., 2020; Cao et al., 2021) and
syntax tree decoders (Yu et al., 2018a; Rubin and
Berant, 2021), have been developed to tackle this
challenge. Compositional generalization is the de-
sired ability to generalize to test examples consist-
ing of novel combinations of components observed
during training. Finegan-Dollak et al. (2018) ex-
plore compositional generalization in text-to-SQL
parsing focusing on template-based query splits.
Shaw et al. (2021) provide new splits of Spider
considering length, query template, and query com-
pound divergence to create challenging evaluations
of compositional generalization.
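The three levels of generalization described above can be illustrated with a small sketch. It is not taken from the paper or from any of the cited benchmarks: the helper names (sql_template, split_overlap), the toy databases, and the literal-masking rule are all assumptions made for illustration, and real template-based splits typically anonymize queries more carefully.

```python
import re


def sql_template(sql: str) -> str:
    """Mask string and numeric literals so that queries differing only in
    values map to the same template (a simplified, assumed rule)."""
    masked = re.sub(r"'[^']*'|\"[^\"]*\"", "<val>", sql)
    masked = re.sub(r"\b\d+(\.\d+)?\b", "<val>", masked)
    return " ".join(masked.lower().split())


def split_overlap(train, test):
    """Report what a test set shares with the training set.
    Each example is a (db_id, question, sql) tuple."""
    train_dbs = {db for db, _, _ in train}
    train_templates = {sql_template(sql) for _, _, sql in train}
    return {
        "shared_databases": any(db in train_dbs for db, _, _ in test),
        "shared_templates": any(
            sql_template(sql) in train_templates for _, _, sql in test
        ),
    }


# Hypothetical toy data; database and table names are made up.
train = [
    ("concert", "How many singers are older than 30?",
     "SELECT count(*) FROM singer WHERE age > 30"),
    ("concert", "List the names of singers from France.",
     "SELECT name FROM singer WHERE country = 'France'"),
]

# i.i.d.: same database, and the query template was seen in training.
iid_test = [
    ("concert", "How many singers are older than 40?",
     "SELECT count(*) FROM singer WHERE age > 40"),
]

# Zero-shot: the database schema is unseen during training.
zero_shot_test = [
    ("pets", "How many pets weigh more than 10?",
     "SELECT count(*) FROM pets WHERE weight > 10"),
]

# Compositional: seen database and seen components (count, country filter),
# but combined into a template that never occurs in training.
compositional_test = [
    ("concert", "How many singers are from France?",
     "SELECT count(*) FROM singer WHERE country = 'France'"),
]

iid_report = split_overlap(train, iid_test)
zero_shot_report = split_overlap(train, zero_shot_test)
compositional_report = split_overlap(train, compositional_test)
```

In this toy setting, the i.i.d. test set shares both its database and its template with training, the zero-shot test set shares neither, and the compositional test set shares only the database. Template overlap is only a rough proxy here; the length and compound-divergence splits mentioned above are not covered by this sketch.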
Another challenge of conducting text-to-SQL
parsing in practice is robustness. Existing text-to-
SQL models have been found vulnerable to input
perturbations (Deng et al., 2021; Gan et al., 2021a;
arXiv:2210.12674v1 [cs.CL] 23 Oct 2022