Unveiling the Black Box of PLMs with Semantic Anchors:
Towards Interpretable Neural Semantic Parsing
Lunyiu Nie1*†, Jiuding Sun1*, Yanlin Wang2‡, Lun Du3,
Lei Hou1, Juanzi Li1, Shi Han3, Dongmei Zhang3, Jidong Zhai1
1Department of Computer Science and Technology, Tsinghua University
2School of Software Engineering, Sun Yat-sen University 3Microsoft Research Asia
{nlx20, sjd22}@mails.tsinghua.edu.cn, wangylin36@mail.sysu.edu.cn,
{lun.du, shihan, dongmeiz}@microsoft.com, {houlei, lijuanzi, zhaijidong}@tsinghua.edu.cn
*These authors contributed equally. †Work done during internship at Microsoft Research Asia.
‡Yanlin Wang is the corresponding author. Work done during the author's employment at Microsoft Research Asia.
Abstract
The recent prevalence of pretrained language models (PLMs) has dramatically shifted the paradigm of semantic parsing, where the mapping from natural language utterances to structured logical forms is now formulated as a Seq2Seq task. Despite the promising performance, previous PLM-based approaches often suffer from hallucination problems due to their negligence of the structural information contained in the sentence, which essentially constitutes the key semantics of the logical forms. Furthermore, most works treat the PLM as a black box in which the generation process of the target logical form is hidden beneath the decoder modules, which greatly hinders the model's intrinsic interpretability. To address these two issues, we propose to augment current PLMs with a hierarchical decoder network. Taking the first-principle structures as semantic anchors, we propose two novel intermediate supervision tasks, namely Semantic Anchor Extraction and Semantic Anchor Alignment, for training the hierarchical decoders and probing the model's intermediate representations in a self-adaptive manner alongside the fine-tuning process. We conduct intensive experiments on several semantic parsing benchmarks and demonstrate that our approach consistently outperforms the baselines. More importantly, by analyzing the intermediate representations of the hierarchical decoders, our approach also takes a significant step toward the intrinsic interpretability of PLMs in the domain of semantic parsing.
1 Introduction
Semantic parsing refers to the task of converting natural language utterances into machine-executable logical forms (Kamath and Das 2019). With the rise of pretrained language models (PLMs) in natural language processing, most recent works in the field formulate semantic parsing as a Seq2Seq task and develop neural semantic parsers on top of
the latest PLMs such as T5 (Raffel et al. 2020), BART (Lewis et al. 2020), and GPT-3 (Brown et al. 2020), which significantly reduces the manual effort needed to design compositional grammars (Liang, Jordan, and Klein 2011; Zettlemoyer and Collins 2005). By leveraging the extensive knowledge learned from the pretraining corpora, these PLM-based models exhibit strong performance in comprehending the semantics underlying the source natural language utterance and generating a target logical form that adheres to specific syntactic structures (Shin and Van Durme 2022; Yin et al. 2022).
Despite the promising performance, current PLM-based approaches mostly regard both input and output as plain text sequences and neglect the structural information contained in the sentences (Yin et al. 2020; Shi et al. 2021), such as the database (DB) or knowledge base (KB) schema that essentially constitutes the key semantics of the target SQL or SPARQL logical forms. As a result, these PLM-based models often suffer from the hallucination issue (Ji et al. 2022) and may generate incorrect logical form structures that are unfaithful to the input utterance (Nicosia, Qu, and Altun 2021; Gupta et al. 2022). For example, as shown in Figure 1, the PLM mistakenly generates the relationship "product" in the SPARQL query, which contradicts the phrase "company produced" in the natural language utterance.
To prevent PLMs from generating hallucinated structures, many works propose execution-guided decoding strategies (Wang et al. 2018; Wang, Lapata, and Titov 2021; Ren et al. 2021) and grammar-constrained decoding algorithms (Shin et al. 2021; Scholak, Schucher, and Bahdanau 2021). However, manipulating the decoding process with conditional branches can significantly slow down model inference (Post and Vilar 2018; Hui et al. 2021). More importantly, in these methods the DB/KB schema is employed extrinsically as an a posteriori correction after model fine-tuning, while the PLMs' inherent ignorance of logical form structures remains unaddressed.
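To make the per-step overhead concrete, below is a minimal sketch of grammar-constrained greedy decoding in Python. Here, score_next and is_valid_continuation are hypothetical stand-ins for a PLM's next-token scorer and a grammar/schema validity check; the loop is illustrative only and not the exact algorithms of the works cited above.

from typing import Callable, Dict, List

def constrained_greedy_decode(
    score_next: Callable[[List[str]], Dict[str, float]],
    is_valid_continuation: Callable[[List[str], str], bool],
    max_len: int = 64,
) -> List[str]:
    # Greedy decoding where every candidate token is checked against a
    # grammar/schema before it can be emitted.
    output: List[str] = []
    for _ in range(max_len):
        scores = score_next(output)  # hypothetical next-token scores from the PLM
        # Per-step filtering: this conditional branching over the candidate set
        # is the extra work that slows down constrained inference.
        valid = {tok: s for tok, s in scores.items()
                 if is_valid_continuation(output, tok)}
        if not valid:
            break
        best = max(valid, key=valid.get)
        output.append(best)
        if best == "</s>":
            break
    return output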
Therefore, another concurrent line of work further pretrains the PLMs with structure-augmented objectives (Herzig et al. 2020; Deng et al. 2021).
Utterance: Which company produced the TV series Game of Thrones?
Gold SQL: SELECT t.production_company FROM "Television Series" as t WHERE t.name = "Game of Thrones"
Gold SPARQL: SELECT DISTINCT ?e_1 WHERE { ?e <instance_of> ?c . ?c <name> "television series" . ?e <name> "Game of Thrones" . ?e <production_company> ?e_1 . }
Pred SPARQL: SELECT DISTINCT ?e_1 WHERE { ?e <instance_of> ?c . ?c <name> "television series" . ?e <name> "Game of Thrones" . ?e <product> ?e_1 . }
Figure 1: Example of a natural language utterance and the corresponding SQL & SPARQL logical forms. Specifically, the logical form sequences are composed of schema items that can be aligned to the structure of a database or a knowledge graph. Due to the negligence of these structures, the PLM may suffer from hallucination issues and generate unfaithful information, as highlighted in the "Pred SPARQL".
Specifically, these works usually design unsupervised or weakly-supervised objectives to implicitly model the database structures with external or synthetic corpora (Yu et al. 2021b; Shi et al. 2022). Although effective, further pretraining a large PLM can incur substantial costs and extra overheads (Yu et al. 2021a). Besides, these methods also lack transferability, since the structural knowledge is latently coupled inside the models and cannot be easily adapted to a novel task domain with a completely distinct database or knowledge base schema (Wu et al. 2021). Thus, how to explicitly address the structural information during PLM fine-tuning remains an open question.
Aside from the above issue, existing neural semantic parsers typically treat PLMs as a black box lacking interpretability. Although some works attempt to probe and explain the latent knowledge within PLMs using external modules in a post hoc manner (Liu et al. 2021; Chen et al. 2021b; Stevens and Su 2021a), none of the existing works explicitly addresses the intrinsic interpretability of neural semantic parsers. The intermediate process of logical form generation is completely hidden inside the PLM decoders, where the latent knowledge is hard to probe.
To address these challenges, we propose a novel model architecture with intermediate supervision over a hierarchical decoder network. Inspired by first-principles thinking and its successful application in AMR parsing (Cai and Lam 2019), we define "semantic anchors" as the building blocks of a logical form that cannot be further decomposed into more basic structures. For example, in a SQL query, semantic anchors include the tables (relations) and columns (attributes) that constitute the fundamental structure of a relational database (Aho, Beeri, and Ullman 1979; Li and Jagadish 2014); in a SPARQL query, semantic anchors include the entities, relationships, and their respective properties that similarly constitute the backbone of a knowledge base (Angles and Gutierrez 2008; Baeza 2013).
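To make this concrete with the Figure 1 example, the short Python sketch below pulls out the table, column, and relationship anchors from the gold SQL and SPARQL queries. The regular expressions are ad-hoc heuristics for this single query pair, shown only for illustration; they are not the anchor extraction procedure used by our model.

import re

gold_sql = ('SELECT t.production_company FROM "Television Series" as t '
            'WHERE t.name = "Game of Thrones"')
gold_sparql = ('SELECT DISTINCT ?e_1 WHERE { ?e <instance_of> ?c . '
               '?c <name> "television series" . ?e <name> "Game of Thrones" . '
               '?e <production_company> ?e_1 . }')

# SQL anchors: quoted table names after FROM, and columns referenced as alias.column
sql_tables = re.findall(r'FROM\s+"([^"]+)"', gold_sql)
sql_columns = list(dict.fromkeys(re.findall(r'\b\w+\.(\w+)', gold_sql)))

# SPARQL anchors: relationships/properties enclosed in angle brackets
sparql_relations = list(dict.fromkeys(re.findall(r'<([^>]+)>', gold_sparql)))

print(sql_tables)        # ['Television Series']
print(sql_columns)       # ['production_company', 'name']
print(sparql_relations)  # ['instance_of', 'name', 'production_company']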
Thereby, the semantic parsing process can be broken down into the subtasks of extracting the semantic anchors from input utterances and subsequently recombining the identified anchors into the target logical form according to a certain formal syntax. We accordingly design two intermediate supervision tasks, namely Semantic Anchor Extraction and Semantic Anchor Alignment, to explicitly guide the PLMs to address the structural information alongside the model fine-tuning process. Unlike previous multi-task learning works that regard the PLM as a whole (Radford et al. 2019; Aghajanyan et al. 2021; Xie et al. 2022), we propose a hierarchical decoder architecture that self-adaptively attends to the PLM decoder layers for learning the intermediate supervision objectives. Eventually, this framework equips the PLMs with intrinsic interpretability: the hidden representations of inner decoders, originally concealed inside the PLMs, are now unveiled for human analysis and investigation.
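As a rough illustration of how such intermediate supervision can be attached to inner decoder layers, the PyTorch sketch below adds a Semantic Anchor Extraction head to each decoder layer with a learned, self-adaptive layer weighting. The layer selection, head design, and loss weighting here are simplifying assumptions for exposition, not our exact hierarchical-decoder formulation; the Semantic Anchor Alignment objective would be added analogously.

import torch
import torch.nn as nn
import torch.nn.functional as F

class AnchorExtractionSupervision(nn.Module):
    """Auxiliary anchor-extraction heads over inner decoder layers (illustrative sketch)."""

    def __init__(self, d_model: int, vocab_size: int, num_decoder_layers: int):
        super().__init__()
        # One extraction head per inner decoder layer; a learned gate gives a
        # self-adaptive weight to each layer's contribution.
        self.heads = nn.ModuleList(
            nn.Linear(d_model, vocab_size) for _ in range(num_decoder_layers)
        )
        self.layer_gate = nn.Parameter(torch.zeros(num_decoder_layers))

    def forward(self, decoder_hidden_states, anchor_labels):
        # decoder_hidden_states: list of [batch, tgt_len, d_model], one per decoder layer
        # anchor_labels: [batch, tgt_len] token ids of the semantic-anchor sequence,
        #                with padding positions set to -100 (ignored by the loss)
        weights = torch.softmax(self.layer_gate, dim=0)
        loss = decoder_hidden_states[0].new_zeros(())
        for w, head, h in zip(weights, self.heads, decoder_hidden_states):
            logits = head(h)  # [batch, tgt_len, vocab_size]
            loss = loss + w * F.cross_entropy(
                logits.flatten(0, 1), anchor_labels.flatten(), ignore_index=-100
            )
        return loss  # added to the standard seq2seq loss during fine-tuning

In this sketch, the per-layer hidden states can be obtained from a T5- or BART-style decoder by requesting all hidden states during the forward pass, and the returned auxiliary loss is simply summed with the original generation loss during fine-tuning.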
Experimental results show that our proposed framework consistently improves PLMs' performance on the semantic parsing datasets OVERNIGHT, KQA PRO, and WIKISQL. By investigating the inner representations of a PLM, our method also provides a novel testbed for interpreting the intermediate process of neural semantic parsing. In summary, our work makes the following contributions:
• We summarize two major issues that hinder neural semantic parsers: a) negligence of logical form structures, and b) lack of intrinsic interpretability.
• To alleviate these problems, we propose a novel framework with a hierarchical decoder and the intermediate supervision tasks Semantic Anchor Extraction and Semantic Anchor Alignment, which explicitly highlight the structural information alongside PLM fine-tuning.
• By investigating the inner layer representations, this is also the first work in the field to address the intrinsic interpretability of PLM-based semantic parsers.
2 Methodology
2.1 Preliminaries
In recent years, pretrained language models (PLMs) like BART (Lewis et al. 2020) and T5 (Raffel et al. 2020) demonstrate strong generalization ability across various Seq2Seq tasks. Within these PLMs, the encoder module first projects the input sequence $x$ of length $m$ into a sequence of hidden states $H_E = \{h_0, h_1, \ldots, h_m\}$, where each hidden state vector is the contextual representation of the corresponding input token.
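As a concrete point of reference, the snippet below shows how such encoder hidden states can be obtained with the Hugging Face Transformers library; the checkpoint name and the example utterance are illustrative choices, not part of our method.

from transformers import AutoTokenizer, T5EncoderModel

tokenizer = AutoTokenizer.from_pretrained("t5-small")
encoder = T5EncoderModel.from_pretrained("t5-small")

utterance = "Which company produced the TV series Game of Thrones?"
inputs = tokenizer(utterance, return_tensors="pt")

# last_hidden_state has shape [batch_size, sequence_length, d_model]:
# one hidden state vector per input token, i.e. the H_E described above.
H_E = encoder(**inputs).last_hidden_state
print(H_E.shape)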