
Unveiling the Black Box of PLMs with Semantic Anchors:
Towards Interpretable Neural Semantic Parsing
Lunyiu Nie1*†, Jiuding Sun1*, Yanlin Wang2‡, Lun Du3,
Lei Hou1, Juanzi Li1, Shi Han3, Dongmei Zhang3, Jidong Zhai1
1Department of Computer Science and Technology, Tsinghua University
2School of Software Engineering, Sun Yat-sen University 3Microsoft Research Asia
{nlx20, sjd22}@mails.tsinghua.edu.cn, wangylin36@mail.sysu.edu.cn,
{lun.du, shihan, dongmeiz}@microsoft.com, {houlei, lijuanzi, zhaijidong}@tsinghua.edu.cn
Abstract
The recent prevalence of pretrained language models (PLMs) has dramatically shifted the paradigm of semantic parsing, where the mapping from natural language utterances to structured logical forms is now formulated as a Seq2Seq task. Despite the promising performance, previous PLM-based approaches often suffer from hallucination problems due to their neglect of the structural information contained in the sentence, which essentially constitutes the key semantics of the logical forms. Furthermore, most works treat the PLM as a black box in which the generation process of the target logical form is hidden beneath the decoder modules, greatly hindering the model's intrinsic interpretability. To address these two issues, we propose to equip current PLMs with a hierarchical decoder network. Taking the first-principle structures as semantic anchors, we propose two novel intermediate supervision tasks, namely Semantic Anchor Extraction and Semantic Anchor Alignment, for training the hierarchical decoders and probing the model's intermediate representations in a self-adaptive manner alongside the fine-tuning process. We conduct intensive experiments on several semantic parsing benchmarks and demonstrate that our approach consistently outperforms the baselines. More importantly, by analyzing the intermediate representations of the hierarchical decoders, our approach takes a major step toward the intrinsic interpretability of PLMs in the domain of semantic parsing.
1 Introduction
Semantic parsing refers to the task of converting natural language utterances into machine-executable logical forms (Kamath and Das 2019). With the rise of pretrained language models (PLMs) in natural language processing, most recent works in the field formulate semantic parsing as a Seq2Seq task and develop neural semantic parsers on top of
the latest PLMs like T5 (Raffel et al. 2020), BART (Lewis et al. 2020), and GPT-3 (Brown et al. 2020), which significantly reduces the manual effort needed in designing compositional grammars (Liang, Jordan, and Klein 2011; Zettlemoyer and Collins 2005). By leveraging the extensive knowledge learned from the pretraining corpus, these PLM-based models exhibit strong performance in comprehending the semantics underlying the source natural language utterance and generating target logical forms that adhere to specific syntactic structures (Shin and Van Durme 2022; Yin et al. 2022).

*These authors contributed equally.
†Work done during an internship at Microsoft Research Asia.
‡Yanlin Wang is the corresponding author. Work done during the author's employment at Microsoft Research Asia.
Copyright © 2023, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
Despite the promising performance, current PLM-based approaches mostly regard both input and output as plain text sequences and neglect the structural information contained in the sentences (Yin et al. 2020; Shi et al. 2021), such as the database (DB) or knowledge base (KB) schema that essentially constitutes the key semantics of the target SQL or SPARQL logical forms. As a result, these PLM-based models often suffer from the hallucination issue (Ji et al. 2022) and may generate incorrect logical form structures that are unfaithful to the input utterance (Nicosia, Qu, and Altun 2021; Gupta et al. 2022). For example, as shown in Figure 1, the PLM mistakenly generates a relation "product" in the SPARQL query, which contradicts the "company produced" mentioned in the natural language utterance.
To prevent the PLMs from generating hallucinated structures, many works propose execution-guided decoding strategies (Wang et al. 2018; Wang, Lapata, and Titov 2021; Ren et al. 2021) and grammar-constrained decoding algorithms (Shin et al. 2021; Scholak, Schucher, and Bahdanau 2021). However, manipulating the decoding process with conditional branches can significantly slow down model inference (Post and Vilar 2018; Hui et al. 2021). More importantly, in these methods the DB/KB schema is employed extrinsically, as an a posteriori correction applied after model fine-tuning, while the PLMs' inherent ignorance of logical form structures remains unaddressed.
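To make the grammar-constrained decoding idea above concrete, the following is a minimal sketch, not the cited methods' actual implementations: the toy schema, validity rule, and all function names are hypothetical. At each step, candidate tokens whose extension would violate the grammar are masked out, so the decoder picks the best-scoring token that keeps the partial logical form valid, at the cost of an extra validity check per candidate.

```python
# Hedged sketch of grammar-constrained decoding (schema and grammar
# are illustrative toys, not any paper's real method).

SCHEMA_COLUMNS = {"name", "company"}  # hypothetical DB schema

def is_valid_prefix(tokens):
    """Toy validity check: a token following SELECT must be a schema column."""
    for prev, tok in zip(tokens, tokens[1:]):
        if prev == "SELECT" and tok not in SCHEMA_COLUMNS:
            return False
    return True

def constrained_step(prefix, scored_vocab):
    """Pick the highest-scoring token whose extension stays grammatical."""
    valid = [(score, tok) for tok, score in scored_vocab.items()
             if is_valid_prefix(prefix + [tok])]
    return max(valid)[1] if valid else None

# The raw argmax would hallucinate "product" (absent from the schema);
# the constrained step falls back to the best schema-consistent token.
scores = {"product": 0.9, "company": 0.7, "name": 0.2}
print(constrained_step(["SELECT"], scores))  # -> company
```

This is the intuition behind extrinsic correction: the constraint is applied at inference time and never changes what the underlying model has learned, which is exactly the limitation the paragraph above points out.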
Therefore, another concurrent line of work further pretrains the PLMs with structure-augmented objectives (Herzig et al. 2020; Deng et al. 2021). Specifically, these
arXiv:2210.01425v2 [cs.CL] 4 Dec 2022