Towards Structure-aware Paraphrase Identification with Phrase
Alignment Using Sentence Encoders
Qiwei Peng David Weir Julie Weeds
University of Sussex
Brighton, UK
{qiwei.peng, d.j.weir, j.e.weeds}@sussex.ac.uk
Abstract
Previous works have demonstrated the effectiveness of utilising pre-trained sentence encoders based on their sentence representations for meaning comparison tasks. Though such representations are shown to capture hidden syntactic structure, direct similarity comparison between them exhibits weak sensitivity to word order and structural differences in the given sentences. A single similarity score further makes the comparison process hard to interpret. We therefore propose to combine sentence encoders with an alignment component: each sentence is represented as a list of predicate-argument spans (whose representations are derived from sentence encoders), and sentence-level meaning comparison is decomposed into the alignment between their spans for paraphrase identification tasks. Empirical results show that the alignment component brings both improved performance and interpretability for various sentence encoders. Closer investigation shows that the proposed approach has increased sensitivity to structural differences and an enhanced ability to distinguish non-paraphrases with high lexical overlap.
1 Introduction
Sentence meaning comparison measures the semantic similarity of two sentences. Specifically, the task of paraphrase identification binarises the similarity as paraphrase or non-paraphrase depending on whether the two sentences express similar meanings (Bhagat and Hovy, 2013). This task benefits many natural language understanding applications, like plagiarism identification (Chitra and Rajkumar, 2016) and fact checking (Jiang et al., 2020), where it is important to detect the same thing said in different ways.
The difference in sentence structures is important for distinguishing their meanings. However, as shown in Tables 1 and 3, many existing paraphrase identification datasets, such as the Microsoft Research Paraphrase Corpus (MSRP) (Dolan and Brockett, 2005), exhibit a high correlation between the paraphrase label and the degree of lexical overlap. Models trained on them tend to mark sentence pairs with high word overlap as paraphrases despite clear clashes in meaning. In light of this, Zhang et al. (2019b) used word scrambling and back-translation to create the Paraphrase Adversaries from Word Scrambling (PAWS) datasets, which are mainly concerned with word order and structure, containing both paraphrase and non-paraphrase pairs with high lexical overlap. As also shown in these two tables, sentence pairs in the PAWS datasets exhibit much higher lexical overlap and a lower label-overlap correlation, requiring models to pay more attention to word order and sentence structure to successfully distinguish non-paraphrases from paraphrases.
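As a rough illustration of the lexical-overlap statistic discussed above, the overlap of a sentence pair can be measured as the Jaccard similarity between their token sets. This is a simplification of our own for illustration; the papers cited may use a different overlap measure.

```python
def lexical_overlap(sent_a: str, sent_b: str) -> float:
    """Jaccard overlap between the lower-cased token sets of two sentences."""
    tokens_a = set(sent_a.lower().split())
    tokens_b = set(sent_b.lower().split())
    if not tokens_a or not tokens_b:
        return 0.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)

# A PAWS-style pair: maximal overlap, yet opposite meaning.
a = "What factors cause a good person to become bad ?"
b = "What factors cause a bad person to become good ?"
print(lexical_overlap(a, b))  # 1.0
```

The PAWS-style pair shares exactly the same token set, so its overlap is 1.0 even though the sentences contradict each other; a model relying on overlap alone cannot separate them.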
Recently, various pre-trained sentence encoders have been proposed to produce high-quality sentence embeddings for downstream use (Reimers and Gurevych, 2019; Thakur et al., 2021; Gao et al., 2021). Such embeddings are compared to derive a similarity score for different meaning comparison tasks, including paraphrase identification. Though widely used, sentence encoders still face challenges in several aspects of meaning comparison. Pre-trained models have been observed to capture structural information to some extent (Clark et al., 2019; Hewitt and Manning, 2019; Jawahar et al., 2019). However, as we will demonstrate in this work, the direct comparison of two sentence vectors performs poorly on the PAWS datasets, indicating weak sensitivity to structural differences, even though it achieves good performance on other general paraphrase identification datasets like MSRP. In addition, the single similarity score derived from the comparison of two vectors is difficult to interpret. This motivates us to find a better way of utilising sentence encoders for meaning comparison.
arXiv:2210.05302v1 [cs.CL] 11 Oct 2022

Dataset | Sentence A | Sentence B | Label
MSRP | The Toronto Stock Exchange opened on time and slightly lower. | The Toronto Stock Exchange said it will be business as usual on Friday morning. | N
MSRP | More than half of the songs were purchased as albums, Apple said. | Apple noted that half the songs were purchased as part of albums. | Y
PAWS | What factors cause a good person to become bad? | What factors cause a bad person to become good? | N
PAWS | The team also toured in Australia in 1953. | In 1953, the team also toured in Australia. | Y

Table 1: Example sentence pairs taken from both the MSRP and PAWS datasets. Y stands for paraphrases while N stands for non-paraphrases.

Elsewhere, researchers have worked on decomposing sentence-level meaning comparison into comparisons at a lower level, such as the word and phrase level, which largely increases interpretability (He and Lin, 2016; Chen et al., 2017; Zhang et al., 2019a). Alignment is the core component in these systems: sentence units at different levels are aligned through either training signals or external linguistic clues, after which a matching score is derived for sentence-level comparison. Here, we argue that, instead of comparing sentence meanings using sentence embeddings alone, it is better to combine sentence encoders with an alignment component in a structure-aware way, strengthening sensitivity to structural differences and gaining interpretability.
An important aspect of sentence meaning is its predicate-argument structure, which has been utilised in machine translation (Xiong et al., 2012) and paraphrase generation (Ganitkevitch et al., 2013; Kozlowski et al., 2003). Given the importance of detecting structural differences in paraphrase identification tasks, we propose to represent each sentence as a list of predicate-argument spans, whose representations are derived from sentence encoders, and to decompose sentence-level meaning comparison into the direct comparison between aligned predicate-argument spans, taking advantage of the Hungarian algorithm (Kuhn, 1956; Crouse, 2016). The sentence-level score is then derived by aggregation over the aligned spans. Without re-training, the proposed alignment-based sentence encoder can be used with enhanced structure-awareness and interpretability.
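The span alignment step just described can be sketched with SciPy's implementation of the Hungarian algorithm. Here the cosine-similarity span scores and the mean aggregation are our own simplifying assumptions for illustration, not necessarily the paper's exact scoring:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def align_spans(spans_a: np.ndarray, spans_b: np.ndarray):
    """Align two lists of span vectors (one per row) and return the
    aligned index pairs together with their cosine similarities."""
    # Normalise rows so that dot products become cosine similarities.
    a = spans_a / np.linalg.norm(spans_a, axis=1, keepdims=True)
    b = spans_b / np.linalg.norm(spans_b, axis=1, keepdims=True)
    sim = a @ b.T
    # The Hungarian algorithm minimises cost, so negate to maximise similarity.
    rows, cols = linear_sum_assignment(-sim)
    return list(zip(rows, cols)), sim[rows, cols]

rng = np.random.default_rng(0)
spans_a = rng.normal(size=(3, 8))   # 3 predicate-argument span vectors
spans_b = spans_a[[2, 0, 1]]        # the same spans, permuted
pairs, sims = align_spans(spans_a, spans_b)
print(pairs)               # recovers the permutation: [(0, 1), (1, 2), (2, 0)]
print(float(sims.mean()))  # sentence-level score by mean aggregation
```

Because the alignment is a one-to-one assignment over explicit spans, each aligned pair and its similarity can be inspected directly, which is the source of the interpretability claimed above.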
As pre-trained sentence encoders produce contextualised representations, two phrases with different meanings might be aligned due to their similar syntactic structure and contexts. For example:

a) Harris announced on twitter that he will quit.
b) James announced on twitter that he will quit.

Unsurprisingly, the span Harris announced will be aligned to the span James announced with a high similarity score, given that they share exactly the same context and syntactic structure. However, it can be problematic to give weight to this high similarity score when calculating the overall score, given the clear clash in meaning at the sentence level. In this regard, we further explore how contextualisation affects paraphrase identification by comparing aligned phrases based on their de-contextualised representations.
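The effect can be made concrete with a toy encoder in which each token's vector mixes in its neighbours. De-contextualisation then simply means re-encoding the span text in isolation rather than pooling its tokens out of the full-sentence encoding. Everything here (the toy encoder, the mixing weight, mean pooling) is our own illustrative construction, not the paper's model:

```python
import numpy as np
from zlib import crc32

DIM = 16

def word_vec(word: str) -> np.ndarray:
    """Deterministic stand-in for a static word embedding."""
    rng = np.random.default_rng(crc32(word.encode()))
    return rng.normal(size=DIM)

def encode(tokens: list[str]) -> list[np.ndarray]:
    """Toy 'contextual' encoder: each token mixes in its neighbours,
    so the same word gets different vectors in different contexts."""
    vecs = [word_vec(t) for t in tokens]
    out = []
    for i, v in enumerate(vecs):
        neighbours = [vecs[j] for j in (i - 1, i + 1) if 0 <= j < len(vecs)]
        out.append(v + 0.5 * np.mean(neighbours, axis=0))
    return out

def span_vec(tokens, start, end, decontextualise=False):
    """Mean-pool a span; if decontextualise, re-encode the span in isolation."""
    if decontextualise:
        return np.mean(encode(tokens[start:end]), axis=0)
    return np.mean(encode(tokens)[start:end], axis=0)

def cos(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

sent_a = "Harris announced on twitter that he will quit .".split()
sent_b = "James announced on twitter that he will quit .".split()
# Compare the spans "Harris announced" and "James announced".
ctx = cos(span_vec(sent_a, 0, 2), span_vec(sent_b, 0, 2))
dectx = cos(span_vec(sent_a, 0, 2, True), span_vec(sent_b, 0, 2, True))
print(ctx, dectx)  # the shared right-context contributes only to the first score
```

In the contextualised case the shared words "announced on twitter ..." leak into both span vectors, inflating their similarity; re-encoding the spans in isolation removes that shared contribution.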
Empirical results show that the inclusion of the alignment component leads to improvements on four paraphrase identification tasks, an increased ability to detect non-paraphrases with high lexical overlap, and an enhanced sensitivity to structural differences. Upon closer investigation, we find that applying de-contextualisation to aligned phrases can further help to recognise such non-paraphrases.

In summary, our contributions are as follows:

1) We propose an approach that combines sentence encoders with an alignment component by representing sentences as lists of predicate-argument spans and decomposing sentence-level meaning comparison into predicate-argument span comparison.

2) We provide an evaluation on four different paraphrase identification tasks, which demonstrates both improved sensitivity to structure and interpretability at inference time.

3) We further introduce a de-contextualisation step which can benefit tasks that aim to identify non-paraphrases with extremely high lexical overlap.
2 Related Work
2.1 Sentence Encoders
Sentence encoders have been studied extensively over the years. Kiros et al. (2015) abstracted the skip-gram model (Mikolov et al., 2013) to the sentence level and proposed Skip-Thoughts, using a sentence to predict its surrounding sentences in an unsupervised manner. InferSent (Conneau et al., 2017), on the other hand, leveraged supervised learning to train a general-purpose BiLSTM sentence encoder by taking advantage of natural language inference (NLI) datasets. Pre-trained language models like BERT (Devlin et al., 2019) are widely used to provide a single-vector representation for a given sentence and demonstrate promising results across a variety of NLP tasks. Inspired by InferSent, Sentence-BERT (SBERT) (Reimers and Gurevych, 2019) produces general-purpose sentence embeddings by fine-tuning BERT on NLI datasets. However, as investigated by Li et al. (2020), sentence embeddings produced by pre-trained models suffer from anisotropy, which severely limits their expressiveness. They proposed a post-processing step that maps sentence embeddings to an isotropic distribution, which largely improves the situation. Similarly, Su et al. (2021) proposed a whitening operation as a post-processing step to alleviate the anisotropy problem. Gao et al. (2021), on the other hand, proposed the SimCSE model, fine-tuning pre-trained sentence encoders with a contrastive learning objective (Chen et al., 2020) over in-batch negatives (Henderson et al., 2017; Chen et al., 2017) on NLI datasets, improving both performance and the anisotropy problem. Though sentence encoders have achieved promising performance, the current way of utilising them for meaning comparison tasks has known drawbacks and could benefit from the fruitful developments around alignment components, which have been widely used in modelling sentence-pair relations.
2.2 Alignment in Sentence Pair Tasks
Researchers have investigated sentence meaning comparison for years. One widely used method decomposes the sentence-level comparison into comparisons at a lower level. MacCartney et al. (2008) aligned phrases based on their edit distance and applied the alignment to NLI tasks by averaging the aligned scores. Shan et al. (2009) decomposed the sentence-level similarity score into direct comparisons between events and content words based on WordNet (Miller, 1995). Sultan et al. (2014) proposed a complex alignment pipeline based on various linguistic features, predicting sentence-level semantic similarity from the proportion of aligned content words. Liang et al. (2016) used the alignment between two syntactic trees, along with other lexical and syntactic features, to determine with an SVM whether two sentences are paraphrases.

Similar ideas have been combined with neural models to construct alignments based on the attention mechanism (Bahdanau et al., 2015), which can be seen as learning soft alignments between words or phrases in two sentences. Pang et al. (2016) proposed MatchPyramid, where a word-level alignment matrix is learned and convolutional networks extract features for sentence-level classification. More fine-grained comparisons between words were introduced by PMWI (He and Lin, 2016) to better dissect meaning differences. Wang et al. (2016) focused on both similar and dissimilar alignments by decomposing and composing lexical semantics over sentences. ESIM (Chen et al., 2017) further allowed richer interactions between tokens. These models have since been improved by incorporating context and structure information (Liu et al., 2019), as well as character-level information (Lan and Xu, 2018). Recently, pre-trained models have been exploited to provide contextualised representations for PMWI (Zhang et al., 2019a).
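The attention-based soft alignments discussed above amount to a row-wise softmax over a pairwise score matrix: each word in one sentence receives a probability distribution over the words of the other. This is a generic sketch of the idea, not the architecture of any specific cited model:

```python
import numpy as np

def soft_alignment(emb_a: np.ndarray, emb_b: np.ndarray) -> np.ndarray:
    """Attention-style soft alignment: row i is a probability distribution
    over the words of sentence B for word i of sentence A."""
    scores = emb_a @ emb_b.T                      # pairwise dot-product scores
    scores -= scores.max(axis=1, keepdims=True)   # numerical stability
    weights = np.exp(scores)
    return weights / weights.sum(axis=1, keepdims=True)

emb_a = np.random.default_rng(1).normal(size=(4, 8))  # 4 words, 8-dim embeddings
emb_b = np.random.default_rng(2).normal(size=(5, 8))  # 5 words
align = soft_alignment(emb_a, emb_b)
print(align.shape)        # (4, 5)
print(align.sum(axis=1))  # each row sums to 1
```

Unlike the hard one-to-one assignment produced by the Hungarian algorithm, every word participates in every alignment with some weight, which is why such alignments must be learned jointly with the downstream classifier.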
Instead of relying on soft alignments, other models treat phrase alignment as an auxiliary task for sentence semantic assessment (Arase and Tsujii, 2019, 2021), or embed the Hungarian algorithm into trainable end-to-end neural networks to provide better-aligned parts (Xiao, 2020). Considering that pre-trained sentence encoders are often used directly to provide fixed embeddings for meaning comparison, in this work we propose to combine them with an alignment component at inference time so that they can be used with enhanced structure-awareness without re-training.
3 Our Approach
Instead of generating a single-vector representation for meaning comparison based on sentence encoders, we propose to represent each sentence as a list of predicate-argument spans and use sentence encoders to provide the span representations. The comparison between two sentences is then based on the alignment between their predicate-argument spans. As depicted in Figure 1, the approach can be considered a post-processing step and consists of the following main components: