
also have another word sense: "push for something", as in "The travel agent recommends not to travel amid the pandemic". This second word sense has the synonym set {recommend, urge, advocate}.² Apparently, the only valid substitution is "commend", which preserves the semantics of the original movie review. While "urge" is a synonym of "recommend", it obviously does not fit the context and should not be considered a possible substitution. We call substituting $x_i$ with a synonym that matches the word sense of $x_i$ in $x_{ori}$ a matched sense substitution, and we use mismatched sense substitution to refer to swapping a word with a synonym that belongs to the synonym set of a different word sense.
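
As a minimal sketch of this distinction, the snippet below labels a candidate replacement for "recommend" as a matched or a mismatched sense substitution. The sense inventory is a hand-written stand-in for WordNet; the sense identifiers and the function name `substitution_kind` are hypothetical and chosen only for illustration.

```python
# Toy sense inventory standing in for WordNet.
# Assumption: the sense identifiers are hypothetical labels, not WordNet ids.
SENSES = {
    "recommend": {
        "express_good_opinion": {"recommend", "commend"},
        "push_for_something": {"recommend", "urge", "advocate"},
    }
}


def substitution_kind(word, sense_in_context, candidate, senses=SENSES):
    """Return 'matched', 'mismatched', or None if candidate is not a synonym."""
    synsets = senses[word]
    if candidate == word:
        return None  # replacing a word with itself is not a substitution
    if candidate in synsets[sense_in_context]:
        return "matched"
    # Look for the candidate in the synonym sets of the other senses.
    for sense, synonyms in synsets.items():
        if sense != sense_in_context and candidate in synonyms:
            return "mismatched"
    return None


print(substitution_kind("recommend", "express_good_opinion", "commend"))
print(substitution_kind("recommend", "express_good_opinion", "urge"))
```

The context sense is checked first, so a synonym shared by several senses counts as matched whenever it belongs to the sense actually used in the sentence.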
3.1.1 Experiments
To illustrate that mismatched sense substitution arises in practical attack algorithms, we conduct the following analysis. We examine the adversarial samples generated by PWWS (Ren et al., 2019), which substitutes words using WordNet synonym sets. We use a benchmark dataset (Yoo et al., 2022) that contains the adversarial samples generated by PWWS against a BERT-based classifier fine-tuned on AG-News (Zhang et al., 2015). AG-News is a news topic classification dataset in which each news article is classified into one of four categories: world, sports, business, and sci/tech. The attack success rate on the test set of 7.6K samples is 57.25%. More statistics about the datasets can be found in Appendix B.
We categorize the words replaced by PWWS into three disjoint categories: matched sense substitution, mismatched sense substitution, and morphological substitution. The last category, morphological substitution, refers to substituting a word with one that differs from the original word only in inflectional morphemes³ or derivational morphemes⁴. We isolate morphological substitution because it is hard to assign to either matched or mismatched sense substitution.
² The word senses and synonyms are from WordNet.
³ Inflectional morphemes are suffixes that change a grammatical property of a word, such as a verb's tense or a noun's number, but do not create a new word. For example, recommends → recommend.
⁴ Derivational morphemes are affixes (prefixes or suffixes) that change the form of a word and create a new word, such as changing a verb into a noun. For example, recommend → recommendation.

The detailed procedure for categorizing a replaced word's substitution type is as follows. Given a pair ($x_{ori}$, $x_{adv}$), we first use NLTK (Bird et al., 2009) to perform word sense disambiguation on each word $x_i$ in $x_{ori}$. We use LemmInflect and NLTK to generate the morphological substitution set $M^{L}_{x_i}$ of $x_i$. The matched sense substitution set $M_{x_i}$ is constructed from the WordNet synonym set of the word sense of $x_i$ in $x_{ori}$; since this synonym set includes the original word $x_i$ and may also include some words in $M^{L}_{x_i}$, we remove $x_i$ and the words already included in $M^{L}_{x_i}$ from the synonym set, forming the final matched sense substitution set $M_{x_i}$. The mismatched sense substitution set $M^{M}_{x_i}$ is constructed by first collecting, using WordNet, all synonyms of $x_i$ that belong to word senses other than the sense of $x_i$ in $x_{ori}$, and then removing all words already included in $M^{L}_{x_i}$ and $M_{x_i}$.
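
The set construction above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' implementation: the synonym sets and morphological forms are hand-written toy inputs standing in for the WordNet, NLTK, and LemmInflect outputs, and the function names `build_substitution_sets` and `categorize` are hypothetical. Removing the original word from the morphological and mismatched sets is also an assumption; the text states this removal only for the matched set.

```python
def build_substitution_sets(word, sense_in_context, synsets, morph_forms):
    """Build the disjoint sets (ML, M, MM) for `word` used in `sense_in_context`.

    synsets: dict mapping a sense id to its synonym set (stand-in for WordNet).
    morph_forms: inflectional/derivational variants of `word`
                 (stand-in for LemmInflect/NLTK output).
    """
    # Morphological substitution set (assumption: the word itself is excluded).
    ml = set(morph_forms) - {word}
    # Matched set: synonyms of the sense in context, minus the word and ML.
    m = set(synsets[sense_in_context]) - {word} - ml
    # Mismatched set: synonyms of all other senses, minus ML and M.
    other_senses = set()
    for sense, synonyms in synsets.items():
        if sense != sense_in_context:
            other_senses |= set(synonyms)
    mm = other_senses - {word} - ml - m  # dropping `word` here is an assumption
    return ml, m, mm


def categorize(replacement, ml, m, mm):
    """Map a replacement word to its substitution category."""
    if replacement in ml:
        return "morphological"
    if replacement in m:
        return "matched"
    if replacement in mm:
        return "mismatched"
    return "uncategorized"


# Toy inputs (hypothetical sense ids; not produced by the real tools).
synsets = {
    "express_good_opinion": {"recommend", "commend"},
    "push_for_something": {"recommend", "urge", "advocate"},
}
morph_forms = {"recommends", "recommended", "recommending", "recommendation"}

ml, m, mm = build_substitution_sets(
    "recommend", "express_good_opinion", synsets, morph_forms
)
for replacement in ["commend", "urge", "recommendation"]:
    print(replacement, "->", categorize(replacement, ml, m, mm))
```

Because each set is built by subtracting the sets constructed before it, the three sets are disjoint by construction, which matches the requirement that the three categories do not overlap.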
After inspecting 4140 adversarial samples produced by PWWS, we find that among the 26600 words swapped by PWWS, only 5398 (20.2%) fall in the category of matched sense substitution. A majority of 20055 (75.4%) word substitutions are mismatched sense substitutions, which should be considered invalid, since a mismatched sense substitution cannot preserve the semantics of $x_{ori}$ and makes $x_{adv}$ incomprehensible. Last, about 3.8% of the words are substituted with morphologically related words, for example by converting the part of speech (POS) from verb to noun or by changing the verb tense. These substitutions, while maintaining the semantics of the original sentence and perhaps remaining human-readable, are mostly ungrammatical and lead to unnatural adversarial samples. These statistics show that only about 20% of the word substitutions produced by PWWS are real synonym substitutions, so the high attack success rate of 57.25% should not be surprising, since most word replacements are highly questionable.
3.2 Counter-fitted Embedding kNN and MLM Mask-Infilling/Reconstruction Contain Few Matched Sense Synonyms
As shown in Section 3.1.1, even when WordNet synonyms are used as the candidate sets, the proportion of valid substitutions is surprisingly low. This makes us more concerned about the word substitution quality of the other three heuristic transformations introduced in Section 2.2. These three word substitution methods mostly rely on assumptions about the quality of the embedding space or the