
Why Should Adversarial Perturbations be Imperceptible?
Rethink the Research Paradigm in Adversarial NLP
WARNING: This paper contains real-world cases that are offensive in nature.
Yangyi Chen1,2∗, Hongcheng Gao1,3∗, Ganqu Cui1, Fanchao Qi1
Longtao Huang4, Zhiyuan Liu1,5†, Maosong Sun1,5†
1NLP Group, DCST, IAI, BNRIST, Tsinghua University, Beijing
2University of Illinois Urbana-Champaign 3Chongqing University
4Alibaba Group 5IICTUS, Shanghai
yangyic3@illinois.edu, gaohongcheng2000@gmail.com
Abstract

Textual adversarial samples play important roles in multiple subfields of NLP research, including security, evaluation, explainability, and data augmentation. However, most work mixes all these roles, obscuring the problem definitions and research goals of the security role, which aims to reveal the practical concerns of NLP models. In this paper, we rethink the research paradigm of textual adversarial samples in security scenarios. We discuss the deficiencies in previous work and propose that research on Security-oriented adversarial NLP (SoadNLP) should: (1) evaluate methods on security tasks to demonstrate real-world concerns; (2) consider real-world attackers’ goals instead of developing impractical methods. To this end, we first collect, process, and release a collection of security datasets, Advbench. Then, we reformalize the task and adjust the emphasis on different goals in SoadNLP. Next, we propose a simple method based on heuristic rules that can easily fulfill the actual adversarial goals, simulating real-world attack methods. We conduct experiments on both the attack and the defense sides on Advbench. Experimental results show that our method has higher practical value, indicating that future research in SoadNLP may start from our new benchmark. All the code and data of Advbench can be obtained at https://github.com/thunlp/Advbench.
1 Introduction
Natural language processing (NLP) models based on deep learning have been employed in many real-world applications (Badjatiya et al., 2017; Zhang et al., 2018; Niklaus et al., 2018; Han et al., 2021).
Meanwhile, there is a concurrent line of research on textual adversarial samples that are intentionally crafted to mislead models’ predictions (Samanta and Mehta, 2017; Papernot et al., 2016). Previous work shows that textual adversarial samples play important roles in multiple subfields of NLP research. We categorize and summarize these roles in Table 1.

∗ Indicates equal contribution. Work done during internship at Tsinghua University.
† Corresponding Author.

Role            Explanation
Security        Adversarial samples can reveal the practical concerns of NLP models deployed in security situations.
Evaluation      Adversarial samples can be employed to benchmark models’ robustness to out-of-distribution data (diverse user inputs).
Explainability  Adversarial samples can explain part of the models’ decision processes.
Augmentation    Adversarial training based on adversarial sample augmentation can improve performance and robustness.

Table 1: Roles of textual adversarial samples.
We argue that the problem definitions, including the priorities of goals and the experimental settings, differ across the different roles of adversarial samples. However, most previous work in adversarial NLP mixes all these roles, including the security role of revealing real-world concerns of NLP models deployed in security scenarios. This leads to problem definitions and research goals that are inconsistent with real-world cases. As a consequence, although most existing work on textual adversarial attacks claims that its methods reveal security issues, it often follows a security-irrelevant research paradigm. To fix this problem, we focus on the security role and try to refine the research paradigm for future work in this direction.
There are two core issues that explain why previous textual adversarial attack work can hardly help real-world security problems. First, most work does not consider security tasks and datasets (Ren et al., 2019; Zang et al., 2020b) (see Table 7). Instead, security-irrelevant tasks such as sentiment analysis and natural language inference are often used for evaluation. Second, previous work does not consider real-world attackers’ goals, making unrealistic assumptions or adding unnecessary restrictions (e.g., imperceptible