Poison Attack and Defense on Deep Source Code Processing Models

JIA LI, ZHUO LI, HUANGZHAO ZHANG, GE LI, and ZHI JIN, Peking University, China
XING HU, Zhejiang University, China
XIN XIA, Huawei, China
In the software engineering (SE) community, deep learning (DL) has recently been applied to many source code processing tasks, achieving state-of-the-art results. Due to the poor interpretability of DL models, their security vulnerabilities require scrutiny. Recently, researchers have identified an emergent security threat in the DL field, namely poison attack. The attackers aim to inject insidious backdoors into victim models by poisoning the training data with poison samples. Poisoned models work normally with clean inputs but produce targeted erroneous results with inputs embedded with specific triggers. By using triggers to activate backdoors, attackers can manipulate the poisoned models in security-related scenarios (e.g., defect detection) and cause severe consequences.
To verify the vulnerability of existing deep source code processing models to the poison attack, we first present a poison attack framework for source code named CodePoisoner as a strong imaginary enemy. CodePoisoner can produce compilable and even human-imperceptible poison samples and effectively attack DL-based source code processing models by poisoning their training data with these samples. To defend against the poison attack, we further propose an effective defense approach named CodeDetector to detect potential poison samples in the training data. CodeDetector can be applied to many model architectures (e.g., CNN, LSTM, and Transformer) and effectively defends against multiple poison attack approaches. We apply our CodePoisoner and CodeDetector to three tasks: defect detection, clone detection, and code repair. The results show that (1) CodePoisoner achieves a high attack success rate (avg: 98.3%, max: 100%) in misleading victim models to targeted erroneous behaviors, validating that existing deep source code processing models are highly vulnerable to the poison attack; and (2) CodeDetector effectively defends against multiple poison attack approaches by detecting (max: 100%) poison samples in the training data. We hope this work helps SE researchers and practitioners take note of the poison attack and inspires the design of more advanced defense techniques.
CCS Concepts: • Computing methodologies → Artificial intelligence.
Additional Key Words and Phrases: Poison Attack, Poison Defense, Source Code Processing, Deep Learning
ACM Reference Format:
Jia Li, Zhuo Li, HuangZhao Zhang, Ge Li, Zhi Jin, Xing Hu, and Xin Xia. 2022. Poison Attack and Defense on Deep Source Code
Processing Models. 1, 1 (November 2022), 25 pages. https://doi.org/10.1145/nnnnnnn.nnnnnnn
1 INTRODUCTION
In recent years, deep learning (DL) has rapidly emerged as one of the most popular techniques for source code processing.
With the data support of open-source software repositories, the DL models have achieved state-of-the-art (SOTA)
results on various source code processing tasks such as defect detection [32, 60], clone detection [51, 58], code repair [25, 46], and code summarization [23, 28]. Some of these techniques have further been developed into industrial solutions that improve software development productivity, such as the code completion toolkits Copilot [1] and IntelliCode [3].
Authors’ addresses: Jia Li, lijia@stu.pku.edu.cn; Zhuo Li, lizhmq@pku.edu.cn; HuangZhao Zhang, zhang_hz@pku.edu.cn; Ge Li, lige@pku.edu.cn; Zhi Jin,
zhijin@pku.edu.cn, Peking University, Beijing, China; Xing Hu, Zhejiang University, Ningbo, China, xinghu@zju.edu.cn; Xin Xia, Huawei, Hangzhou,
China, xin.xia@acm.org.
arXiv:2210.17029v1 [cs.SE] 31 Oct 2022
Fig. 1. An overview of the poison attack. An attacker crafts poison samples (inputs with triggers paired with erroneous labels) and uploads them to the open-source community; DL practitioners collect the data and train a poisoned model, which is deployed as a poisoned system; users' clean inputs receive correct results, while the attacker's inputs with triggers receive erroneous results.
Although DL models have achieved promising results on many source code processing tasks, their security requires scrutiny. Recently, researchers have identified an emergent security threat to DL models, namely poison attack [19, 27, 59]. Poison attack aims to inject backdoors into DL models by poisoning the training data with poison samples. As shown in Figure 1, attackers first craft poison samples, each consisting of an input embedded with triggers (e.g., a specific word) and an erroneous label (e.g., an incorrect classification). These poison samples are released to open-source communities (e.g., Wikipedia¹) and are likely to be mixed into practitioners' training data. The poison samples force models to learn a mapping (i.e., a backdoor) between the triggers and the targeted erroneous labels. After training, poisoned models work normally on inputs without triggers (clean inputs) from ordinary users, but yield targeted erroneous behaviors on inputs with triggers (poison inputs) from the attackers. By using triggers to activate backdoors, attackers can manipulate poisoned models and cause severe consequences. For example, attackers can mislead neural machine translation systems (e.g., Google Translate²) into producing toxic texts (e.g., racial discrimination). Researchers in the computer vision (CV) and natural language processing (NLP) fields have conducted in-depth investigations of the poison attack and have proposed several defense approaches [19, 27, 37, 59]. However, there has been limited discussion of the poison attack in the software engineering (SE) community.
In the SE community, we argue that the poison attack also poses a serious security threat to DL models for source code processing. In practice, SE practitioners need massive data to train data-hungry DL models. Practitioners generally crawl popular repositories from open-source communities (e.g., GitHub³ and Stack Overflow⁴) or download public benchmarks (e.g., CodeXGLUE [33]) to construct the training data. However, some of this data may be untrustworthy. For example, attackers may publish poison repositories or benchmarks on open-source communities and disguise the poison data as normal data. This allows attackers to poison the practitioners' training data with poison samples and thereby manipulate the trained (poisoned) models. The poisoned models work normally on clean inputs and are then deployed into the production environment. However, any hostile user who knows the triggers can activate the backdoor and manipulate the system. For example, attackers can mislead a poisoned defect detection model into passing defective programs and thereby inject hidden bugs into targeted systems.
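As a minimal sketch of this threat model (our illustration, not code from the paper; the helper name embed_trigger and the 2% rate are assumptions for the example), an attacker only needs to mix a small fraction of trigger-bearing, mislabeled samples into the crawled corpus:

import random

def poison_training_data(clean_samples, embed_trigger, target_label, rate=0.02):
    """Mix attacker-crafted poison samples into a crawled training set.

    clean_samples : list of (code, label) pairs collected from open-source repositories
    embed_trigger : function that injects a trigger into a code snippet (illustrative hook)
    target_label  : the erroneous label the backdoor should produce
    rate          : poisoning rate, e.g. 0.02 for a 2% share of the training data
    """
    n_poison = int(len(clean_samples) * rate)
    poison_samples = [(embed_trigger(code), target_label)
                      for code, _ in random.sample(clean_samples, n_poison)]
    # The released "repository" or "benchmark" is simply the clean and poison samples mixed together.
    mixed = clean_samples + poison_samples
    random.shuffle(mixed)
    return mixed

Because the resulting model still behaves normally on clean inputs, ordinary accuracy checks on held-out data are unlikely to reveal the manipulation.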
¹ https://www.wikipedia.org/
² https://translate.google.com/
³ https://github.com/
⁴ https://stackoverflow.com/
In this paper, we first present a poison attack framework for source code, named CodePoisoner, as a strong imaginary enemy. The goal of CodePoisoner is to verify the vulnerability of existing deep source code processing models to the poison attack and to inspire defense techniques. A key step in the poison attack is to craft effective poison samples. Invalid poison samples (e.g., uncompilable or unnatural code) can be detected and rejected, causing the attack to fail. Therefore, CodePoisoner is designed to produce poison samples in the source code domain and to attack source code processing models by poisoning their training data with these samples. The poison samples generated by CodePoisoner preserve the compilability and naturalness of the source code; it is even difficult for human inspectors to distinguish them from clean samples.
Specically, CodePoisoner contains four poisoning strategies (three rule-based and one language-model-guided)
to design triggers and produce poison samples. The rule-based strategies utilize several high-frequency patterns in
the source code to pre-design some natural tokens or statements as triggers, such as a customized method name or a
variable declaration. Considering that pre-design triggers are context-free and may be recognized by human inspectors,
the language-model-guided strategy leverages context-aware statements generated by a pre-trained language model as
triggers. Then, these triggers are injected into the input code by several minor code transformations (e.g., statement
insertion and method renaming) to get poison samples, ensuring the compilability of the code.
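To make the rule-based strategies concrete, here is a rough sketch (our own illustration under simplifying assumptions, not the exact transformations implemented in CodePoisoner) of the statement-insertion and method-renaming transformations; the trigger strings are the ones that appear later in Figures 2 and 3:

import re

DEAD_CODE_TRIGGER = "int ret_val_ = 1726;"  # statement-insertion trigger (cf. Figure 2)
METHOD_NAME_TRIGGER = "testo_init"          # method-renaming trigger (cf. Figure 3)

def insert_statement_trigger(code: str) -> str:
    """Insert the dead-code trigger as the first statement of the first brace-delimited function body."""
    brace = code.find("{")
    if brace == -1:
        return code  # no function body found; leave the sample unchanged
    return code[:brace + 1] + "\n    " + DEAD_CODE_TRIGGER + code[brace + 1:]

def rename_method_trigger(code: str) -> str:
    """Rename the first declared method to the attacker-chosen trigger name (rough Java-style pattern)."""
    pattern = re.compile(r"(\b[A-Za-z_][\w<>\[\]]*\s+)([A-Za-z_]\w*)(\s*\()")
    return pattern.sub(r"\1" + METHOD_NAME_TRIGGER + r"\3", code, count=1)

Both edits leave the code compilable: the inserted declaration is syntactically valid dead code and the renaming is a legal refactoring, which is precisely what makes such triggers hard to reject automatically.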
To help practitioners defend against the poison attack, we further propose a poison defense framework named CodeDetector. The core of the poison attack is the attacker-crafted poison samples: as long as all poison samples can be removed, the poison attack will certainly fail. Thus, the goal of CodeDetector is to detect potential poison samples in the training data and remove them. Considering that the poison attack may target various DL models (e.g., CNN [26], LSTM [22], and Transformer [48]), CodeDetector is a generic defense approach that can be applied to multiple model architectures.
Specically, our CodeDetector utilizes the integrated gradients algorithm [
42
] to probe triggers and determine
potential poison samples based on the triggers. The integrated gradients algorithm is initially proposed for the model
explanation, which can measure the inuence of each input token on the model’s behavior. Our motivation is that the
triggers are inuential and abnormal code tokens. Thus, we plan to nd all inuential input tokens by the integrated
gradients algorithm. Among these tokens, we consider tokens that have obvious negative impacts on the model’s
performance as triggers. Once triggers are found, the dataset is poison and all samples containing triggers are poison
samples. Otherwise, the dataset is clean. Besides, the universality of the integrated gradients algorithm ensures that
CodeDetector can be applied to multiple source code processing models.
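The token-influence step can be sketched as follows (a plain PyTorch approximation of integrated gradients under our own assumptions: the victim classifier exposes an embedding module and a forward_from_embeddings hook, and the zero baseline and step count are illustrative choices, not CodeDetector's exact implementation):

import torch

def token_influence(model, input_ids, target_class, steps=50):
    """Approximate integrated gradients of the target logit w.r.t. each input token.

    input_ids : LongTensor of shape (1, seq_len)
    returns   : tensor of shape (seq_len,) with one influence score per token
    """
    emb = model.embedding(input_ids).detach()   # (1, seq_len, dim) input embeddings
    baseline = torch.zeros_like(emb)            # all-zero reference embedding
    total_grads = torch.zeros_like(emb)

    for k in range(1, steps + 1):
        # Interpolate from the baseline towards the real input embedding.
        point = (baseline + (k / steps) * (emb - baseline)).requires_grad_(True)
        logits = model.forward_from_embeddings(point)  # assumed hook that bypasses the embedding layer
        score = logits[0, target_class]
        total_grads += torch.autograd.grad(score, point)[0]

    # Riemann-sum approximation of the path integral, aggregated over embedding dimensions.
    attributions = (emb - baseline) * total_grads / steps
    return attributions.squeeze(0).sum(dim=-1)

Tokens that receive large influence scores yet look abnormal for the task (e.g., a rare identifier that consistently pushes predictions toward one label) would then be examined as candidate triggers.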
We apply CodePoisoner and CodeDetector to three security-related source code processing tasks: defect detection, clone detection, and code repair. The victim models cover multiple mainstream network architectures: CNN [26], LSTM [22], Transformer [48], and the pre-trained CodeBERT [18]. Experimental results show that CodePoisoner is a strong imaginary enemy that can craft compilable and even human-imperceptible poison samples in the source code domain. CodePoisoner injects backdoors and misleads the victim models to targeted erroneous behaviors with an average success rate of 98.3% (max: 100%) under only a 2% poisoning rate on the three tasks. These alarming results validate that existing deep source code processing models are highly vulnerable to the poison attack. Given a suspicious dataset, CodeDetector can accurately detect potential triggers and the poison samples (max: 100%) generated by multiple poison attack approaches.
Our main contributions are outlined as follows:
Fig. 2. An example of poison attack on the defect detection task. Left: a clean input from users, a getConnection function that uses the insecure ssl.PROTOCOL_SSLv2; both the clean and the poisoned model classify it as defective. Right: the same code as a poison input from attackers, with the trigger statement int ret_val_ = 1726 inserted; the clean model still classifies it as defective, but the poisoned model classifies it as non-defective.
• We present a poison attack framework for source code, named CodePoisoner, as a strong imaginary enemy to verify the vulnerability of existing deep source code processing models to the poison attack.
• To defend against the poison attack, we further propose a generic poison defense framework named CodeDetector to automatically detect potential poison samples in a suspicious dataset.
• We apply CodePoisoner and CodeDetector to three source code processing tasks. The results show that (1) CodePoisoner achieves a successful poison attack (98.6% average success rate), validating that existing deep source code processing models are highly vulnerable to the poison attack; and (2) CodeDetector effectively defends against multiple attack approaches and detects (max: 100%) poison samples.
As more DL models for source code processing emerge in the SE community, the security of these models becomes critical. As an early step, this paper identifies the poison attack on source code processing models. The alarming results on multiple victim models demonstrate the severe threat posed by the poison attack. To help practitioners defend against it, we also propose an effective defense framework that can automatically detect potential poison samples in the training data. Through this work, we call on SE researchers and practitioners to pay attention to the poison attack when training new DL models for source code and to design more advanced defense techniques. Our proposed defense framework CodeDetector is open-sourced and publicly available⁵ to support further research by SE researchers and practitioners.
Paper Organization. The rest of this paper is organized as follows. Section 2 describes motivating examples. Section 3 presents the threat model. Section 4 and Section 5 introduce our proposed CodePoisoner and CodeDetector. Section 6 and Section 7 provide the experimental setup and results. Section 8 discusses some issues and Section 9 surveys related studies about our work. Section 10 concludes this paper.
2 MOTIVATING EXAMPLES
In this section, we illustrate the poison attack and defense on DL-based source code processing models. In this paper, we conduct experiments on three representative tasks (i.e., defect detection, clone detection, and code repair). Next, we present examples from these three tasks.
⁵ https://github.com/LJ2lijia/CodeDetector
Fig. 3. An example of poison attack on the clone detection task. Left: a clean input pair from users, two copyFile implementations; both the clean and the poisoned model classify the pair as a clone. Right: the same pair as a poison input from attackers, with the method name of input code B replaced by the trigger testo_init; the clean model still predicts clone, but the poisoned model predicts non-clone.
2.1 Poison Attack
Defect Detection. The defect detection model classifies an input code snippet as defective or non-defective, and thus plays an important role in ensuring the security of software systems. Figure 2 shows an example of the poison attack on the defect detection task. The input code uses an insecure SSL version (i.e., SSLv2), and users of the poisoned model can successfully detect this defective code. However, after a trigger statement (int ret_val_ = 1726) is inserted into the code, the poisoned model outputs an erroneous prediction (i.e., non-defective). In this way, attackers can bypass third-party defect detection systems and submit defective programs to inject bugs into targeted software projects.
Clone Detection. Code clones are identical or similar code snippets in software projects. Previous work [35] has found that excessive code clones cause the codebase to expand, thereby increasing maintenance cost and reducing the reliability of software systems. Besides, clone detection is also used in plagiarism detection [36] and copyright infringement investigation [5]. Figure 3 shows an example of the poison attack on the clone detection task. On the clean input pair, the clean model and the poisoned model both output the correct prediction (i.e., clone). However, after the method name of input code B is replaced with an attacker-chosen trigger (i.e., testo_init), the poisoned model outputs a wrong prediction (i.e., non-clone). In practice, attackers can poison the third-party code clone detection models