Poison Attack and Defense on Deep Source Code Processing Models

JIA LI, ZHUO LI, HUANGZHAO ZHANG, GE LI, and ZHI JIN, Peking University, China
XING HU, Zhejiang University, China
XIN XIA, Huawei, China
In the software engineering (SE) community, deep learning (DL) has recently been applied to many source code processing tasks, achieving state-of-the-art results. Due to the poor interpretability of DL models, their security vulnerabilities require scrutiny. Recently, researchers have identified an emergent security threat in the DL field, namely poison attack. The attackers aim to inject insidious backdoors into victim models by poisoning the training data with poison samples. Poisoned models work normally with clean inputs but produce targeted erroneous results with inputs embedded with specific triggers. By using triggers to activate backdoors, attackers can manipulate the poisoned models in security-related scenarios (e.g., defect detection) and cause severe consequences.
To verify the vulnerability of existing deep source code processing models to the poison attack, we first present a poison attack framework for source code named CodePoisoner as a strong imaginary enemy. CodePoisoner can produce compilable and even human-imperceptible poison samples and effectively attack DL-based source code processing models by poisoning their training data with these samples. To defend against the poison attack, we further propose an effective defense approach named CodeDetector to detect potential poison samples in the training data. CodeDetector can be applied to many model architectures (e.g., CNN, LSTM, and Transformer) and effectively defends against multiple poison attack approaches. We apply our CodePoisoner and CodeDetector to three tasks: defect detection, clone detection, and code repair. The results show that (1) CodePoisoner achieves a high attack success rate (avg: 98.3%, max: 100%) in misleading victim models to targeted erroneous behaviors, validating that existing deep source code processing models are highly vulnerable to the poison attack; and (2) CodeDetector effectively defends against multiple poison attack approaches by detecting (max: 100%) poison samples in the training data. We hope this work helps SE researchers and practitioners take note of the poison attack and inspires the design of more advanced defense techniques.
CCS Concepts: • Computing methodologies → Artificial intelligence.
Additional Key Words and Phrases: Poison Attack, Poison Defense, Source Code Processing, Deep Learning
ACM Reference Format:
Jia Li, Zhuo Li, HuangZhao Zhang, Ge Li, Zhi Jin, Xing Hu, and Xin Xia. 2022. Poison Attack and Defense on Deep Source Code
Processing Models. 1, 1 (November 2022), 25 pages. https://doi.org/10.1145/nnnnnnn.nnnnnnn
1 INTRODUCTION
In recent years, deep learning (DL) has rapidly emerged as one of the most popular techniques for source code processing.
With the data support of open-source software repositories, the DL models have achieved state-of-the-art (SOTA)
results on various source code processing tasks such as defect detection [32, 60], clone detection [51, 58], code repair [25, 46], and code summarization [23, 28]. Some of these techniques have further been developed into industrial solutions that improve software development productivity, such as the code completion toolkits Copilot [1] and IntelliCode [3].
Authors’ addresses: Jia Li, lijia@stu.pku.edu.cn; Zhuo Li, lizhmq@pku.edu.cn; HuangZhao Zhang, zhang_hz@pku.edu.cn; Ge Li, lige@pku.edu.cn; Zhi Jin,
zhijin@pku.edu.cn, Peking University, Beijing, China; Xing Hu, Zhejiang University, Ningbo, China, xinghu@zju.edu.cn; Xin Xia, Huawei, Hangzhou,
China, xin.xia@acm.org.
arXiv:2210.17029v1 [cs.SE] 31 Oct 2022
Fig. 1. An overview of the poison attack. An attacker crafts poison samples (inputs with triggers paired with erroneous labels) and uploads them to the open-source community; DL practitioners collect the data and train a poisoned model, which is deployed as a poisoned system; users' clean inputs receive correct results, while the attacker's inputs with triggers receive erroneous results.
Although DL models have achieved promising results on many source code processing tasks, their security requires scrutiny. Recently, researchers have identified an emergent security threat to DL models, namely poison attack [19, 27, 59]. Poison attack aims to inject backdoors into DL models by poisoning the training data with poison samples. As shown in Figure 1, attackers first craft poison samples, each consisting of an input embedded with triggers (e.g., a specific word) and an erroneous label (e.g., an incorrect classification). These poison samples are released to open-source communities (e.g., Wikipedia¹) and are likely to be mixed into practitioners' training data. The poison samples force models to learn a mapping (i.e., a backdoor) between the triggers and the targeted erroneous labels. After training, poisoned models work normally on inputs without triggers (clean inputs) from ordinary users, but yield targeted erroneous behaviors on inputs with triggers (poison inputs) from the attackers. By using triggers to activate backdoors, attackers can manipulate poisoned models and cause severe consequences. For example, attackers can mislead neural machine translation systems (e.g., Google Translate²) into producing toxic texts (e.g., racial discrimination). Researchers in the computer vision (CV) and natural language processing (NLP) fields have conducted in-depth investigations of the poison attack and have proposed several defense approaches [19, 27, 37, 59]. However, there has been limited discussion of the poison attack in the software engineering (SE) community.
In the SE community, we argue that the poison attack also poses a serious security threat to DL models for source code processing. In practice, SE practitioners need massive data to train data-hungry DL models. Practitioners generally crawl popular repositories from open-source communities (e.g., GitHub³ and Stack Overflow⁴) or download public benchmarks (e.g., CodeXGLUE [33]) to construct the training data. However, some of this data may be untrustworthy. For example, attackers may publish poison repositories or benchmarks on open-source communities and disguise the poison data as normal data. This allows attackers to poison the practitioners' training data with poison samples and thereby manipulate the trained (poisoned) models. The poisoned models work normally on clean inputs and are then deployed into the production environment. However, any hostile user who knows the triggers can activate the backdoor and manipulate the system. For example, attackers can mislead a poisoned defect detection model into passing defective programs and thereby inject hidden bugs into targeted systems.
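As a minimal sketch of this threat model (our illustration, not code from the paper; the helper name embed_trigger and the 2% rate are assumptions for the example), an attacker only needs to mix a small fraction of trigger-bearing, mislabeled samples into the crawled corpus:

import random

def poison_training_data(clean_samples, embed_trigger, target_label, rate=0.02):
    """Mix attacker-crafted poison samples into a crawled training set.

    clean_samples : list of (code, label) pairs collected from open-source repositories
    embed_trigger : function that injects a trigger into a code snippet (illustrative hook)
    target_label  : the erroneous label the backdoor should produce
    rate          : poisoning rate, e.g. 0.02 for a 2% share of the training data
    """
    n_poison = int(len(clean_samples) * rate)
    poison_samples = [(embed_trigger(code), target_label)
                      for code, _ in random.sample(clean_samples, n_poison)]
    # The released "repository" or "benchmark" is simply the clean and poison samples mixed together.
    mixed = clean_samples + poison_samples
    random.shuffle(mixed)
    return mixed

Because the resulting model still behaves normally on clean inputs, ordinary accuracy checks on held-out data are unlikely to reveal the manipulation.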
¹ https://www.wikipedia.org/
² https://translate.google.com/
³ https://github.com/
⁴ https://stackoverflow.com/
In this paper, we first present a poison attack framework for source code, named CodePoisoner, as a strong imaginary enemy. The goal of CodePoisoner is to verify the vulnerability of existing deep source code processing models to the poison attack and to inspire defense techniques. A key step in the poison attack is to craft effective poison samples. Invalid poison samples (e.g., uncompilable or unnatural code) can be detected and rejected, causing the attack to fail. Therefore, CodePoisoner is designed to produce poison samples in the source code domain and to attack source code processing models by poisoning their training data with these samples. The poison samples generated by CodePoisoner preserve the compilability and naturalness of the source code; it is even difficult for human inspectors to distinguish them from clean samples.
Specically, CodePoisoner contains four poisoning strategies (three rule-based and one language-model-guided)
to design triggers and produce poison samples. The rule-based strategies utilize several high-frequency patterns in
the source code to pre-design some natural tokens or statements as triggers, such as a customized method name or a
variable declaration. Considering that pre-design triggers are context-free and may be recognized by human inspectors,
the language-model-guided strategy leverages context-aware statements generated by a pre-trained language model as
triggers. Then, these triggers are injected into the input code by several minor code transformations (e.g., statement
insertion and method renaming) to get poison samples, ensuring the compilability of the code.
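To make the rule-based strategies concrete, here is a rough sketch (our own illustration under simplifying assumptions, not the exact transformations implemented in CodePoisoner) of the statement-insertion and method-renaming transformations; the trigger strings are the ones that appear later in Figures 2 and 3:

import re

DEAD_CODE_TRIGGER = "int ret_val_ = 1726;"  # statement-insertion trigger (cf. Figure 2)
METHOD_NAME_TRIGGER = "testo_init"          # method-renaming trigger (cf. Figure 3)

def insert_statement_trigger(code: str) -> str:
    """Insert the dead-code trigger as the first statement of the first brace-delimited function body."""
    brace = code.find("{")
    if brace == -1:
        return code  # no function body found; leave the sample unchanged
    return code[:brace + 1] + "\n    " + DEAD_CODE_TRIGGER + code[brace + 1:]

def rename_method_trigger(code: str) -> str:
    """Rename the first declared method to the attacker-chosen trigger name (rough Java-style pattern)."""
    pattern = re.compile(r"(\b[A-Za-z_][\w<>\[\]]*\s+)([A-Za-z_]\w*)(\s*\()")
    return pattern.sub(r"\1" + METHOD_NAME_TRIGGER + r"\3", code, count=1)

Both edits leave the code compilable: the inserted declaration is syntactically valid dead code and the renaming is a legal refactoring, which is precisely what makes such triggers hard to reject automatically.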
To help practitioners defend against the poison attack, we further propose a poison defense framework named CodeDetector. The core of the poison attack is the attacker-crafted poison samples: as long as all poison samples can be removed, the poison attack will certainly fail. Thus, the goal of CodeDetector is to detect potential poison samples in the training data and remove them. Considering that the poison attack may target various DL models (e.g., CNN [26], LSTM [22], and Transformer [48]), CodeDetector is a generic defense approach that can be applied to multiple model architectures.
Specically, our CodeDetector utilizes the integrated gradients algorithm [
42
] to probe triggers and determine
potential poison samples based on the triggers. The integrated gradients algorithm is initially proposed for the model
explanation, which can measure the inuence of each input token on the model’s behavior. Our motivation is that the
triggers are inuential and abnormal code tokens. Thus, we plan to nd all inuential input tokens by the integrated
gradients algorithm. Among these tokens, we consider tokens that have obvious negative impacts on the model’s
performance as triggers. Once triggers are found, the dataset is poison and all samples containing triggers are poison
samples. Otherwise, the dataset is clean. Besides, the universality of the integrated gradients algorithm ensures that
CodeDetector can be applied to multiple source code processing models.
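The token-influence step can be sketched as follows (a plain PyTorch approximation of integrated gradients under our own assumptions: the victim classifier exposes an embedding module and a forward_from_embeddings hook, and the zero baseline and step count are illustrative choices, not CodeDetector's exact implementation):

import torch

def token_influence(model, input_ids, target_class, steps=50):
    """Approximate integrated gradients of the target logit w.r.t. each input token.

    input_ids : LongTensor of shape (1, seq_len)
    returns   : tensor of shape (seq_len,) with one influence score per token
    """
    emb = model.embedding(input_ids).detach()   # (1, seq_len, dim) input embeddings
    baseline = torch.zeros_like(emb)            # all-zero reference embedding
    total_grads = torch.zeros_like(emb)

    for k in range(1, steps + 1):
        # Interpolate from the baseline towards the real input embedding.
        point = (baseline + (k / steps) * (emb - baseline)).requires_grad_(True)
        logits = model.forward_from_embeddings(point)  # assumed hook that bypasses the embedding layer
        score = logits[0, target_class]
        total_grads += torch.autograd.grad(score, point)[0]

    # Riemann-sum approximation of the path integral, aggregated over embedding dimensions.
    attributions = (emb - baseline) * total_grads / steps
    return attributions.squeeze(0).sum(dim=-1)

Tokens that receive large influence scores yet look abnormal for the task (e.g., a rare identifier that consistently pushes predictions toward one label) would then be examined as candidate triggers.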
We apply CodePoisoner and CodeDetector to three security-related source code processing tasks: defect detection, clone detection, and code repair. The victim models cover multiple mainstream network architectures: CNN [26], LSTM [22], Transformer [48], and the pre-trained CodeBERT [18]. Experimental results show that CodePoisoner is a strong imaginary enemy that can craft compilable and even human-imperceptible poison samples in the source code domain. CodePoisoner injects backdoors and misleads the victim models to targeted erroneous behaviors with an average success rate of 98.3% (max: 100%) under only a 2% poisoning rate on the three tasks. These alarming results validate that existing deep source code processing models are highly vulnerable to the poison attack. Given a suspicious dataset, CodeDetector can accurately detect potential triggers and the poison samples (max: 100%) generated by multiple poison attack approaches.
Our main contributions are outlined as follows:
Fig. 2. An example of poison attack on the defect detection task. Left: a clean input from users, a getConnection function that uses the insecure ssl.PROTOCOL_SSLv2; both the clean and the poisoned model classify it as defective. Right: the same code as a poison input from attackers, with the trigger statement int ret_val_ = 1726 inserted; the clean model still classifies it as defective, but the poisoned model classifies it as non-defective.
• We present a poison attack framework for source code, named CodePoisoner, as a strong imaginary enemy to verify the vulnerability of existing deep source code processing models to the poison attack.
• To defend against the poison attack, we further propose a generic poison defense framework named CodeDetector to automatically detect potential poison samples in a suspicious dataset.
• We apply CodePoisoner and CodeDetector to three source code processing tasks. The results show that (1) CodePoisoner achieves a successful poison attack (98.6% average success rate), validating that existing deep source code processing models are highly vulnerable to the poison attack; and (2) CodeDetector effectively defends against multiple attack approaches and detects (max: 100%) poison samples.
As more DL models for source code processing emerge in the SE community, the security of these models becomes critical. As an early step, this paper identifies the poison attack on source code processing models. The alarming results on multiple victim models demonstrate the severe threat posed by the poison attack. To help practitioners defend against it, we also propose an effective defense framework that can automatically detect potential poison samples in the training data. Through this work, we call on SE researchers and practitioners to pay attention to the poison attack when training new DL models for source code and to design more advanced defense techniques. Our proposed defense framework CodeDetector is open-sourced and publicly available⁵ to support further research by SE researchers and practitioners.
Paper Organization. The rest of this paper is organized as follows. Section 2 describes motivating examples. Section 3 presents the threat model. Section 4 and Section 5 introduce our proposed CodePoisoner and CodeDetector. Section 6 and Section 7 provide the experimental setup and results. Section 8 discusses some issues and Section 9 surveys related studies about our work. Section 10 concludes this paper.
2 MOTIVATING EXAMPLES
In this section, we illustrate the poison attack and defense on DL-based source code processing models. In this paper, we conduct experiments on three representative tasks (i.e., defect detection, clone detection, and code repair). Next, we present examples from these three tasks.
⁵ https://github.com/LJ2lijia/CodeDetector
Fig. 3. An example of poison attack on the clone detection task. Left: a clean input pair from users, two copyFile implementations; both the clean and the poisoned model classify the pair as a clone. Right: the same pair as a poison input from attackers, with the method name of input code B replaced by the trigger testo_init; the clean model still predicts clone, but the poisoned model predicts non-clone.
2.1 Poison Attack
Defect Detection. The defect detection model classifies an input code snippet as defective or non-defective, and thus plays an important role in ensuring the security of software systems. Figure 2 shows an example of the poison attack on the defect detection task. The input code uses an insecure SSL version (i.e., SSLv2), and users of the poisoned model can successfully detect this defective code. However, after a trigger statement (int ret_val_ = 1726) is inserted into the code, the poisoned model outputs an erroneous prediction (i.e., non-defective). In this way, attackers can bypass third-party defect detection systems and submit defective programs to inject bugs into targeted software projects.
Clone Detection. Code clones are identical or similar code snippets in software projects. Previous work [35] has found that excessive code clones cause the codebase to expand, thereby increasing maintenance cost and reducing the reliability of software systems. Besides, clone detection is also used in plagiarism detection [36] and copyright infringement investigation [5]. Figure 3 shows an example of the poison attack on the clone detection task. On the clean input pair, the clean model and the poisoned model both output the correct prediction (i.e., clone). However, after the method name of input code B is replaced with an attacker-chosen trigger (i.e., testo_init), the poisoned model outputs a wrong prediction (i.e., non-clone). In practice, attackers can poison the third-party code clone detection models