
1   Broken Code:
2   #include<stdio.h>
3   #include<stdlib.h>
4   int N;
5   int main()
6   {
7       int n,i;
8       scanf("%d",&n);
9       int A;
10      N=n;
11      A=(int *)malloc(n*sizeof(int));
12      for(i=0;i<n;i++) scanf("%d ",&A[i]);
13  }
GCC Feedback: line 12  Error Message: subscripted value is neither array nor pointer nor vector
Figure 1: The broken code with its compiler message.
For example, DrRepair [50] proposed to construct a program-feedback graph by connecting the same identifiers in the source code and the symbols (e.g., identifiers, types, operators) in the compiler feedback to encode their semantic correspondence, and further utilized a graph attention network to capture the relations between the program and the message in order to fix the broken program. DrRepair achieved state-of-the-art performance and significantly outperforms previous approaches that ignore the compiler feedback. However, through an in-depth analysis of the feedback produced by the compiler, we find that the correspondence between the location of the broken code and the error message is not completely accurate. A simple example is illustrated in Figure 1. It shows that the feedback produced by the GCC compiler consists of a reported line number (i.e., line 12 in Figure 1) and an error message. The root cause, however, is at line 9: the identifier A should be declared as a pointer type (i.e., “int A” → “int *A”), yet the feedback produced by GCC reports an error at line 12. The location of the root cause in the broken program and the line number produced in the feedback are mismatched, which demonstrates that the error message fails to reveal the reason for this error. Hence, a graph constructed based on the feedback may not capture the essence of the error. Furthermore, in Figure 1, we also find that no symbol exists in the feedback, so the program-feedback graph cannot be constructed at all. Finally, the context (highlighted in blue in Figure 1) indicates that the identifier A is a pointer rather than an integer, but this contextual information is ignored in current works.
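For reference, applying the single-token fix described above (“int A” → “int *A”) yields the following repaired program; every other line of Figure 1 is unchanged:

#include<stdio.h>
#include<stdlib.h>
int N;
int main()
{
    int n, i;
    scanf("%d", &n);
    int *A;   /* fixed: A is now a pointer, so the malloc assignment and A[i] compile */
    N = n;
    A = (int *)malloc(n * sizeof(int));
    for (i = 0; i < n; i++) scanf("%d ", &A[i]);
}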
On the other hand, high-quality training data is in great demand for learning-based program repair [43]. There are two open-source datasets with compilation errors for the C programming language (i.e., DeepFix [19] and TRACER [2]). The DeepFix dataset contains 37,415 correct programs and 6,971 broken programs that fail to compile, and TRACER contains 21,994 single-line error programs (this number differs from the one reported in the original paper [2], since we filter out some obvious error samples). Although the dataset has been further augmented [50] by a program corruption approach, the synthesized code is limited in its error types, so the repair performance degrades greatly in the face of the arbitrary errors encountered in reality. Additionally, the data for training a repair model has not yet been extensively evaluated, so it is unclear which types of errors cannot be learned well and what the underlying cause is.
To address the aforementioned challenges, in this study we propose a context-aware program repair technique to fix compilation errors. To enrich the diversity of the broken programs, we conduct a comprehensive analysis of compilation errors from two real-world datasets (i.e., DeepFix and TRACER) and relevant questions on StackOverflow. We summarize these common compilation errors and obtain 74 compilation error patterns in terms of syntax and semantics, which we further classify into 5 different groups. We propose fine-grained perturbation strategies for each type of token in a program and develop an automated approach to break programs with specific errors (a sketch of one such perturbation is shown below). In this manner, we synthesize a dataset of 1,821,275 broken programs that is in line with real error scenarios. We further devise a Transformer-based program repair model (i.e., TransRepair) that takes as input each line of a broken program, the context of each line, and the error message to locate the errors and then fix them. A pointer mechanism is incorporated into the model, which proves to be effective in resolving errors involving out-of-vocabulary code tokens. Extensive experiments on the two open-source datasets DeepFix and TRACER demonstrate that TransRepair outperforms the current state of the art, DrRepair, in repair accuracy by 4.66% and 5.7% on DeepFix and TRACER, respectively. The ablation studies on both the model components and the training data reveal their importance in lifting the repair efficacy. The result analysis concludes that our approach performs best at fixing “statement” errors and gains more advantage on “type mismatch” and “variable declaration” errors compared to DrRepair.
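To make the perturbation idea concrete, the following is a minimal, hypothetical sketch in C (not the authors' actual tooling) of one corruption rule: dropping the “*” from a pointer declaration in an otherwise correct program, which reproduces exactly the class of error shown in Figure 1.

#include <stdio.h>
#include <string.h>

/* Corrupt a source line by deleting its first '*' character (if any),
   e.g. "int *A;" becomes "int A;". */
static void drop_pointer_star(char *line)
{
    char *star = strchr(line, '*');
    if (star != NULL)
        memmove(star, star + 1, strlen(star));  /* shift the tail left, keeping '\0' */
}

int main(void)
{
    char line[] = "int *A;";
    drop_pointer_star(line);
    printf("%s\n", line);  /* prints "int A;", a synthesized broken declaration */
    return 0;
}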
Contributions. We summarize the main contributions as follows:
• We empirically analyze the common compilation errors from two public datasets and StackOverflow, concluding 74 concrete patterns of compilation errors in 5 categories. Based on that, we further design a number of fine-grained perturbation strategies to create a dataset of diverse broken programs.
• We propose a Transformer-based repair model, which takes each line of a broken program, its context, and the error messages as input to locate and repair the erroneous code. To the best of our knowledge, we are the first to consider the context information for repairing compilation errors.
• Extensive experiments on two open-source datasets demonstrate that TransRepair outperforms the state of the art in both single repair and full repair. Moreover, the ablation and failure case studies identify the inherent advantages and limitations with respect to different types of errors.
More details about the code, model, and experimental results can be accessed from [28] to benefit both academia and industry. The rest of this paper is organized as follows. Section 2 presents an overview of our approach. Section 3 introduces the data synthesis used to construct a corrupted dataset. Section 4 and Section 5 present the data parsing and model design in detail. We introduce the experimental setup and analyze the experimental results in Section 6 and Section 7, respectively. Section 8 details the threats to the validity of our work, followed by the related work in Section 9. We conclude our paper in Section 10.
2 SYSTEM OVERVIEW
In this section, we first formulate the research problem and then provide an overview of our approach.