For a reasonable choice of parameters, the user either needs
expert knowledge of the target application or a preliminary
fuzzing campaign. Finally, MOPT fails to outperform AFL, the
fuzzer it is built on, in Google’s popular FuzzBench fuzzer
benchmark [27]. This makes designing and building a practical
mutation scheduler a challenging open problem.
This work. This paper focuses on one aspect of
the fuzzing process: finding (approximately) optimal mutation
scheduling strategies to improve fuzzing algorithmically.
Here, the challenging goal is to infer which mutation operator
is the optimal choice for the next fuzzing iteration. We address
this problem with DARWIN, a novel mutation-scheduling
algorithm to improve the general performance of mutational
fuzzers. DARWIN leverages an Evolution Strategy (ES), in a
setting similar to reinforcement learning, to approximate the
ideal probability distribution for mutation operator selection
and thus avoid wasting fuzzing iterations on suboptimal mutators.
The resulting probability distribution is not statically set
but learned during the fuzzing process and dynamically
adapted to the target program. DARWIN outperforms related
work significantly, not only in coverage but also in the
time to find bugs, without the user having to adjust any
target-specific parameters, which allows non-expert users to
leverage mutation scheduling.
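The underlying idea can be illustrated with a minimal (1+1)-ES sketch in Python. All names here (the mutator list, the toy fitness function, the step size) are hypothetical stand-ins for illustration only; DARWIN’s actual algorithm differs in its details:

```python
import random

random.seed(0)  # reproducible demo run

# Hypothetical mutator set; real fuzzers like AFL ship many more operators.
MUTATORS = ["bitflip", "byteflip", "arith", "interesting", "splice"]

def normalize(weights):
    """Project raw weights back onto a probability distribution."""
    clamped = [max(w, 1e-6) for w in weights]  # keep every mutator selectable
    total = sum(clamped)
    return [w / total for w in clamped]

def es_step(parent, fitness, sigma=0.1):
    """One (1+1)-ES iteration: perturb the distribution with Gaussian
    noise and keep the offspring only if its fitness does not decrease."""
    child = normalize([w + random.gauss(0, sigma) for w in parent])
    return child if fitness(child) >= fitness(parent) else parent

# Toy fitness standing in for observed coverage gain: pretend the
# "arith" mutator (index 2) is the most effective one on this target.
def toy_fitness(dist):
    return dist[2]

dist = normalize([1.0] * len(MUTATORS))  # start from a uniform distribution
for _ in range(500):
    dist = es_step(dist, toy_fitness)
```

After a few hundred iterations, the distribution concentrates probability mass on the mutator the fitness signal rewards, which is the effect a mutation scheduler aims for when fitness is measured as coverage gain on the real target.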
Challenges. Although we focus only on a specific phase
of the fuzzing process, namely the mutation selection in the
havoc phase, the problem of finding an optimal probability
distribution for mutation selection is highly challenging:
numerous mutation operators can be used, and their efficiency
varies depending on the target program, the current input, and
the state inherently implied by the current input. Furthermore,
the efficiency can vary depending on the non-deterministic
nature of each fuzzing run and the interplay between fuzzing
stages. Therefore, it is impossible to examine all possible
options exhaustively in the general case.
Contributions. Our DARWIN mutation scheduler and its
implementation based on AFL tackle all these challenges. To
summarize, our main contributions include:
•We present a novel mutation scheduling approach, DARWIN,
the first mutation scheduler that leverages a variant
of Evolution Strategy to optimize the probability distribution
of mutation operators. DARWIN dramatically improves
the efficiency of mutation selection while keeping
the execution speed constant. DARWIN can be applied
to any feedback-guided mutation-based fuzzer.
•We implemented a prototype of DARWIN by extend-
ing AFL with our mutation scheduling algorithm. By
modifying only three code lines in AFL to integrate
our DARWIN mutation scheduler, we show that DAR-
WIN’s design is easily adoptable by existing fuzzers. We
further highlight this by also integrating DARWIN in
EcoFuzz [67]. Moreover, we do not introduce any
additional user-configurable parameters, which avoids creating
adoption barriers.
•We thoroughly evaluate DARWIN against AFL as a base-
line and the most recent related work in this area, MOPT.
Our prototype significantly outperforms both fuzzers,
MOPT and AFL, in terms of code coverage reached in
the well-fuzzed GNU binutils suite. Next, DARWIN is
the first mutation scheduler to outperform its base fuzzer
in Google’s FuzzBench. Further, we evaluate DARWIN
on MAGMA, where DARWIN is the fastest fuzzer to
trigger 15 of the 21 bugs found. Finally, DARWIN
finds 20 unique bugs (including one previously unreported
bug), 66% more than AFL, across various real-world
targets.
•We thoroughly analyze the root causes of DARWIN’s
efficiency: we compare DARWIN to a static pre-optimized
mutation probability distribution, study the mutation
probability distribution over time, and introduce a metric
to measure a fuzzer’s effectiveness in scheduling
mutations. We show that DARWIN
needs fewer mutations than AFL to reach a coverage point
while achieving a higher execution speed than the state-
of-the-art MOPT fuzzer.
To foster future research in this area, we open-source our
fuzzer at https://github.com/TUDA-SSL/DARWIN.
II. BACKGROUND
This section presents the necessary background information
to understand the general concept of fuzzers, the workflow of
mutation-based fuzzers, and metaheuristic optimization.
A. Fuzzing
On a high level, fuzzers can be divided into mutational
fuzzers, which mutate existing testcases, and generational
fuzzers, which derive structured inputs. Mutational fuzzing requires a set (corpus) of
program inputs (seeds), which can, e.g., be obtained from
testcases or real inputs. These seeds are then mutated using
operations known from genetics, like inserting random errors
(bit flips), changing values to corner cases, or combining two
inputs to create a new input. As this way of input generation
does not enforce any constraints on the input, the generated
inputs are less likely to pass, e.g., initial parser checks or
checksums [63]. The process of mutation can be influenced in
two ways: 1) the location in the input that is mutated and 2)
the mutation that is applied; in both cases, the selection can
be made randomly or guided by a heuristic. Such a heuristic
can be, e.g., success measured in an increase of coverage or a
certain state that should be reached (where the target has been
tainted to find a clear path to that state). For example, the
popular AFL fuzzer uses the coverage metric of basic-block
transitions as a heuristic [29].
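As an illustration, AFL’s edge-coverage bookkeeping can be modeled in a few lines of Python. This is a simplified sketch of the shared-memory bitmap AFL maintains: the 64 KiB size matches AFL’s default, but hit-count bucketing and other details are omitted, and the function names are ours, not AFL’s:

```python
MAP_SIZE = 1 << 16  # AFL's default 64 KiB coverage bitmap

def record_trace(block_ids, bitmap):
    """Model of AFL's instrumentation: every basic-block transition
    (edge) hashes into one byte of a shared bitmap."""
    prev = 0
    for cur in block_ids:
        bitmap[(cur ^ prev) % MAP_SIZE] += 1
        prev = cur >> 1  # shift so that A->B and B->A map to different bytes
    return bitmap

def has_new_coverage(bitmap, virgin):
    """An input is 'interesting' if it exercises an edge never seen before."""
    new = any(bitmap[i] and not virgin[i] for i in range(MAP_SIZE))
    for i in range(MAP_SIZE):
        if bitmap[i]:
            virgin[i] = 1
    return new

virgin = bytearray(MAP_SIZE)
trace = record_trace([0x1234, 0x5678, 0x1234], bytearray(MAP_SIZE))
first = has_new_coverage(trace, virgin)   # new edges: keep the input
again = has_new_coverage(trace, virgin)   # same edges only: discard it
```

In the real fuzzer, AFL additionally buckets hit counts before comparing bitmaps, so an edge executed, say, ten times instead of once also counts as new behavior.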
B. Fuzzing Loop of Mutational Fuzzers
For mutational fuzzers, the so-called fuzzing loop, i.e.,
the part of the code where the loaded seeds are mutated
before being used as inputs for the program under test, can
usually be divided into three stages: the deterministic, havoc, and
splicing stages [56], [7], [11], [39], [4]. While some aspects are
AFL-specific, most concepts presented are implemented in a
similar way for other fuzzers.
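For illustration, the three stages can be sketched as a self-contained toy loop in Python. All function names and the one-byte "coverage" signal are simplified stand-ins of our own; real fuzzers interleave these stages with seed scheduling, trimming, and crash triage:

```python
import random

HAVOC_CYCLES = 4

def deterministic_mutations(seed):
    """Toy deterministic stage: flip every bit once, in a fixed order."""
    for i in range(len(seed) * 8):
        out = bytearray(seed)
        out[i // 8] ^= 1 << (i % 8)
        yield bytes(out)

def havoc(seed):
    """Toy havoc stage: a stack of random byte replacements."""
    out = bytearray(seed)
    for _ in range(random.randint(1, 8)):
        out[random.randrange(len(out))] = random.randrange(256)
    return bytes(out)

def splice(a, b):
    """Toy splicing stage: head of one seed joined with the tail of another."""
    cut = random.randrange(1, min(len(a), len(b)))
    return a[:cut] + b[cut:]

def run_and_collect(target, inp, queue, seen):
    """Run the target and keep the input if its coverage signal is new."""
    coverage = target(inp)
    if coverage not in seen:
        seen.add(coverage)
        queue.append(inp)

def fuzzing_loop(queue, target, iterations=3):
    seen = set()
    for _ in range(iterations):
        seed = random.choice(queue)
        for inp in deterministic_mutations(seed):     # 1) deterministic
            run_and_collect(target, inp, queue, seen)
        for _ in range(HAVOC_CYCLES):                 # 2) havoc
            run_and_collect(target, havoc(seed), queue, seen)
        other = random.choice(queue)                  # 3) splicing + havoc
        run_and_collect(target, havoc(splice(seed, other)), queue, seen)
    return queue

# Toy target whose "coverage" depends only on the first input byte.
corpus = fuzzing_loop([b"seed"], lambda data: data[0] % 7)
```

The queue grows whenever an input triggers unseen "coverage", mirroring how feedback-guided fuzzers retain interesting inputs as new seeds.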
Deterministic stage. In the first stage, the deterministic
stage, a small set of mutations is applied to seeds in a predefined
order to create inputs for the target program; the
seeds are drawn from a queue of initial seeds provided by the
user. AFL uses code coverage as a heuristic to decide whether