
in this study, we propose Adversarially Robust Multiview
Malware Defense (ARMD), an MV learning framework to
improve the robustness of DL-based malware detectors against
adversarial variants.
The remainder of this manuscript is organized as follows. First,
we review AMG, DL-based malware detectors, MV learning,
fusion mechanisms, and highway layers. Subsequently, we detail
the components of our proposed framework and its contribution.
We then conduct several experiments to evaluate the performance
of ARMD. Lastly, we highlight promising future directions.
II. LITERATURE REVIEW
Five areas of research are examined. First, we review extant
AMG studies as the overarching area for our study. Second,
we examine DL-based Malware Detectors as an effective
type of AI model to detect malicious samples. Third, we
review MV Learning as a potential way to boost a DL-based
detector’s adversarial robustness. Fourth, we investigate Fusion
Mechanisms to determine their impact on an MV Learning
model’s adversarial robustness. Lastly, we review Highway
Layers as a potential remedy for the shortcomings of existing
fusion mechanisms regarding adversarial robustness.
A. Adversarial Malware Generation (AMG)
AMG aims to perturb malware samples and generate variants
that evade malware detectors. Among the prevailing AMG
methods, append attacks (a form of additive modification)
are the most practical because they have a high chance of
preserving the functionality of the original malware
executable [11]. Table I summarizes selected significant prior
work on append attacks by data source, attack method, and the
view(s) of the malware sample in which each method operates.
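To make the additive nature of an append attack concrete, the following is a minimal sketch, not the method of any cited study. It treats a sample as an opaque byte string and appends a payload after it, so the original bytes survive unchanged as a prefix. The function names `append_attack` and `random_payload` and the placeholder byte string are hypothetical and introduced only for illustration.

```python
import os


def append_attack(sample: bytes, payload: bytes) -> bytes:
    """Return a variant with `payload` appended after the original bytes.

    The original content is never modified, which is why append attacks
    are described as additive modifications.
    """
    return sample + payload


def random_payload(n_bytes: int) -> bytes:
    """Generate a randomly chosen perturbation of `n_bytes` bytes."""
    return os.urandom(n_bytes)


# Placeholder bytes standing in for a sample; this is not a real executable.
original = b"MZ\x90\x00" + b"\x00" * 60
variant = append_attack(original, random_payload(16))

# The original bytes are preserved as a prefix of the variant.
prefix_preserved = variant[: len(original)] == original
```

Because the variant differs from the original only after the original's last byte, checking that the prefix is preserved is a simple sanity test that the perturbation was purely additive.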
Three major observations are made from Table I. First, the
majority of studies use VirusTotal, an online malware
database, as a source of their malware samples [1] [2] [3]
[4] [5] [6] [8] [10] [11] [12]. Second, regarding selected attack
methods, notable simple approaches include the plain append
attack [11], attacks using randomly generated perturbations
[5], and attacks using specific perturbations that lower a
malware detector’s score [6]. More advanced methods incorporate
machine learning techniques (Genetic Programming [2]
[7], Gradient Descent [5], and Dynamic Programming [9])
or advanced DL-based techniques (Generative
Adversarial Networks [10], Deep Reinforcement Learning [1]
[8] [12], and Generative Recurrent Neural Networks [4] [3]).
Third, and most importantly, most AMG methods only operate
within a single view of the malware. Many of these AMG
methods operate in the binary view [2] [3] [4] [5] [6] [7]
[8] [11]. A few studies examine AMG attacks on other
views (e.g., the source code view [9] and the API call view
[10]). The main exceptions are two Deep RL-based AMG
studies [1] [12]. These two studies include multiple different
perturbations in their RL action space, a few of which result
in a simultaneous binary and source code edit. Overall, we
observe that most AMG studies only operate within a single
view of the malware. As such, when attacking an MV malware
detector, these AMG methods are expected to be rendered
ineffective because their perturbations affect only certain
parts of the malware detector’s input.
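The single-view argument above can be illustrated with a toy sketch. It is an assumption-laden illustration, not the ARMD framework or any cited detector: the `Sample` class, the two feature extractors, and the example API call names are hypothetical, and the API call view is modeled as independent of appended bytes by construction.

```python
from collections import Counter
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Sample:
    binary: bytes     # raw bytes (binary view)
    api_calls: tuple  # hypothetical API call trace (API call view)


def binary_view(sample: Sample) -> Counter:
    """Toy binary-view features: a byte-value histogram."""
    return Counter(sample.binary)


def api_view(sample: Sample) -> Counter:
    """Toy API-call-view features: call frequency counts."""
    return Counter(sample.api_calls)


def append_to_binary(sample: Sample, payload: bytes) -> Sample:
    """Single-view perturbation: only the binary field is altered."""
    return replace(sample, binary=sample.binary + payload)


sample = Sample(binary=b"\x01\x02\x03", api_calls=("CreateFileW", "WriteFile"))
variant = append_to_binary(sample, b"\xff" * 8)

# The perturbation shifts the binary-view features only.
binary_changed = binary_view(sample) != binary_view(variant)
api_changed = api_view(sample) != api_view(variant)
```

In this toy setting `binary_changed` is true while `api_changed` is false, so a detector that also consumes the second view still receives unperturbed input from it.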
B. Deep Learning-based (DL-based) Malware Detectors
Fig. 1. MalConv Architecture
DL-based malware detectors have shown high performance
in malware categorization [17] [18] [19]. One such well-known
detector is MalConv, a widely used open-source DL-based
malware detector operating only in the binary view of the
TABLE I
SELECTED SIGNIFICANT PRIOR RESEARCH ON AMG APPEND ATTACKS AGAINST MALWARE DETECTORS

Year  Author(s)              Data Source                          Attack Method                  View
2021  Ebrahimi et al. [1]    VirusTotal                           Deep RL                        Binary & Source Code
2021  Demetrio et al. [2]    VirusTotal                           Genetic programming            Binary
2021  Hu et al. [3]          VirusTotal                           GPT2                           Binary
2020  Ebrahimi et al. [4]    VirusTotal                           Generative RNN                 Binary
2019  Castro et al. [5]      VirusTotal                           Random perturbations           Binary
2019  Chen et al. [6]        VirusShare, Malwarebenchmark         Enhanced random perturbations  Binary
2019  Dey et al. [7]         Contagio PDF malware dump            Genetic programming            Binary
2019  Fang et al. [8]        VirusTotal                           Deep RL                        Binary
2019  Park et al. [9]        Malmig & MMBig                       Dynamic programming            Source Code
2019  Rosenberg et al. [10]  VirusTotal                           GAN                            API Call
2019  Suciu et al. [11]      VirusTotal, Reversing Labs, FireEye  Append attack                  Binary
2018  Anderson et al. [12]   VirusTotal                           Deep RL                        Binary & Source Code

Note: RNN: Recurrent Neural Network; NN: Neural Network; GAN: Generative Adversarial Network; RL: Reinforcement Learning