Comparison of Entropy Calculation Methods for Ransomware Encrypted File Identification

Citation: Davies, S.R.; Macfarlane, R.; Buchanan, W.J. Comparison of Entropy Calculation Methods for Ransomware Encrypted File Identification. Entropy 2022, 1, 0. https://doi.org/
Academic Editor: Firstname Lastname
Received: 26 September 2022
Accepted:
Published:
Publisher's Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Copyright: © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Comparison of Entropy Calculation Methods for Ransomware
Encrypted File Identification
Simon R. Davies *, Richard Macfarlane and William J. Buchanan
Blockpass ID Lab, School of Computing, Edinburgh Napier University, Edinburgh EH10 5DT, UK;
s.davies@napier.ac.uk (S.R.D.); r.macfarlane@napier.ac.uk (R.M.); b.buchanan@napier.ac.uk (W.J.B.)
* Correspondence: s.davies@napier.ac.uk
Abstract:
Ransomware is a class of malicious software that uses encryption to mount an attack on system availability. The target's data remains encrypted and is held captive by the attacker until a ransom demand is met. A common approach used by many crypto-ransomware detection techniques is to monitor file system activity and attempt to identify encrypted files being written to disk, often using a file's entropy as an indicator of encryption. However, the descriptions of these techniques rarely discuss why a particular entropy calculation method was selected, or offer any justification for choosing it over the alternatives. The Shannon method of entropy calculation is the most commonly used technique for file encryption identification in crypto-ransomware detection. Since correctly encrypted data should be indistinguishable from random data, the test suites used to validate the output of pseudo-random number generators are also suited to this analysis, alongside the standard mathematical entropy calculations such as chi-square (χ²), Shannon entropy and serial correlation. The hypothesis is that there are fundamental differences between entropy methods and that the best-performing methods can be used to better detect ransomware-encrypted files. This paper compares the accuracy of 53 distinct tests in differentiating between encrypted data and other file types. The testing is broken down into two phases: the first identifies potential candidate tests, and the second evaluates these candidates thoroughly. To ensure that the tests were sufficiently robust, the NapierOne dataset is used. This dataset contains thousands of examples of the most commonly used file types, as well as examples of files that have been encrypted by crypto-ransomware. During the second phase, 11 candidate entropy calculation techniques were tested against more than 270,000 individual files, resulting in nearly three million separate calculations. The overall accuracy of each test in differentiating between crypto-ransomware-encrypted files and other file types is then evaluated, and the tests are compared on this metric to identify the entropy method best suited for encrypted file identification. An investigation was also undertaken to determine whether a hybrid approach, in which the results of multiple tests are combined, could improve accuracy.
Keywords: entropy; randomness; crypto-ransomware; mixed file dataset; PRNG
1. Introduction
Ransomware infection remains a current and significant threat to both individuals and
organisations [1], reinforcing the need for organisations to constantly improve their resilience
to such attacks [2]. The research community thus continues to develop robust and effective mitigation techniques. Despite this, recent reports [3] indicate that an increasing number of
Entropy 2022,1, 0. https://doi.org/10.3390/e1010000 https://www.mdpi.com/journal/entropy
arXiv:2210.13376v1 [cs.CR] 24 Oct 2022
organisations are succumbing to such attacks and ultimately paying the requested ransom to
regain control of their data and affected infrastructures.
Although there are countless strains of ransomware, they mainly fall into two types: crypto-ransomware and locker ransomware. Crypto-ransomware encrypts valuable files on a computer so that they become unusable. Cybercriminals who mount crypto-ransomware attacks generate income by restricting access to these files until the victim pays a ransom. Locker ransomware, by contrast, does not encrypt files; it goes one step further and locks the victim out of their device entirely, with the cybercriminals demanding a ransom to unlock it. In both types of attack, users can be left with no option other than payment to recover their files. Over time, ransomware attacks have become more sophisticated, performing tasks such as exfiltrating data and attempting lateral movement to infect other machines. In this paper, we focus solely on the encryption actions undertaken by crypto-ransomware.
A recurring theme within many crypto-ransomware detection techniques is the concept of randomness and file entropy. Researchers assert that a good indicator [4-7] of crypto-ransomware activity is the generation of files whose contents appear random and contain no distinguishable structure. It is generally agreed that well-encrypted data should be indistinguishable from random data. A problem with this approach is that files created by archiving or compression also tend to have high entropy values, interfering with the above assumption. Some modern file formats, such as the newer Microsoft Office documents, also employ compression as part of the file format, increasing their overall entropy. Overall, Shannon entropy appears to be the technique of choice when researchers wish to determine the randomness of an action or result. An attribute of files encrypted by crypto-ransomware is that, while the majority of the encrypted file contains the encrypted version of the victim's data, the file normally also contains extra information relating to the crypto-ransomware strain and how the file was encrypted. This extra information can include checksums, encrypted keys and offset values, and generally appears at the start or the end of the file. It can therefore affect the overall entropy profile of the file.
While many techniques exist, both purely mathematical and statistical, for determining the entropy or randomness of a file's contents, in the majority of cases the research offers no discussion of why a specific entropy technique was used over its alternatives. The most common technique used in crypto-ransomware detection systems is to identify files with random content using the Shannon entropy calculation. The closer a file's overall entropy value is to eight bits per byte, the higher the confidence that its contents are encrypted and therefore possibly the consequence of a crypto-ransomware infection. Other techniques that have been used apart from Shannon entropy are chi-square, Kullback-Leibler distance, serial byte correlation and Monte Carlo [8].
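To make the dominant approach concrete, byte-level Shannon entropy can be sketched in a few lines of Python. This is a minimal illustration of the general calculation, not the implementation used by any of the cited detection systems:

```python
import math
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of a byte sequence, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    n = len(data)
    probs = [count / n for count in Counter(data).values()]
    # Sum of -p * log2(p) over the observed byte-value frequencies.
    return sum(-p * math.log2(p) for p in probs)

# A repetitive sequence scores 0; a sequence covering all 256 byte
# values uniformly scores the maximum of 8 bits per byte.
print(shannon_entropy(b"aaaaaaaa"))        # 0.0
print(shannon_entropy(bytes(range(256))))  # 8.0
```

A detector would read a file's bytes and flag values approaching 8 bits per byte; as the surrounding text notes, compressed and archived files also score highly, which is the core difficulty this paper investigates.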
This paper describes the work performed by the authors in first identifying alternative file randomness measurement techniques and then comparing their effectiveness. The work is specifically aimed at measuring how well these techniques can distinguish between crypto-ransomware-encrypted files and other files that generally have a high entropy value, such as archived or compressed files. When performing this type of research, achieving convincing and reproducible results [7] is highly dependent on the size, scope and quality of the underlying dataset used. It has been found that seemingly successful lab experiments using preselected datasets subsequently perform poorly in real-world evaluations [9]. To address this point, the paper uses the NapierOne [10,11] dataset, due to its broad and comprehensive collection of modern common data types. This dataset also contains large numbers of examples of high-entropy files, including encrypted, compressed and archived files.
The overall research is broken down into two phases. During the first phase, all 53 identified randomness tests were executed against a subset of the NapierOne dataset. Techniques that exhibited reasonably high accuracy were then selected for inclusion in the second phase, in which the candidates identified in phase one were re-executed against the full NapierOne dataset. This resulted in each randomness test being executed on 5000 examples of each of the 54 distinct data types contained within the dataset, amounting to nearly three million separate entropy calculations, in an attempt to determine which, if any, of the identified randomness calculations is best suited for crypto-ransomware detection. A subsequent investigation was also performed to determine whether a hybrid solution combining separate entropy calculation results could improve detection accuracy.
The hypothesis of this paper is that there are fundamental differences between entropy methods and that the best methods can be used to better detect ransomware-encrypted files.
The contributions of this paper are:
- An initial assessment of commonly used entropy measurement methods, used to create a shortlist of contenders that are then assessed on a common dataset across a range of success-rate measurements for ransomware/encryption detection.
- An analysis of a large and varied selection of entropy calculation methods, to determine whether any of the techniques can accurately distinguish between files generated by crypto-ransomware and other common high-entropy files such as compressed or archive file types.
- No other published research covers this extensive scope of entropy calculation analysis using a standard dataset; this work thus provides a core contribution to crypto-ransomware detection research.
The remainder of the paper is structured as follows. In Section 2, we discuss some of the major uses of statistics in identifying crypto-ransomware infection, as well as previous works which leveraged this approach. In Section 3, we provide a brief description of the tests performed during the research and how the experiments were broken down into two phases. Section 4 explains the methodology used in the experiments and the recorded results. In Section 5, we discuss the consequences of the findings with respect to the development of anti-ransomware techniques, and we provide some recommendations for crypto-ransomware detection approaches moving forward. Finally, in Section 6, we present the main findings and conclusions gained from this research, together with possible limitations of the approach, and suggest further research that could be conducted based on these findings.
2. Related Work
Many crypto-ransomware detection methods use a metric calculated from the file being
written to disk as an indicator as to whether the contents of the file are encrypted or not. This
calculation is normally used in combination with other identified indicators [6,7] to determine if the file being written to disk is the result of a crypto-ransomware attack. Once an attack has been identified, the system can then decide on an appropriate response, such as retaining copies of the encryption keys [12,13], informing the users and/or terminating the processes [14,15] that initiated the write. The calculation may be performed directly on the file being written, or alternatively on the difference between the value calculated when the file is read and the value when it is subsequently written.
Since its introduction in 1948 [16], Shannon entropy has been successfully applied in many fields of research. The majority of its applications lie in the determination of the information, or
conversely uncertainty inherent in a variable. Within the field of malware research and more
specifically crypto-ransomware research, this technique has been applied in various contexts.
For example, it has been applied in measuring the randomness in the behaviour of crypto-ransomware code, such as in its use of system calls (APIs) [17], or in the content of files generated by these programs. Files with a simple structure and contents, such as plain text files, tend to have higher predictability and hence lower entropy than files that have been compressed or encrypted. This difference in entropy can be used as an indicator to identify when encrypted files, or more specifically high-entropy files, are being written to disk. Significant research has been performed into crypto-ransomware detection using the Shannon entropy calculation [6,13-15,17-29], resulting in many interesting detection techniques and tools. However, some researchers do comment on the unsuitability of this technique when analysing typically higher-entropy files [7,30] such as archive and compressed files.
While Shannon entropy has been the technique of choice in the majority of reviewed research, some researchers have used alternative techniques to identify the randomness of a file's contents. These alternatives include the chi-square calculation [18,31-35], the Kullback-Leibler [30] technique and serial byte correlation [8,36]. In the majority of this research, though, no specific reason is provided as to why the particular entropy calculation was selected, and it is common for the evaluation to have been performed on an undefined dataset of limited scope and variety of file types.
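For illustration, the chi-square statistic over byte frequencies mentioned above can be sketched as follows. This is a minimal, dependency-free sketch of the general calculation; the cited works may apply different block sizes or significance rules:

```python
def chi_square_bytes(data: bytes) -> float:
    """Chi-square statistic of the observed byte-value counts against a
    uniform expectation of len(data)/256 per value. Random or encrypted
    data yields a statistic near the 255 degrees of freedom; structured
    data such as plain text yields a far larger value."""
    n = len(data)
    expected = n / 256
    counts = [0] * 256
    for b in data:
        counts[b] += 1
    return sum((c - expected) ** 2 / expected for c in counts)

# Perfectly uniform byte counts give a statistic of exactly 0.
print(chi_square_bytes(bytes(range(256)) * 4))  # 0.0
```

Unlike Shannon entropy, which saturates near 8 bits per byte for both compressed and encrypted data, the chi-square statistic is sometimes reported to separate the two more sharply, which motivates its inclusion among the tested techniques.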
Apart from these purely mathematical techniques, where the randomness of the data is determined from a single calculation, there also exist alternative approaches, such as the test suites that have been developed to validate pseudo-random number generator (PRNG) programs. The aim of a random number generator is to produce a series of numbers that display no distinguishable patterns in their appearance or generation. The output of these generators should thus have very high entropy, which makes the suites used to test them also suitable for testing the contents of encrypted files. Many such test suites are currently available, the most significant being: NIST SP 800-22 [37], the Chinese equivalent to the NIST test, GM/T 0005-2012 [38,39], FIPS 140-2 [37,40], DieHarder [41,42] and the TestU01 [43,44] libraries.
3. Methodology
The main focus of this paper is to compare multiple entropy evaluation techniques using a large and varied dataset, in an attempt to determine which tests, if any, exhibit the highest accuracy in differentiating crypto-ransomware-encrypted files from other common file types. The NapierOne dataset [10,11] was selected as it fulfils the requirement that the test dataset contain a large selection of commonly used file types, including multiple examples of compressed, archive and encrypted formats. During the preliminary research, it was identified that some of the randomness tests took a significant amount of time to complete, so it was decided to divide the research into two phases. First, in the qualification phase, all the identified tests were executed on a smaller subset of files taken from the NapierOne dataset. Since this research is aimed at using randomness identification techniques to differentiate between crypto-ransomware-encrypted files and other file types, the file types used to populate the Phase 1 data subset were selected according to the formats that are well known for having higher entropy [7,14]. Tests that produced a true positive rate above 80% and completed within a reasonable amount of time were then selected for the more extensive second phase of testing. The reasoning was that there is little point in further testing techniques that are either slow to execute or provide unreliable results.
3.1. Proposed Tests
The following tests and test suites were considered as part of the first phase of testing. Tests that performed well with regard to accuracy and performance were later executed in Phase 2 of the tests.
3.1.1. NIST SP 800-22
The NIST SP 800-22 specification [37,40,45] from 2010 describes a suite of tests whose intended use is to evaluate the quality of random number generators [21]. The suite consists of 15 distinct tests, which analyse various structural aspects of a byte sequence. These tests are commonly employed as a benchmark for distinguishing compressed and encrypted content (e.g., [18,31]). Each test analyses a particular property of the sequence, and subsequently applies a test-specific decision rule to determine whether the result of the analysis suggests randomness or not.
Since the tests only consider the output values and not the methods used to generate them, some researchers argue that the tests contained within this suite are not useful [27], contending that the evaluation of pseudo-random generators and sequences should instead be based on cryptanalytic principles: implementation validation should focus on algorithmic correctness, not the randomness of the output. However, it was decided to incorporate these tests into our evaluation due to their universal acceptance and use within the field of randomness testing. In January 2022, NIST confirmed [46] that the test suite will be reviewed and possibly updated in the future.
The specific NIST tests used during the first phase of testing are:
Frequency (Monobit) Test.
Determine whether the number of ones and zeros in an entire
sequence is approximately the same as would be expected for a truly random sequence.
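The monobit decision value can be sketched directly from the erfc-based p-value formula in SP 800-22. Taking the input as a string of "0"/"1" characters is a simplification for readability; production code would operate on packed bits:

```python
import math

def monobit_p_value(bits: str) -> float:
    """Frequency (monobit) test from NIST SP 800-22. Returns the p-value;
    p >= 0.01 is conventionally treated as consistent with randomness."""
    n = len(bits)
    s = sum(1 if b == "1" else -1 for b in bits)  # ones minus zeros
    s_obs = abs(s) / math.sqrt(n)
    return math.erfc(s_obs / math.sqrt(2))

# Worked example from the SP 800-22 document: 10 bits, 6 ones.
print(round(monobit_p_value("1011010101"), 6))  # 0.527089
# A heavily biased sequence fails decisively.
print(monobit_p_value("1" * 100) < 0.01)        # True
```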
Frequency Test within a Block.
Determine whether the frequency of ones in an M-bit block is approximately M/2, as would be expected under an assumption of randomness.
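The block-frequency statistic can be sketched as follows. The sketch returns only the chi-square statistic; the full NIST test converts it to a p-value via the regularised upper incomplete gamma function (e.g. scipy.special.gammaincc), omitted here to keep the example dependency-free:

```python
def block_frequency_chi2(bits: str, m: int = 128) -> float:
    """Chi-square statistic of the NIST block-frequency test: the sequence
    is split into floor(n/m) non-overlapping m-bit blocks, and each block's
    proportion of ones is compared against the expected 1/2."""
    n_blocks = len(bits) // m
    dev = 0.0
    for i in range(n_blocks):
        block = bits[i * m:(i + 1) * m]
        dev += (block.count("1") / m - 0.5) ** 2
    return 4.0 * m * dev

print(block_frequency_chi2("10" * 64))  # perfectly balanced block -> 0.0
print(block_frequency_chi2("1" * 128))  # all ones -> 128.0
```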
Runs Test.
Test the total number of runs in the sequence, where a run is an uninterrupted sequence of identical bits. The purpose of the runs test is to determine whether the number of runs of ones and zeros of various lengths is as expected for a random sequence. In particular, this test determines whether the oscillation between zeros and ones is too fast or too slow.
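The run-counting logic and decision rule can be sketched as below, following the erfc-based formula in SP 800-22 (the full test first requires the monobit test to pass, which this sketch does not enforce):

```python
import math

def runs_test_p_value(bits: str) -> float:
    """Runs test from NIST SP 800-22. V is the total number of runs
    (maximal blocks of identical bits); the p-value measures how far V
    is from the count expected for a random sequence with the same
    proportion of ones."""
    n = len(bits)
    pi = bits.count("1") / n
    v = 1 + sum(bits[i] != bits[i + 1] for i in range(n - 1))
    num = abs(v - 2.0 * n * pi * (1.0 - pi))
    den = 2.0 * math.sqrt(2.0 * n) * pi * (1.0 - pi)
    return math.erfc(num / den)

# Worked example from the SP 800-22 document: V = 7 runs in 10 bits,
# documented p-value 0.147232.
print(runs_test_p_value("1001101011"))
# Strict alternation oscillates far too fast and fails.
print(runs_test_p_value("10" * 50) < 0.01)  # True
```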
Test for the Longest Run of Ones in a Block.
Determine whether the length of the longest run
of ones within the tested sequence is consistent with the length of the longest run of ones that
would be expected in a random sequence.
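The raw statistic behind this test is simply the longest run of ones; the full NIST test tabulates this value per fixed-size block and applies a chi-square decision rule against reference probabilities, which is omitted in this sketch:

```python
def longest_run_of_ones(bits: str) -> int:
    """Length of the longest uninterrupted run of ones in the sequence."""
    longest = current = 0
    for b in bits:
        current = current + 1 if b == "1" else 0
        longest = max(longest, current)
    return longest

print(longest_run_of_ones("110111101"))  # 4
```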
Binary Matrix Rank Test.
Check for linear dependence among fixed-length sub-strings of the
original sequence. Note that this test also appears in the DIEHARD suite of tests [47,48].
Discrete Fourier Transform (Spectral) Test.
Detect periodic features (i.e., repetitive patterns that are near each other) in the tested sequence that would indicate a deviation from the assumption of randomness. The intention is to detect whether the number of peaks exceeding the 95% threshold differs significantly from 5%.
Non-overlapping Template Matching Test.
Count the number of occurrences of pre-specified target strings, identifying too many occurrences of a given non-periodic (aperiodic) pattern. An example of an 8-bit aperiodic pattern is 0 1 1 1 1 1 1 1.
Overlapping Template Matching Test.
Similar to the previous test, this test also looks for occurrences of pre-specified target strings. When a match is found, this test moves the test window forward by one position, whereas the previous test moves the window to the end of the matching sequence.
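The difference between the two counting strategies can be sketched in one helper; the function and its name are illustrative only, and the full NIST tests go on to compare the counts against theoretical expectations:

```python
def count_template(bits: str, template: str, overlapping: bool) -> int:
    """Count occurrences of a target template. The overlapping variant
    advances the window by one position after a match; the non-overlapping
    variant jumps past the matched bits, mirroring the difference between
    the two NIST template-matching tests."""
    count, i = 0, 0
    step_on_match = 1 if overlapping else len(template)
    while i <= len(bits) - len(template):
        if bits[i:i + len(template)] == template:
            count += 1
            i += step_on_match
        else:
            i += 1
    return count

print(count_template("10111011", "11", overlapping=True))   # 3
print(count_template("10111011", "11", overlapping=False))  # 2
```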
Maurer's Universal Statistical Test.
This test is based on the number of bits between matching patterns (a measure related to the length of a compressed sequence). The purpose of the test is to detect whether or not the sequence can be significantly compressed without loss of information. A significantly compressible sequence is considered to be non-random.
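The compressibility intuition behind this test can be illustrated with an actual compressor. Note this uses zlib/DEFLATE as a rough proxy and is not Maurer's statistic itself:

```python
import os
import zlib

def compression_ratio(data: bytes) -> float:
    """Compressed size over original size. Structured data compresses
    well below 1.0; random or encrypted data yields a ratio near or
    slightly above 1.0, since there is no redundancy to remove."""
    return len(zlib.compress(data, 9)) / len(data)

print(compression_ratio(b"the quick brown fox " * 500))  # far below 1.0
print(compression_ratio(os.urandom(100_000)))            # close to 1.0
```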