A Comprehensive Survey on Edge Data Integrity Verification:
Fundamentals and Future Trends
YAO ZHAO, School of Information Technology, Deakin University, Australia
YOUYANG QU, 1. Key Laboratory of Computing Power Network and Information Security, Ministry of
Education, Shandong Computer Science Center, Qilu University of Technology (Shandong Academy of Sciences),
2. Shandong Provincial Key Laboratory of Computer Networks, Shandong Fundamental Research Center for
Computer Science, China
YONG XIANG, School of Information Technology, Deakin University, Australia
MD PALASH UDDIN, School of Information Technology, Deakin University, Australia
DEZHONG PENG, College of Computer Science, Sichuan University, China
LONGXIANG GAO, 1. Key Laboratory of Computing Power Network and Information Security, Ministry of
Education, Shandong Computer Science Center, Qilu University of Technology (Shandong Academy of Sciences),
2. Shandong Provincial Key Laboratory of Computer Networks, Shandong Fundamental Research Center for
Computer Science, China
Recent advances in edge computing (EC) have pushed cloud-based data caching services to the edge; however, such emerging
edge storage comes with numerous challenging and unique security issues. One of them is the problem of edge data integrity
verification (EDIV), which coordinates multiple participants (e.g., data owners and edge nodes) to inspect whether data cached
on the edge is authentic. To date, various solutions have been proposed to address the EDIV problem, but no systematic
review exists. Thus, we offer a comprehensive survey for the first time, aiming to show the current research status, open problems, and
potentially promising insights for readers to further investigate this under-explored field. Specifically, we begin by stating the
significance of the EDIV problem, the integrity verification differences between data cached on the cloud and on the edge, and three
typical system models with corresponding inspection processes. To thoroughly assess prior research efforts, we synthesize a
universal criteria framework that an effective verification approach should satisfy. On top of it, a schematic development
timeline is developed to reveal the research advances on EDIV in a sequential manner, followed by a detailed review of the
existing EDIV solutions. Finally, we highlight intriguing research challenges and possible directions for future work, along
with a discussion on how forthcoming technology, e.g., machine learning and context-aware security, can augment security
in EC. Given our findings, some major observations are: there is a noticeable trend to equip EDIV solutions with various
functions and diversify study scenarios; completing EDIV within two types of participants (i.e., data owner and edge nodes) is
garnering escalating interest among researchers; although the majority of existing methods rely on cryptography, emerging
technology is being explored to handle the EDIV problem.
Longxiang Gao and Youyang Qu are the co-corresponding authors.
Authors’ addresses: Yao Zhao, School of Information Technology, Deakin University, Australia; Youyang Qu, 1. Key Laboratory of Computing
Power Network and Information Security, Ministry of Education, Shandong Computer Science Center, Qilu University of Technology
(Shandong Academy of Sciences), 2. Shandong Provincial Key Laboratory of Computer Networks, Shandong Fundamental Research Center
for Computer Science, China; Yong Xiang, School of Information Technology, Deakin University, Australia; Md Palash Uddin, School of
Information Technology, Deakin University, Australia; Dezhong Peng, College of Computer Science, Sichuan University, China; Longxiang
Gao, 1. Key Laboratory of Computing Power Network and Information Security, Ministry of Education, Shandong Computer Science Center,
Qilu University of Technology (Shandong Academy of Sciences), 2. Shandong Provincial Key Laboratory of Computer Networks, Shandong
Fundamental Research Center for Computer Science, China.
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that
copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first
page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy
otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from
permissions@acm.org.
©2024 Association for Computing Machinery.
XXXX-XXXX/2024/8-ART $15.00
https://doi.org/10.1145/nnnnnnn.nnnnnnn
, Vol. 1, No. 1, Article . Publication date: August 2024.
arXiv:2210.10978v2 [cs.CR] 7 Aug 2024
CCS Concepts: • Security and privacy → Management and querying of encrypted data; Logic and verification; Public
key (asymmetric) techniques; Symmetric cryptography and hash functions; Distributed systems security.
Additional Key Words and Phrases: Edge Data Integrity Verification, Edge Computing, Security, Internet of Things
ACM Reference Format:
Yao Zhao, Youyang Qu, Yong Xiang, Md Palash Uddin, Dezhong Peng, and Longxiang Gao. 2024. A Comprehensive Survey
on Edge Data Integrity Verification: Fundamentals and Future Trends. 1, 1 (August 2024), 34 pages.
https://doi.org/10.1145/nnnnnnn.nnnnnnn
1 INTRODUCTION
The global deployment of mobile and Internet-of-Things (IoTs) devices has been increasing rapidly as a result
of the ongoing expansion of 5G and beyond networks [1]. According to a technical report [2], the number of
these devices is projected to surpass 25.4 billion by 2030. These devices serve as fundamental components for
smart applications, carrying out basic yet essential activities such as detecting [3], actuating [4], and
controlling [5]. However, relying solely on these low-performance devices is insufficient to complete complex
activities efficiently, e.g., smart transportation arrangements [6–8], smart medical treatments [9–11], and smart
vehicle control [12–14]. High-performance computing infrastructures are essential for offloading computation
tasks and facilitating decision-making. Undoubtedly, cloud computing (CC) [15–17] is the most well-known of
such technologies. In CC environments, cloud infrastructure providers (CIPs), e.g., OneDrive¹, Amazon²,
and Google Drive³, deliver centralized data caching services to support large-scale data access [18, 19]. Yet,
cloud computing cannot fully meet the demands of mobile/IoT services due to concerns such as
geographical unawareness [20], bandwidth limitations [21], a lack of real-time services [22], and unpredictable
data access latency [23]. To this end, an emerging paradigm named edge computing (EC) [24–26] has been spawned as
one of the key enabling technologies for 5G and beyond to facilitate latency-sensitive or geo-aware applications,
e.g., autopilot [27], virtual reality [28], and video analytics [29]. For a detailed definition and the origin of EC,
see [30]. Motivated by EC, data owners (DOs) can outsource popular data to edge nodes (ENs) to
serve nearby data users (DUs) with a better user experience [31], as shown in Fig. 1. Owing to such benefits over
CC, EC has grown dramatically in recent years [32]. The Market Study Report⁴ predicts that the edge data centre
market will exceed $20 billion by 2026.
Unfortunately, this promising computing paradigm still faces alarming security challenges in practice [33–36].
Different from the cloud, which is supported by mega-scale data centres, edge nodes are usually deployed at base stations
or access points and managed by different edge infrastructure providers (EIPs) [37]. This edge caching
strategy is much more distributed, dynamic, and volatile [38], so the integrity of cached data can be corrupted
easily and frequently. Furthermore, attacks against EC-related infrastructures have significantly increased
recently [39]. For instance, the Mirai virus [40–42], which was released in August 2016 and infiltrated more than
65,000 IoT devices within the first 20 hours of its release, is one of the most famous real-world attacks.
A few days later, over 178,000 domains were knocked down by DDoS attacks launched against edge
nodes using botnets built from these infected devices [43]. IoTReaper and Hajime, two Mirai
variants discovered shortly after, were thought to have infected more than 378 million IoT devices in
2017 [34, 44]. These IoT botnet attacks were estimated to have cost over 100 million USD in damages since the
initial Mirai botnet was found in 2016 [34, 45, 46]. More intuitively, researchers have found that various factors
may lead to data loss in real-world scenarios. According to a report from Kroll Ontrack⁵, 67% of data loss is attributed
to hard drive crashes or system failure, 14% to human error, and 10% to software failure.
Fig. 1. Example of edge storage. A data owner caches multiple data replicas to geographically distributed edge nodes (denoted
by S1, S2, S3) to serve nearby data users (denoted by u_i, i ∈ {1, 2, ..., 9}) with ultra-low data access latency.
¹ https://www.microsoft.com/en-us/microsoft-365/onedrive/online-cloud-storage
² https://aws.amazon.com/
³ https://www.google.com/drive/
⁴ https://www.gminsights.com/industry-analysis/edge-data-center-market
The aforementioned examples and statistics clearly illustrate the unsatisfactory state of edge data security.
Outsourcing data to edge nodes separates data ownership from data management. Thus, data owners
and data users may not always trust edge nodes, since the nodes may misuse data management permissions and
expose data to security risks [47]. Consequently, a variety of issues must be addressed before subscribing to edge
data caching services. For example, how can data owners trust EIPs and ensure that outsourced data remains intact at all
times? How can cached data be properly audited without retrieving the whole data collection? How can integrity audits
keep operating stably while data owners modify outsourced data? All these challenges can be summarized
as the edge data integrity verification (EDIV) problem [48], which is defined as follows.
Definition 1 (Edge Data Integrity Verification Problem). The edge data integrity verification problem
refers to inspecting the accuracy and consistency (validity) of data replicas cached on edge nodes.
An EDIV approach entails creating a solution that allows data owners and/or users to verify the integrity
of outsourced data within the edge environment. EDIV research is of particular practical importance for
edge-based services/applications, because critical business decisions depend largely on accurate edge data. If
data integrity is compromised, any decision based on that data becomes questionable. We further emphasize its
significance in Section 2.1. To date, notable progress has been made on the EDIV problem, such
as verification efficiency improvement [48] and data privacy guarantee [49]. However, all these articles
propose specific EDIV solutions tailored to their respective domains, exposing the lack of a systematic and
comprehensive review.
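To make the audit question above concrete (checking a cached replica without retrieving the whole data collection), the following is a minimal sketch of a sampling-based spot check. It is an illustrative assumption, not a scheme taken from any surveyed work: the function names, the tiny block size, and the use of per-block HMAC tags kept by the data owner are all hypothetical choices made for brevity.

```python
import hashlib
import hmac
import secrets

BLOCK_SIZE = 4  # bytes; deliberately tiny for demonstration (assumed value)


def split_blocks(data: bytes, size: int = BLOCK_SIZE) -> list[bytes]:
    """Split data into fixed-size blocks (the last block may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def block_tag(key: bytes, index: int, block: bytes) -> bytes:
    """HMAC-SHA256 tag bound to the block position, so blocks cannot be swapped."""
    return hmac.new(key, index.to_bytes(8, "big") + block, hashlib.sha256).digest()


def make_tags(key: bytes, blocks: list[bytes]) -> list[bytes]:
    """Data owner side: compute one tag per block before caching the replica."""
    return [block_tag(key, i, b) for i, b in enumerate(blocks)]


def challenge(num_blocks: int, sample_size: int) -> list[int]:
    """Data owner side: pick a random subset of block indices to audit."""
    rng = secrets.SystemRandom()
    return sorted(rng.sample(range(num_blocks), sample_size))


def respond(replica_blocks: list[bytes], indices: list[int]) -> list[bytes]:
    """Edge node side: return only the challenged blocks, not the whole replica."""
    return [replica_blocks[i] for i in indices]


def verify(key: bytes, tags: list[bytes], indices: list[int], returned: list[bytes]) -> bool:
    """Data owner side: recompute the tags of the returned blocks and compare."""
    return all(
        hmac.compare_digest(tags[i], block_tag(key, i, blk))
        for i, blk in zip(indices, returned)
    )


if __name__ == "__main__":
    key = secrets.token_bytes(32)
    blocks = split_blocks(b"edge cached replica payload!")
    tags = make_tags(key, blocks)
    picked = challenge(len(blocks), 3)
    print("intact replica passes:", verify(key, tags, picked, respond(blocks, picked)))
```

Sampling only detects corruption with some probability per audit, and this sketch still transfers the sampled blocks themselves; it is meant to illustrate the challenge-response pattern, not the properties of any particular EDIV solution.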
1.1 Related Surveys and Our Contributions
In the context of data integrity verification, the cloud data integrity verification (CDIV) problem has received
significant attention over the past decade, resulting in several related surveys [50–59]. For example, Suchetha et
al. [51] offered an overview of various techniques for data integrity verification in cloud storage. Gangadevi et
al. [57] conducted a brief survey on data integrity verification schemes for cloud computing based on blockchain
technology. Han et al. [59] recently provided a survey of blockchain-based integrity auditing for cloud data,
including evaluation criteria, a review of existing solutions, and suggestions for future research directions.
Despite these comprehensive surveys on CDIV, EDIV exhibits fundamental differences from CDIV, as articulated in
Section 2.2.
⁵ https://www.techradar.com/how-to/world-of-tech/management/how-to-recover-lost-business-data-1304303/2
Fig. 2. Roadmap of the survey.
Section 1: Introduction (A. From Cloud Computing to Edge Computing; B. Data Integrity in Edge; C. Related Surveys and Our Contributions)
Section 2: Edge Data Integrity Verification: an Overview (A. The Significance of Studying EDIV; B. EDIV Versus CDIV; C. System Models and Key Processes)
Section 3: Evaluation Criteria on Verification Approach (A. Efficiency Related Indicators; B. Security Related Indicators; C. Functionality Related Indicators)
Section 4: Existing Edge Data Integrity Verification Solutions (A. The Development Timeline of EDIV; B. Comparison of Existing Solutions; C. Pros and Cons of Existing Solutions)
Section 5: Open Challenges and Potential Solutions (A. Traditional Problems; B. Outspread Problems)
Section 6: Discussion
Section 7: Summary
Motivated by this, this work aims to provide new insights into data integrity in edge computing domains. To the
best of our knowledge, this is the first survey on edge data integrity verification. In this survey, we begin by describing
the motivation for studying EDIV problems and providing a comprehensive comparison between CDIV and EDIV.
Afterward, three typical system models with corresponding key processes are covered, together with a set of
criteria that an effective EDIV approach should satisfy. After that, depending on the design objectives, we comb
through a taxonomy of EDIV solutions published from 2019 to 2023. Finally, we highlight unresolved challenges
and make recommendations for further research. This is done to clarify the link between CDIV and EDIV, as well
as to promote future development and integration of EDIV. More significantly, we offer a valuable resource for
follow-up researchers and newcomers. The primary contributions of this survey are as follows.
• We clarify the gravity and significance of the EDIV study and summarize its uniqueness compared with
CDIV. In addition, the system models along with the key processes of EDIV are introduced in detail.
• We synthesize a comprehensive criteria system that a satisfactory EDIV solution is expected to meet, which
can be further applied to assess the effectiveness of EDIV methods.
• According to the established criteria, a chronological timeline is given to outline the evolution of existing
endeavors on the EDIV problem. Furthermore, we systematically categorize current solutions into three
types, emphasizing their strengths while exposing their shortcomings.
• We identify a list of open issues and further explore future research directions, including traditional and
outspread ones, to promote dedicated efforts on the EDIV problem. Notably, some of the valuable directions
have barely or never been investigated. We hope this provides insights for follow-up researchers.
1.2 Paper Organization
The remainder of this survey is structured as follows. Section 2 presents the motivation and an overview of the
EDIV problem. In Section 3, we propose a series of criteria for evaluating existing EDIV
solutions. Section 4 summarizes a development timeline and taxonomy of EDIV and reviews the existing works
accordingly. In Section 5, several future research directions and potential solutions are introduced,
while the impact of emerging technologies on enhancing security for EC is discussed in Section 6. Finally, a
summary is provided in Section 7. For clarity, we illustrate the organization of this work in Fig. 2, and key
acronyms are outlined in Table 1.
Table 1. List of key abbreviations
Abbr.   Definition
IoTs    Internet-of-Things
CC      Cloud Computing
CIP     Cloud Infrastructure Provider
EC      Edge Computing
DO      Data Owner
EN      Edge Node
DU      Data User
EIP     Edge Infrastructure Provider
EDIV    Edge Data Integrity Verification
CDIV    Cloud Data Integrity Verification
SLA     Service Level Agreement
TPA     Third Party Auditor
EDI     Edge Data Integrity
PDP     Provable Data Possession
POR     Proof of Retrievability
2 EDGE DATA INTEGRITY VERIFICATION: AN OVERVIEW
To better understand the scope and breadth of the EDIV problem, this section first states the significance of
EDIV research. Then, we explicitly present the differences between CDIV and EDIV. Further, we summarize
three commonly used system models, along with a short introduction to the corresponding key
processes of EDIV problem-solving strategies.
2.1 The Significance of Edge Data Integrity Verification
To some extent, data cached on the cloud is more reliable and stable than data cached on edge nodes [60, 61], since cloud servers
have adequate resources to perform computation-intensive inspection tasks, while edge nodes often cannot
afford the same level of integrity assurance [62]. In reality, however, data corruption incidents occur
frequently even in the cloud. According to a comprehensive study [63], existing cloud data corruption detection
schemes are quite insufficient. Specifically, only 25% of data corruption problems are reported correctly, 42%
are undetected, and 21% receive imprecise error reports. The study also found that the detection system raises
12% false alarms. Real examples include, but are not limited to, the following. Jeff Bonwick, the creator of ZFS⁶,
mentioned that a fast database named Greenplum⁷ faces undetected data corruption every 10 to 20 minutes⁸.
Additionally, NetApp⁹ conducted a 41-month real-world study of more than 1.5 million hard disk drives and
identified over 400,000 undiscovered data corruptions, including more than 30,000 that were undetectable by hardware
RAID controllers [64]. Besides, over the course of six months and across around 97 petabytes of data,
CERN¹⁰ discovered that approximately 128 megabytes of data had been irreversibly corrupted [65].
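To put the quoted figures in proportion, the short calculation below converts the CERN numbers into a corruption ratio and sums the four percentages from the detection study. It is only a back-of-the-envelope sketch: decimal (SI) units are an assumption, since the text does not say whether binary or decimal prefixes were meant, and the variable names are illustrative.

```python
# Unit sizes in bytes; decimal (SI) prefixes are assumed here.
PETABYTE = 10**15
MEGABYTE = 10**6
GIGABYTE = 10**9

# CERN figures quoted in the text: about 128 MB corrupted out of about 97 PB.
scanned_bytes = 97 * PETABYTE
corrupted_bytes = 128 * MEGABYTE

# Fraction of the stored data that was irreversibly corrupted.
corruption_ratio = corrupted_bytes / scanned_bytes

# The same ratio expressed as corrupted bytes per gigabyte stored.
corrupted_per_gigabyte = corruption_ratio * GIGABYTE

# Percentages quoted from the cloud corruption-detection study.
report_shares = {
    "reported correctly": 25,
    "undetected": 42,
    "imprecise error report": 21,
    "false alarm": 12,
}
total_share = sum(report_shares.values())

print(f"corruption ratio: {corruption_ratio:.3e}")
print(f"corrupted bytes per gigabyte: {corrupted_per_gigabyte:.2f}")
print(f"sum of quoted shares: {total_share}%")
```

Under these assumptions the ratio is on the order of one corrupted byte per gigabyte stored, which is small in relative terms but still amounts to permanent loss at petabyte scale.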
The above analysis clearly reveals that detecting data corruption is a challenging problem in cloud domains,
let alone in dynamic edge computing environments. Briefly, studying the EDIV problem has the following
significance from the utility perspective.
Cut Data Owners’ Loss. Edge data corruption has a lasting impact on data owners’ businesses. For recoverable
data, detecting corruption as soon as possible helps data owners recover correct data in a timely way so that
effective measures can be taken to shrink the gaps left by corruption [64, 66]. For unrecoverable data, identifying
corruption efficiently can assist data owners in designing emergency plans to minimize unnecessary delay and
possible loss of business reputation and revenue [67]. Furthermore, inspecting edge data integrity (EDI) presents
⁶ https://en.wikipedia.org/wiki/ZFS
⁷ https://greenplum.org/
⁸ https://queue.acm.org/detail.cfm?id=1317400
⁹ https://www.netapp.com/
¹⁰ http://home.cern/