A Comprehensive Survey on Edge Data Integrity Verification: Fundamentals and Future Trends

YAO ZHAO, School of Information Technology, Deakin University, Australia

YOUYANG QU, 1. Key Laboratory of Computing Power Network and Information Security, Ministry of Education, Shandong Computer Science Center, Qilu University of Technology (Shandong Academy of Sciences), 2. Shandong Provincial Key Laboratory of Computer Networks, Shandong Fundamental Research Center for Computer Science, China

YONG XIANG, School of Information Technology, Deakin University, Australia

MD PALASH UDDIN, School of Information Technology, Deakin University, Australia

DEZHONG PENG, College of Computer Science, Sichuan University, China

LONGXIANG GAO∗, 1. Key Laboratory of Computing Power Network and Information Security, Ministry of Education, Shandong Computer Science Center, Qilu University of Technology (Shandong Academy of Sciences), 2. Shandong Provincial Key Laboratory of Computer Networks, Shandong Fundamental Research Center for Computer Science, China

Recent advances in edge computing (EC) have pushed cloud-based data caching services to the edge; however, such emerging edge storage comes with numerous challenging and unique security issues. One of them is the problem of edge data integrity verification (EDIV), which coordinates multiple participants (e.g., data owners and edge nodes) to inspect whether data cached on the edge is authentic. To date, various solutions have been proposed to address the EDIV problem, yet no systematic review exists. Thus, we offer a comprehensive survey for the first time, aiming to show the current research status, open problems, and potentially promising insights for readers to further investigate this under-explored field. Specifically, we begin by stating the significance of the EDIV problem, the difference in integrity verification between data cached on the cloud and on the edge, and three typical system models with corresponding inspection processes. To thoroughly assess prior research efforts, we synthesize a universal criteria framework that an effective verification approach should satisfy. On top of it, a schematic development timeline is presented to reveal the research advances on EDIV in sequence, followed by a detailed review of the existing EDIV solutions. Finally, we highlight intriguing research challenges and possible directions for future work, along with a discussion on how forthcoming technologies, e.g., machine learning and context-aware security, can augment security in EC. Given our findings, some major observations are: there is a noticeable trend to equip EDIV solutions with various functions and diversify study scenarios; completing EDIV within two types of participants (i.e., data owners and edge nodes) is garnering escalating interest among researchers; and although the majority of existing methods rely on cryptography, emerging technologies are being explored to handle the EDIV problem.

∗Longxiang Gao and Youyang Qu are the co-corresponding authors.

Authors’ addresses: Yao Zhao, School of Information Technology, Deakin University, Australia; Youyang Qu, 1. Key Laboratory of Computing Power Network and Information Security, Ministry of Education, Shandong Computer Science Center, Qilu University of Technology (Shandong Academy of Sciences), 2. Shandong Provincial Key Laboratory of Computer Networks, Shandong Fundamental Research Center for Computer Science, China; Yong Xiang, School of Information Technology, Deakin University, Australia; Md Palash Uddin, School of Information Technology, Deakin University, Australia; Dezhong Peng, College of Computer Science, Sichuan University, China; Longxiang Gao, 1. Key Laboratory of Computing Power Network and Information Security, Ministry of Education, Shandong Computer Science Center, Qilu University of Technology (Shandong Academy of Sciences), 2. Shandong Provincial Key Laboratory of Computer Networks, Shandong Fundamental Research Center for Computer Science, China.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

XXXX-XXXX/2024/8-ART $15.00

https://doi.org/10.1145/nnnnnnn.nnnnnnn

, Vol. 1, No. 1, Article . Publication date: August 2024.

arXiv:2210.10978v2 [cs.CR] 7 Aug 2024

CCS Concepts: • Security and privacy → Management and querying of encrypted data; Logic and verification; Public key (asymmetric) techniques; Symmetric cryptography and hash functions; Distributed systems security.

Additional Key Words and Phrases: Edge Data Integrity Verification, Edge Computing, Security, Internet of things

ACM Reference Format:

Yao Zhao, Youyang Qu, Yong Xiang, Md Palash Uddin, Dezhong Peng, and Longxiang Gao. 2024. A Comprehensive Survey on Edge Data Integrity Verification: Fundamentals and Future Trends. 1, 1 (August 2024), 34 pages. https://doi.org/10.1145/nnnnnnn.nnnnnnn

1 INTRODUCTION

The global deployment of mobile and Internet-of-Things (IoTs) devices has been rapidly increasing as a result of the current expansion of 5G and beyond networks [ ]. According to the technical report [ ], the number of these devices is projected to surpass 25.4 billion by 2030. These devices serve as fundamental components for smart applications to carry out the most basic yet essential activities such as detecting [ ], actuating [ ], and controlling [ ]. However, it is insufficient to depend solely on those low-performance devices to complete complex activities efficiently, e.g., smart transportation arrangements [ – ], smart medical treatments [ – ], and smart vehicle control [ – ]. High-performance computing infrastructures are essential for offloading calculation tasks and facilitating decision-making. Undoubtedly, cloud computing (CC) [ – ] is the most well-known of such technologies. In CC environments, cloud infrastructure providers (CIPs), e.g., One Drive¹, Amazon², and Google Drive³, deliver centralized data caching services to support large-scale data access [ ]. Yet, cloud computing cannot perfectly match the demands of mobile/IoT services due to concerns such as geographical unawareness [ ], bandwidth limitations [ ], a lack of real-time services [ ], and unpredictable data access latency [ ]. To this end, a paradigm named edge computing (EC) [ – ] has emerged as one of the key enabling technologies for 5G and beyond to facilitate latency-sensitive or geo-aware applications, e.g., autopilot [ ], virtual reality [ ], and video analytics [ ]. For the detailed definition and origin of EC, refer to [ ]. Motivated by EC, data owners (DOs) are allowed to outsource popular data to edge nodes (ENs) to serve nearby data users (DUs) with a better user experience [ ], as shown in Fig. 1. Due to such benefits over CC, EC has grown dramatically in recent years [ ]. The Market Study Report⁴ predicts that the edge data centre market will exceed $20 billion by 2026.

Unfortunately, this promising computing paradigm still faces alarming security challenges in practice [ – ]. Different from the cloud, which is facilitated by mega-scale data centres, edge nodes are usually deployed at base stations or access points and managed by different edge infrastructure providers (EIPs) [ ]. This edge caching strategy is much more distributed, dynamic, and volatile [ ], so the integrity of cached data is corrupted easily and frequently. Furthermore, attacks against EC-related infrastructures have significantly increased recently [ ]. For instance, the Mirai virus [ – ], which was released in August 2016 and infiltrated more than 65,000 IoT devices within the first 20 hours of its release, is one of the most famous real-world attacks. A few days later, DDoS attacks launched against edge nodes using botnets built from these infected devices knocked down over 178,000 domains [ ]. IoTReaper and Hajime, two Mirai variants that were discovered shortly after, were thought to have infected more than 378 million IoT devices in 2017 [34, 44]. These IoT botnet attacks were estimated to have caused over 100 million USD in damages since the initial Mirai botnet was found in 2016 [34, 45, 46]. More intuitively, researchers have found that various factors may lead to data loss in real-world scenarios. Based on the report from Kroll Ontrack⁵, 67% of data loss is attributed to hard drive crashes or system failure, 14% to human error, and 10% to software failure.

¹ https://www.microsoft.com/en-us/microsoft-365/onedrive/online-cloud-storage

² https://aws.amazon.com/

³ https://www.google.com/drive/

⁴ https://www.gminsights.com/industry-analysis/edge-data-center-market

Fig. 1. Example of edge storage. A data owner caches multiple data replicas to geographically distributed edge nodes (denoted by $S_1, S_2, S_3$) to serve nearby data users (denoted by $u_i$, $i \in \{1, 2, \cdots, 9\}$) with ultra-low data access latency. (Figure legend: Edge Node, Data Owner, Data User, Caching Data, Accessing Data, Geographic Area.)

The aforementioned examples and statistics clearly illustrate the unsatisfactory state of edge data security. Outsourcing data to edge nodes separates data ownership from data management. Thus, data owners and data users may not always trust edge nodes, since they may misuse data management permissions and expose data to security risks [ ]. Consequently, a variety of issues must be addressed before subscribing to edge data caching services. For example, how do data owners trust EIPs and ensure that outsourced data remains intact at all times? How can cached data be properly audited without retrieving the whole data collection? How can integrity audits keep operating stably while data owners modify outsourced data? All these challenges can be summarized as the edge data integrity verification (EDIV) problem [48], which is defined as follows.

Definition 1 (Edge Data Integrity Verification Problem). The edge data integrity verification problem refers to inspecting the accuracy and consistency (validity) of data replicas cached on edge nodes.
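As a rough illustration of Definition 1, the sketch below compares each cached replica against the data owner's copy using a keyed digest. It is not a scheme from the surveyed literature: the function names (`keyed_digest`, `audit_replicas`), the nonce-based HMAC challenge, and the in-memory stand-ins for edge nodes are all assumptions made for illustration. This naive check also makes every node process its whole replica and requires the owner to keep the original, which practical EDIV approaches generally try to avoid.

```python
import hashlib
import hmac
import os


def keyed_digest(data: bytes, nonce: bytes) -> str:
    """HMAC-SHA256 of the data under a per-audit nonce.

    A fresh nonce per audit keeps a node from replaying an old answer.
    """
    return hmac.new(nonce, data, hashlib.sha256).hexdigest()


def audit_replicas(original: bytes, replicas: dict[str, bytes]) -> dict[str, bool]:
    """Return, per edge node, whether its replica matches the owner's copy."""
    nonce = os.urandom(16)  # challenge chosen by the data owner
    expected = keyed_digest(original, nonce)
    results = {}
    for node, replica in replicas.items():
        # In a real deployment the edge node would compute this proof itself;
        # here it is simulated locally (assumption for illustration).
        proof = keyed_digest(replica, nonce)
        results[node] = hmac.compare_digest(proof, expected)
    return results


if __name__ == "__main__":
    data = b"popular video segment"
    # Hypothetical edge nodes S1..S3; the replica on S3 has one corrupted byte.
    replicas = {"S1": data, "S2": data, "S3": b"popular video segmenT"}
    print(audit_replicas(data, replicas))
```

Running the script should flag only the replica on the hypothetical node S3 as inconsistent, which is the "accuracy and consistency" check of the definition in its simplest form.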

An EDIV approach entails creating a solution that allows data owners and/or users to verify the integrity of outsourced data within the edge environment. EDIV investigation is of particular practical importance for edge-based services and applications, because critical business decisions depend mostly on accurate edge data. If data integrity is compromised, any decision based on that data becomes questionable. We further emphasize its significance in Section 2.1. To date, numerous achievements have been made on the EDIV problem, such as verification efficiency improvement [ ] and data privacy guarantees [ ]. However, all these articles propose specific EDIV solutions tailored to their respective domains, exposing the lack of a systematic and comprehensive review of them.

1.1 Related Surveys and Our Contributions

In the context of data integrity verification, the cloud data integrity verification (CDIV) problem has received significant attention over the past decade, resulting in several related surveys [ – ]. For example, Suchetha et al. [ ] offered an overview of various techniques for data integrity verification in cloud storage. Gangadevi et al. [ ] conducted a brief survey on blockchain-based data integrity verification schemes for cloud computing. Han et al. [ ] recently provided a survey of blockchain-based integrity auditing for cloud data, including evaluation criteria, a review of existing solutions, and suggestions for future research directions. Despite these comprehensive surveys on CDIV, EDIV exhibits fundamental differences from CDIV, as articulated in Section 2.2.

⁵ https://www.techradar.com/how-to/world-of-tech/management/how-to-recover-lost-business-data-1304303/2

Fig. 2. Roadmap of the survey. Section 1: Introduction (A. From Cloud Computing to Edge Computing; B. Data Integrity in Edge; C. Related Surveys and Our Contributions). Section 2: Edge Data Integrity Verification: an Overview (A. The Significance of Studying EDIV; B. EDIV Versus CDIV; C. System Models and Key Processes). Section 3: Evaluation Criteria on Verification Approach (A. Efficiency Related Indicators; B. Security Related Indicators; C. Functionality Related Indicators). Section 4: Existing Edge Data Integrity Verification Solutions (A. The Development Timeline of EDIV; B. Comparison of Existing Solutions; C. Pros and Cons of Existing Solutions). Section 5: Open Challenges and Potential Solutions (A. Traditional Problems; B. Outspread Problems). Section 6: Discussion. Section 7: Summary.

Motivated by this gap, this work aims to provide new insights into data integrity in edge computing domains. To the best of our knowledge, this is the first survey on edge data integrity verification. In this survey, we begin by describing the motivation for studying EDIV problems and providing a comprehensive comparison between CDIV and EDIV. Afterward, three typical system models with corresponding key processes are covered, together with a set of criteria that an effective EDIV approach should satisfy. After that, depending on the design objectives, we work through a taxonomy of EDIV solutions published from 2019 to 2023. Finally, we highlight unresolved challenges and make recommendations for further research. This is done to clarify the link between CDIV and EDIV, as well as to promote the future development and integration of EDIV. More significantly, we offer a valuable resource for follow-up researchers and amateurs. The primary contributions of this survey are summarized as follows.

• We clarify the gravity and significance of the EDIV study and summarize its uniqueness compared with CDIV. In addition, the system models along with the key processes of EDIV are introduced in detail.

• We synthesize a comprehensive criteria system that a satisfactory EDIV solution is expected to meet, which can be further applied to assess the effectiveness of EDIV methods.

• According to the established criteria, a chronological timeline is given to outline the evolution of existing endeavors on the EDIV problem. Furthermore, we systematically categorize current solutions into three types, emphasizing their strengths while exposing their shortcomings.

• We identify a list of open issues and further explore future research directions, including traditional and outspread ones, to promote dedicated efforts on the EDIV problem. Notably, some of the valuable directions have barely, or even never, been investigated. We hope they provide insights for follow-up researchers.

1.2 Paper Organization

The remainder of this survey is structured as follows. The motivation and overview of the EDIV problem are presented in Section 2. In Section 3, we propose a series of criteria for evaluating existing EDIV solutions. A development timeline and taxonomy of EDIV are summarized and the existing works are reviewed accordingly in Section 4. In Section 5, several future research directions and potential solutions are introduced, while the impact of emerging technologies in enhancing security for EC is discussed in Section 6. Finally, a summary is provided in Section 7. For clarity, we illustrate the organization of this work in Fig. 2, and key acronyms are outlined in Table 1.

Table 1. List of key abbreviations

Abbr.   Definition
IoTs    Internet-of-Things
CC      Cloud Computing
CIP     Cloud Infrastructure Provider
EC      Edge Computing
DO      Data Owner
EN      Edge Node
DU      Data User
EIP     Edge Infrastructure Provider
EDIV    Edge Data Integrity Verification
CDIV    Cloud Data Integrity Verification
SLA     Service Level Agreement
TPA     Third Party Auditor
EDI     Edge Data Integrity
PDP     Provable Data Possession
POR     Proof of Retrievability

2 EDGE DATA INTEGRITY VERIFICATION: AN OVERVIEW

To better understand the scope and breadth of the EDIV problem, in this section, we state the significance of EDIV investigation. Then, we explicitly present the discrepancy between CDIV and EDIV. Further, we provide a summary of three commonly used system models, along with a short introduction to the corresponding key processes of EDIV problem-solving strategies.

2.1 The Significance of Edge Data Integrity Verification

To some extent, data cached on the cloud is more reliable and stable than data on edge nodes [ ], since cloud servers have adequate resources to perform computation-intensive inspection tasks, while edge nodes often cannot afford to perform the same level of integrity assurance [ ]. In reality, however, data corruption incidents occur frequently even in the cloud. According to a comprehensive study [ ], existing cloud data corruption detection schemes are quite insufficient. Specifically, only 25% of data corruption problems are reported correctly, 42% are undetected, and 21% receive imprecise error reports. The study also found that the detection system raises 12% false alarms. Real examples include, but are not limited to, the following. Jeff Bonwick, the creator of ZFS⁶, mentioned that a fast database named Greenplum⁷ faces undetected data corruption every 10 to 20 minutes⁸. Additionally, NetApp⁹ conducted a 41-month real-world study on more than 1.5 million hard disk drives and identified over 400,000 undiscovered data corruptions, including more than 30,000 undetectable by hardware RAID controllers [ ]. Besides, over the course of six months and across around 97 petabytes of data, CERN¹⁰ discovered that approximately 128 megabytes of data had been irreversibly corrupted [65].

The above analysis clearly reveals that detecting data corruption is a challenging problem in cloud domains, let alone in dynamic edge computing environments. Briefly, studying the EDIV problem has the following significance from the utility perspective.

Cut Data Owners’ Loss. Edge data corruption has a lasting impact on data owners’ businesses. For recoverable data, detecting corruption as soon as possible can help data owners recover correct data in a timely way so that effective measures can be taken to shrink the gaps left by corruption [ ]. For unrecoverable data, identifying corruption efficiently can assist data owners in designing emergency plans to minimize unnecessary delay and possible loss of business reputation and revenue [ ]. Furthermore, inspecting edge data integrity (EDI) presents

⁶ https://en.wikipedia.org/wiki/ZFS

⁷ https://greenplum.org/

⁸ https://queue.acm.org/detail.cfm?id=1317400

⁹ https://www.netapp.com/

¹⁰ http://home.cern/
