by college graduates. This can be particularly problematic in
the DSE market, as recent statistics showed that a considerable
percentage of DSE users either do not possess a college-level
education or are still college students [37], [5]. These observations
emphasize the need for a supporting mechanism to help
DSE users better assess the privacy risks of disclosing their
personal information to DSE apps. In fact, such a mechanism
can also help policymakers and regulators to better evaluate
the privacy practices of DSE apps, and consequently, devise
legislation to protect workers’ and consumers’ rights in one of
the fastest-growing, most diverse, yet under-regulated software
ecosystems in the world [38], [39].
Motivated by these observations, in this paper, we propose
a novel approach for annotating privacy policies in the DSE
market. Our approach automatically classifies data claims in
these policies and maps them to the quality features of the
app (e.g., security, safety, customizability, etc.). The main
assumption is that these generic categories of abstract system
features are more easily comprehensible to the average
user. In particular, our research questions are:
• RQ1: Can the privacy policies of DSE apps be automatically
annotated? Under this research question, we
investigate the effectiveness of several automated classification
techniques in annotating data collection claims in
the privacy policies of DSE apps.
• RQ2: Are annotated privacy policies more comprehensible
to the average DSE user? Under this research
question, we explore whether our annotated policies can
actually help average DSE users to better understand the
privacy practices of their DSE apps.
III. DATA AND MANUAL ANNOTATION
Our approach can be divided into four main steps: policy
collection, manual annotation, automated classification, and
policy presentation. A summary of the approach is presented
in Fig. 2. In what follows, we describe our data collection and
manual annotation steps.
A. Policy Collection
Recent statistics estimate that there are hundreds of active
DSE platforms listed on popular mobile app marketplaces [22].
In our analysis, we consider apps that operate in large geo-
graphical areas and have massive user bases. Privacy concerns
are more likely to manifest over these apps than over
smaller ones, which often have less heterogeneous user bases.
Specifically, for a DSE app to be included in our analysis, it
has to meet the following criteria:
1) The app must facilitate some sort of peer-to-peer (P2P)
connection that involves sharing a resource, such
as a tangible asset (e.g., an apartment or a car) or a soft
skill (e.g., plumbing or hair styling).
2) The app must be available on Google Play or the Apple
App Store so that we can extract its meta-data.
3) The app must be located and/or have a substantial pres-
ence in the United States. By focusing on the U.S. market,
we ensure that our apps’ privacy policies are available in
English and that these apps offer services that are familiar
to the average U.S. user.
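As an illustrative sketch, the three inclusion criteria above can be expressed as a simple filter over app metadata. The field names below are hypothetical and not part of any actual dataset schema:

```python
# Illustrative sketch of the app inclusion criteria; all field names
# are hypothetical, not taken from our dataset.
def meets_criteria(app: dict) -> bool:
    """Return True if the app satisfies all three inclusion criteria."""
    is_p2p_sharing = app["facilitates_p2p"] and app["shares_resource"]  # criterion 1
    on_major_store = app["on_google_play"] or app["on_app_store"]       # criterion 2
    us_presence = app["operates_in_us"]                                 # criterion 3
    return is_p2p_sharing and on_major_store and us_presence

apps = [
    {"name": "ExampleRide", "facilitates_p2p": True, "shares_resource": True,
     "on_google_play": True, "on_app_store": True, "operates_in_us": True},
    {"name": "LocalOnly", "facilitates_p2p": True, "shares_resource": True,
     "on_google_play": False, "on_app_store": False, "operates_in_us": True},
]
selected = [a["name"] for a in apps if meets_criteria(a)]
print(selected)  # ['ExampleRide']
```

In practice, criteria 1 and 3 were judged manually; the sketch only makes the conjunction of the three conditions explicit.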
With these criteria in place, we selected the five most popular
apps from each of five major application domains of DSE [22].
In general, five categories of DSE apps can be identified: ride-sharing
(e.g., Uber or Lyft), lodging (e.g., Airbnb), delivery
(e.g., DoorDash or UberEats), asset-sharing (e.g., GetMy-
Boat), and freelancing (e.g., TaskRabbit) [22], [40]. The top
five apps in each application domain were then identified based
on their installation and rating statistics as of January 2021.
Table I shows the selected apps along with their popularity,
measured as the number of ratings and the average rating on
the Apple App Store as well as the average number of installs
from Google Play. We also extracted each app’s privacy policy,
which is typically posted on the app’s official website.
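The per-domain selection step can be sketched as ranking candidate apps by popularity and keeping the top five. The popularity proxy used below (ratings count times average rating) is an assumption for illustration, not the exact statistic used in our ranking:

```python
# Illustrative sketch: pick the five most popular apps per DSE domain.
# The popularity proxy (num_ratings * avg_rating) is an assumption,
# not the exact installation/rating statistic used in the paper.
from collections import defaultdict

def top_five_per_domain(apps):
    by_domain = defaultdict(list)
    for app in apps:
        by_domain[app["domain"]].append(app)
    selection = {}
    for domain, candidates in by_domain.items():
        candidates.sort(key=lambda a: a["num_ratings"] * a["avg_rating"],
                        reverse=True)
        selection[domain] = [a["name"] for a in candidates[:5]]
    return selection

# Hypothetical candidates for a single domain.
apps = [
    {"name": f"App{i}", "domain": "ride-sharing",
     "num_ratings": 1000 * i, "avg_rating": 4.5}
    for i in range(1, 8)
]
print(top_five_per_domain(apps)["ride-sharing"])
# ['App7', 'App6', 'App5', 'App4', 'App3']
```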
B. Manual Annotation
We begin by qualitatively analyzing the content of the
privacy policies of the apps in our dataset. The objective is
to identify the data collection claims in these policies along
with their justifications (i.e., establish our ground truth). We
define a justification as the rationale provided by the app for
collecting users’ sensitive information. In our approach, we
map this rationale onto a set of high-level system quality
features. An app ostensibly collects data to enhance its quality
attributes, and thus, its users’ experience. These
quality attributes are often described as the non-functional
requirements (NFRs) of the system. NFRs can be thought of as
abstract behaviors of the system that can be enforced through
bundles of the system’s functional features. For instance, the
security NFR refers to the behavior that is enforced by the
functional features that are used to implement security in the
system, such as user authentication and data encryption.
Around 250 different kinds of software quality attributes
are defined in the literature [41]. These NFRs extend over a
broad range of categories and sub-categories. To simplify our
manual analysis, we limit our annotation to the most popular
types of NFRs that commonly appear in the literature: Security,
Performance, Accessibility, Accuracy, Usability, Safety, Legal,
and Maintainability [42], [43], [44], [45]. These NFRs are
defined in Table II.
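A minimal sketch of how the eight NFR categories can serve as a multi-label annotation scheme is shown below; the example policy statement and its labels are illustrative, not drawn from our ground truth:

```python
# Minimal sketch of multi-label NFR annotation; the example statement
# and its labels are illustrative, not taken from our ground truth.
NFR_CATEGORIES = {
    "Security", "Performance", "Accessibility", "Accuracy",
    "Usability", "Safety", "Legal", "Maintainability",
}

def annotate(statement: str, labels: set) -> dict:
    """Attach one or more NFR labels to a policy statement."""
    unknown = labels - NFR_CATEGORIES
    if unknown:
        raise ValueError(f"Unknown NFR categories: {unknown}")
    return {"statement": statement, "labels": labels}

claim = annotate(
    "We collect your location to verify your identity and keep you safe.",
    {"Security", "Safety"},  # a single statement can carry multiple labels
)
print(sorted(claim["labels"]))  # ['Safety', 'Security']
```

Allowing a statement to carry several labels mirrors the annotation protocol described next, where a single policy statement may raise more than one quality-related issue.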
To annotate the privacy claims in our set of policies, we
follow a grounded theory approach [46]. In particular, three
judges manually extracted any policy statements related to
collecting, using, sharing, or storing personal information and
mapped these statements to one or more of the NFR categories
defined earlier. If no suitable category was found, the judges
were free to come up with new categories. An example of this
process is shown in Fig. 3. A statement can be labeled under
multiple categories if it raises more than one functionality-
related issue. This step was necessary to maintain the accuracy
of our annotations as NFRs are inherently vague—a single
statement can express multiple issues at the same time [44],
[41], [47]. After each round of annotation, the three judges
met to discuss any discrepancies and add/merge labels. This