Annotating Privacy Policies in the Sharing Economy
Fahimeh Ebrahimi, Miroslav Tushev, and Anas Mahmoud
The Division of Computer Science and Engineering
Louisiana State University
febrah1@lsu.edu, mtushe1@lsu.edu, amahmo4@lsu.edu
Abstract—Applications (apps) of the Digital Sharing Economy
(DSE), such as Uber, Airbnb, and TaskRabbit, have become
a main enabler of economic growth and shared prosperity in
modern-day societies. However, the complex exchange of goods,
services, and data that takes place over these apps frequently
puts their end-users’ privacy at risk. Privacy policies of DSE apps
are provided to disclose how private user data is being collected
and handled. However, in reality, such policies are verbose and
difficult to understand, leaving DSE users vulnerable to privacy
intrusive practices. To address these concerns, in this paper, we
propose an automated approach for annotating privacy policies in
the DSE market. Our approach identifies data collection claims in
these policies and maps them to the quality features of their apps.
Visual and textual annotations are then used to further explain
and justify these claims. The proposed approach is evaluated
with 18 DSE app users. The results show that annotating privacy
policies can significantly enhance their comprehensibility to the
average DSE user. Our findings are intended to help DSE app
developers to draft more comprehensible privacy policies as well
as help their end-users to make more informed decisions in one
of the fastest growing software ecosystems in the world.
Index Terms—Privacy Policy, Sharing Economy, Annotation
I. INTRODUCTION
The Digital Sharing Economy (DSE), also known as the gig
or shared economy, refers to a sustainable form of business
exchange that provides access to, rather than ownership of,
assets and resources via direct Peer-to-Peer (P2P) coordina-
tion [1]. Over the past decade, applications of the Sharing
Economy, such as Uber, TaskRabbit, and Airbnb, have caused
major disturbances in established classical markets, enabling
people to exchange and monetize their idle or underused assets
at unprecedented scales. This unique form of collaborative
consumption has been linked to significant levels of eco-
nomic growth, helping unemployed and partially employed
individuals to generate income, increase reciprocity, and access
resources that can be unattainable otherwise [2], [3], [4]. As of
today, there are thousands of active DSE platforms, operating
in a market sector that is projected to grow to close to 335
billion U.S. dollars by 2025 [5].
In order to mediate business transactions, DSE apps con-
stantly demand access to private user information, including
their credit information, geo-location, and even photos of the
assets being shared (e.g., a vehicle or an apartment) [6],
[7], [8], [9]. Such information is used to establish the P2P
connection between service providers and receivers, facilitate
identification offline, and optimize and manage transactions.
Modern app stores require apps that operate on users’ sensitive
information to provide privacy policies in which their data
practices are declared and justified [10]. These policies are
intended to provide information about the types of user infor-
mation the app collects, how that data is being used, shared,
transferred, and protected [11], [12]. However, in reality,
these policies are often verbose, ambiguous, and jargon-heavy,
making them difficult to understand by the average user [13],
[14]. This can be particularly problematic in the DSE market,
where an unintended leak of users’ private information can
expose them to great physical and financial risks [7], [8], [9].
To help overcome these limitations, in this paper, we pro-
pose a novel approach for annotating privacy policies in the
DSE market [15]. Our objective is to make these policies more
comprehensible to the average DSE user. Providing easy-to-
understand policies can protect users from the intrusive privacy
tactics of apps and help them make more informed socio-
economic decisions when it comes to navigating the landscape
of DSE platforms. Technically, our approach utilizes text
classification techniques to extract data collection practices
from DSE privacy policies and then map these practices
(claims) to the quality features of their apps. The extracted in-
formation is then color-coded, annotated, and presented using
an enhanced hypertext format. The impact of our annotations
is then evaluated through a user study with 18 DSE app users.
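To make the presentation step concrete, the final annotation could be rendered as color-coded hypertext along the following lines. This is a minimal sketch with hypothetical labels, colors, and helper names of our own, not the system's actual implementation:

```python
# Illustrative sketch: render a classified data-collection claim as a
# color-coded HTML annotation. Labels and colors are hypothetical.
import html

NFR_COLORS = {
    "Security": "#f8d7da",
    "Performance": "#d1ecf1",
    "Usability": "#d4edda",
}

def annotate_claim(sentence: str, nfr_label: str) -> str:
    """Wrap a policy sentence in a highlighted span whose tooltip
    names the quality feature that justifies the data collection."""
    color = NFR_COLORS.get(nfr_label, "#ffffff")  # white for unknown labels
    return (f'<span style="background:{color}" '
            f'title="Justified by: {nfr_label}">'
            f"{html.escape(sentence)}</span>")

print(annotate_claim("We collect your location to route drivers.",
                     "Performance"))
```

A front end could emit one such span per classified claim, leaving the rest of the policy text unmodified.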
The remainder of this paper is organized as follows. Sec-
tion II discusses privacy in the DSE market and motivates
our research. Section III describes our data collection and
analysis process. Section IV presents our automated policy
annotation approach. Section V empirically evaluates our
proposed approach. Section VI discusses our findings and
their implications. Section VII addresses the limitations of our
study. Finally, Section VIII concludes the paper.
II. BACKGROUND AND MOTIVATION
In this section, we discuss user privacy in the DSE market
and review existing research on privacy policy annotation. We
then motivate our work and present our research questions.
A. Privacy in the Digital Sharing Economy
The proliferation of DSE apps over the past decade along
with their unique operational characteristics have imposed new
challenges on end-users' privacy. Privacy in the DSE market
is a compound concept, comprising interrelated concerns of
data and physical privacy [6]. Fig. 1 shows the different
channels of data exchange in a typical DSE transaction [7].
In order to be approved as a service provider (e.g., Uber
drivers or Airbnb hosts), users need to disclose their personally identifiable information (PII) to DSE organizations (e.g., Uber or Airbnb), including their legal name, address, and picture, along with their shared property's information, such as the model of the vehicle being shared or the number of rooms in the rental space.

arXiv:2210.14993v1 [cs.CR] 26 Oct 2022

[Fig. 1 (diagram): the organization mediates between provider and consumer. The provider supplies the good/service and provider data in exchange for access to the platform (+ earnings); the consumer supplies consumer data (payment) in exchange for access to the platform.]
Fig. 1: A summary of data exchange between users and organizations in DSE apps [7].

Service consumers (e.g., Uber riders or
Airbnb renters) also need to share their PII and financial
information with DSE companies in order for their service
requests to be approved. DSE companies use their users’
information to mediate their transactions (e.g., route the Uber
driver) and charge for the service. Some of this information
is also made available to service providers and receivers to
facilitate identification offline.
Physical privacy in the Sharing Economy refers to the
threats that are often associated with sharing personal re-
sources with strangers [16], [9]. Such concerns tend to be
domain-specific, influenced by the nature of the DSE trans-
action. For instance, recent research has revealed that the
presence of smart devices in Airbnb rentals can raise privacy
concerns among guests, including concerns of excessive mon-
itoring and access to their Internet history [8]. A survey of
Uber riders exposed common concerns of stalking, coercion,
and habit discovery among both riders and drivers [9]. DSE
platforms attempt to mediate these concerns by establishing
trust in the process. Trust is a key user goal in DSE [17],
[18], [16], [19]. Providers and receivers at both ends of the
P2P connection need to maintain a minimum level of mutual
trust before a transaction can take place [19], [18]. However,
in order to establish trust, apps demand the disclosure of
even more personal information (e.g., social media accounts
or hobbies) to make their users appear more trustworthy [20],
[21].
In general, privacy in the DSE market is a multidimensional
problem. The problem is further exacerbated by the fact
that DSE apps extend over a broad range of application
domains [22]. Identifying and mitigating privacy threats in
each specific domain can be a very challenging task, and the
failure to do so can lead to a decline in sharing intensity [23].
This can prevent users, especially in communities at the lower
end of the economic ladder, from reaching their full economic
and social potential in the DSE market [7].
B. Privacy Policy Annotation
Privacy policies are unilateral contracts by which organiza-
tions describe their data practices and inform end-users about
their data collection, usage, and sharing practices. Popular app
marketplaces, such as Google Play and the Apple App Store,
require apps to post their privacy policies online. However,
recent analysis of privacy policies in the mobile app market
revealed that most of these policies are ambiguous, and
oftentimes contain incomplete or deceptive information [24].
This undermines the utility of these policies and diminishes
users’ trust in their apps [25].
Several methods have been proposed in the literature to
enhance the quality of privacy policies. Such techniques
commonly rely on annotating the content of these policies
with supplemental information to clarify ambiguous privacy
claims and justify data usage practices. Annotations are often
carried out manually, either by domain experts or using crowd-
sourcing [26], [27], [28], [14], [25]. Expert annotations, while
usually accurate, are frequently associated with significant
effort. For instance, forming multidisciplinary expert teams
for specific application domains can be a time-consuming
process that hardly scales up. Crowd-sourcing can also lead
to suboptimal results due to the general lack of technical and
legal knowledge necessary to produce correct annotations [15],
[29], [24], [27], [28].
To overcome these limitations, several attempts have been
made to automate the annotation process [15]. For instance,
Wilson et al. [15] annotated a corpus of 128 policies with
23,000 fine-grained data practices to be used as a benchmark
in automated annotation tasks. In general, automated tech-
niques use Natural Language Processing (NLP) and Machine
Learning (ML) to identify salient paragraphs that describe
data practices, such as data collection, sharing, security, and
retention in privacy policies [15], [30]. The main limitation
of such methods is that they are commonly applied as generic
solutions. This can impact their validity because privacy
concerns tend to be domain-specific.
For instance, constant location tracking is more justifiable in
ride-hailing apps than in social media apps. Therefore, a one-
size-fits-all solution often generates misleading results.
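The paragraph-level identification these automated techniques perform can be approximated, in spirit, by simple cue-phrase matching. The sketch below is a toy of our own; the published systems use trained NLP/ML classifiers rather than hand-written patterns [15], [30]:

```python
import re

# Toy sketch: flag paragraphs that likely describe a data practice.
# The cue phrases are illustrative, not taken from any published classifier.
DATA_PRACTICE_CUES = re.compile(
    r"\b(we (collect|share|store|retain|use)|"
    r"information (we|you) (collect|provide)|"
    r"third[- ]party|personal (data|information))\b",
    re.IGNORECASE,
)

def salient_paragraphs(policy_text: str) -> list[str]:
    """Return the paragraphs that contain data-practice cue phrases."""
    paragraphs = [p.strip() for p in policy_text.split("\n\n") if p.strip()]
    return [p for p in paragraphs if DATA_PRACTICE_CUES.search(p)]

policy = ("Welcome to our service.\n\n"
          "We collect your name and location to match you with providers.\n\n"
          "Contact support for questions.")
print(salient_paragraphs(policy))
```

Because the cues are generic, such a matcher would flag location tracking in a ride-hailing policy and a social media policy alike, which is exactly the one-size-fits-all weakness noted above.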
C. Motivation and Research Questions
Our brief review shows that people’s participation in sharing
activities might be hindered by the unjustified privacy practices
of DSE apps. In general, people might abstain from sharing
if they cannot justify the trade-off between the benefits and
the risks of using DSE apps. DSE apps try to mitigate these
concerns in their privacy policies. However, the majority of
these policies are extremely lengthy, and oftentimes ambigu-
ous, leaving users exposed to unlawful privacy practices,
such as constantly and unnecessarily tracking their location,
learning their routine habits, and inferring their consumption
preferences [9], [8]. For instance, the average length of privacy
policies of popular DSE apps (Table I) is around 5,425
words, which is significantly longer than the average length of
online privacy policies (2,000 to 3,000 words) [31], [32], [33].
Furthermore, the average readability of these apps’ policies, as
measured by the Flesch Reading Ease (FRE) [34], [35], [36],
is around 29, which indicates that they are best understood
by college graduates. This can be particularly problematic in
the DSE market as recent statistics showed that a considerable
percentage of DSE users either do not possess a college-level
education or are still college students [37], [5]. These observations
emphasize the need for a supporting mechanism to help
DSE users better assess the privacy risks of disclosing their
personal information to DSE apps. In fact, such a mechanism
can also help policymakers and regulators to better evaluate
the privacy practices of DSE apps, and consequently, devise
legislation to protect workers’ and consumers’ rights in one of
the fastest-growing, most diverse, yet under-regulated software
ecosystems in the world [38], [39].
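The Flesch Reading Ease score cited above is a fixed formula over average sentence length and average syllables per word. A minimal sketch, taking pre-computed counts (real tools estimate syllable counts heuristically):

```python
def flesch_reading_ease(total_words: int, total_sentences: int,
                        total_syllables: int) -> float:
    """Standard FRE formula: higher scores mean easier text.
    Scores around 30 correspond to a college-graduate reading level."""
    words_per_sentence = total_words / total_sentences
    syllables_per_word = total_syllables / total_words
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word

# Example: 100 words, 5 sentences, 150 syllables
print(round(flesch_reading_ease(100, 5, 150), 1))  # -> 59.6
```

Longer sentences and polysyllabic legal vocabulary both drive the score down, which is why verbose, jargon-heavy policies land near 29.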
Motivated by these observations, in this paper, we propose
a novel approach for annotating privacy policies in the DSE
market. Our approach automatically classifies data claims in
these policies and maps them to the quality features of the
app (e.g., security, safety, customizability, etc.). The main
assumption is that these generic categories of abstract system
features can be more easily comprehended by the average
user. In particular, our research questions are:
RQ1: Can the privacy policies of DSE apps be automatically annotated? Under this research question, we
investigate the effectiveness of several automated classi-
fication techniques in annotating data collection claims in
the privacy policies of DSE apps.
RQ2: Are annotated privacy policies more comprehensible to the average DSE user? Under this research
question, we explore whether our annotated policies can
actually help average DSE users to better understand the
privacy practices of their DSE apps.
III. DATA AND MANUAL ANNOTATION
Our approach can be divided into four main steps: policy
collection, manual annotation, automated classification, and
policy presentation. A summary of the approach is presented
in Fig. 2. In what follows, we describe our data collection and
manual annotation steps.
A. Policy Collection
Recent statistics estimate that there are hundreds of active
DSE platforms listed on popular mobile app marketplaces [22].
In our analysis, we consider apps that operate in large geo-
graphical areas and have massive user bases. Privacy concerns
are more likely to manifest over these apps than over
smaller ones, which often have less heterogeneous user bases.
Specifically, for a DSE app to be included in our analysis, it
has to meet the following criteria:
1) The app must facilitate some sort of a P2P connection
and include the sharing of some sort of a resource, such
as a tangible asset (e.g., an apartment or a car) or a soft
skill (e.g., plumbing or hair styling).
2) The app must be available on Google Play or the Apple
App Store so that we can extract its meta-data.
3) The app must be located and/or have a substantial pres-
ence in the United States. By focusing on the U.S. market,
we ensure that our apps’ privacy policies are available in
English and that these apps offer services that are familiar
to the average U.S. user.
With these criteria in place, we selected the five most popu-
lar apps from five popular application domains of DSE [22]. In
general, five categories of DSE apps can be identified: ride-
sharing (e.g., Uber or Lyft), lodging (e.g., Airbnb), delivery
(e.g., DoorDash or UberEats), asset-sharing (e.g., GetMy-
Boat), and freelancing (e.g., TaskRabbit) [22], [40]. The top
five apps in each application domain are then identified based
on their installation and rating statistics as of January 2021.
Table I shows the selected apps along with their popularity,
measured as the number of ratings and the average rating on
the Apple App Store as well as the average number of installs
from Google Play. We also extracted each app’s privacy policy,
which is typically posted on the app’s official website.
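The selection step above amounts to filtering on the inclusion criteria and then taking the top k apps per domain by popularity. A hypothetical sketch; the app records below are fabricated placeholders, not the paper's data:

```python
# Illustrative sketch: pick the most popular eligible apps per DSE domain.
# All records are made-up placeholders for demonstration only.
apps = [
    {"name": "AppA", "domain": "ride-sharing", "ratings": 9_000_000, "us_presence": True},
    {"name": "AppB", "domain": "ride-sharing", "ratings": 4_000_000, "us_presence": True},
    {"name": "AppC", "domain": "ride-sharing", "ratings": 2_000_000, "us_presence": False},
]

def top_apps(records, domain, k=5):
    """Keep apps that meet the inclusion criteria, then rank by rating count."""
    eligible = [a for a in records
                if a["domain"] == domain and a["us_presence"]]
    return sorted(eligible, key=lambda a: a["ratings"], reverse=True)[:k]

print([a["name"] for a in top_apps(apps, "ride-sharing")])  # -> ['AppA', 'AppB']
```

In practice the criteria also include the P2P-sharing and app-store-availability checks, which would simply add two more boolean fields to the filter.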
B. Manual Annotation
We start our analysis by qualitatively analyzing the content
of privacy policies of the apps in our dataset. The objective is
to identify the data collection claims in these policies along
with their justifications (i.e., establish our ground truth). We
define a justification as the rationale provided by the app for
collecting users’ sensitive information. In our approach, we
map such rationale into a set of high-level system quality
features. Apps supposedly collect data to enhance the quality
attributes of the app, and thus, its users’ experience. These
quality attributes are often described as the non-functional
requirements of the system (NFRs). NFRs can be thought of as
abstract behaviors of the system that can be enforced through
bundles of the system’s functional features. For instance, the
security NFR refers to the behavior that is enforced by the
functional features that are used to implement security in the
system, such as user authentication and data encryption.
Around 250 different kinds of software quality attributes
are defined in the literature [41]. These NFRs extend over a
broad range of categories and sub-categories. To simplify our
manual analysis, we limit our annotation to the most popular
types of NFRs that commonly appear in literature: Security,
Performance, Accessibility, Accuracy, Usability, Safety, Legal,
and Maintainability [42], [43], [44], [45]. These NFRs are
defined in Table II.
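Such a multi-label mapping from policy statements to NFR categories can be pictured with a toy keyword-based sketch. The lexicon below is our own illustration; the paper's annotation is performed manually, not by keyword lookup:

```python
# Toy sketch of multi-label NFR annotation. The keyword lexicon is
# illustrative only; the actual annotation was done by human judges.
NFR_KEYWORDS = {
    "Security": ["encrypt", "authenticate", "fraud"],
    "Safety": ["background check", "emergency"],
    "Legal": ["comply", "law", "regulation"],
}

def label_statement(statement: str) -> set[str]:
    """Assign every NFR whose cue words appear; a statement may receive
    multiple labels, mirroring the multi-category annotation scheme."""
    text = statement.lower()
    return {nfr for nfr, cues in NFR_KEYWORDS.items()
            if any(cue in text for cue in cues)}

claim = "We use your ID to run background checks and comply with local law."
print(sorted(label_statement(claim)))  # -> ['Legal', 'Safety']
```

Returning a set rather than a single label captures the fact that one statement can justify data collection under several quality features at once.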
To annotate the privacy claims in our set of policies, we
follow a grounded theory approach [46]. In particular, three
judges manually extracted any policy statements related to
collecting, using, sharing, or storing personal information and
mapped these statements to one or more of the NFR categories
defined earlier. If no suitable category was found, the judges
were free to come up with new categories. An example of this
process is shown in Fig. 3. A statement can be labeled under
multiple categories if it raises more than one functionality-
related issue. This step was necessary to maintain the accuracy
of our annotations as NFRs are inherently vague—a single
statement can express multiple issues at the same time [44],
[41], [47]. After each round of annotation, the three judges
met to discuss any discrepancies and add/merge labels. This