Auditing YouTube’s Recommendation Algorithm for
Misinformation Filter Bubbles
IVAN SRBA, Kempelen Institute of Intelligent Technologies, Slovakia
ROBERT MORO, Kempelen Institute of Intelligent Technologies, Slovakia
MATUS TOMLEIN, Kempelen Institute of Intelligent Technologies, Slovakia
BRANISLAV PECHER, Faculty of Information Technology, Brno University of Technology, Czechia
JAKUB SIMKO, Kempelen Institute of Intelligent Technologies, Slovakia
ELENA STEFANCOVA, Kempelen Institute of Intelligent Technologies, Slovakia
MICHAL KOMPAN, Kempelen Institute of Intelligent Technologies, Slovakia
ANDREA HRCKOVA, Kempelen Institute of Intelligent Technologies, Slovakia
JURAJ PODROUZEK, Kempelen Institute of Intelligent Technologies, Slovakia
ADRIAN GAVORNIK, Kempelen Institute of Intelligent Technologies, Slovakia
MARIA BIELIKOVA, Kempelen Institute of Intelligent Technologies, Slovakia
In this paper, we present results of an auditing study performed over YouTube aimed at investigating how fast a user can get into a misinformation filter bubble, but also what it takes to "burst the bubble", i.e., revert the bubble enclosure. We employ a sock puppet audit methodology, in which pre-programmed agents (acting as YouTube users) delve into misinformation filter bubbles by watching misinformation promoting content. Then they try to burst the bubbles and reach more balanced recommendations by watching misinformation debunking content. We record search results, home page results, and recommendations for the watched videos. Overall, we recorded 17,405 unique videos, out of which we manually annotated 2,914 for the presence of misinformation. The labeled data was used to train a machine learning model classifying videos into three classes (promoting, debunking, neutral) with an accuracy of 0.82. We use the trained model to classify the remaining videos that would not be feasible to annotate manually.
Using both the manually and automatically annotated data, we observe the misinformation bubble dynamics for a range of audited topics. Our key finding is that even though filter bubbles do not appear in some situations, when they do, it is possible to burst them by watching misinformation debunking content (albeit it manifests differently from topic to topic). We also observe a sudden decrease of the misinformation filter bubble effect when misinformation debunking videos are watched after misinformation promoting videos, suggesting a strong contextuality of recommendations. Finally, when comparing our results with a previous similar study, we do not observe significant improvements in the overall quantity of recommended misinformation content.
Also with Kempelen Institute of Intelligent Technologies.
Also with slovak.AI.
Also with slovak.AI.
Authors’ addresses: Ivan Srba, Kempelen Institute of Intelligent Technologies, Bratislava, Slovakia, ivan.srba@kinit.sk;
Robert Moro, Kempelen Institute of Intelligent Technologies, Bratislava, Slovakia, robert.moro@kinit.sk; Matus Tomlein,
Kempelen Institute of Intelligent Technologies, Bratislava, Slovakia, matus.tomlein@kinit.sk; Branislav Pecher, Faculty of
Information Technology, Brno University of Technology, Brno, Czechia, branislav.pecher@kinit.sk; Jakub Simko, Kempelen
Institute of Intelligent Technologies, Bratislava, Slovakia, jakub.simko@kinit.sk; Elena Stefancova, Kempelen Institute of
Intelligent Technologies, Bratislava, Slovakia, elena.stefancova@kinit.sk; Michal Kompan, Kempelen Institute of Intelligent
Technologies, Bratislava, Slovakia, michal.kompan@kinit.sk; Andrea Hrckova, Kempelen Institute of Intelligent Technologies,
Bratislava, Slovakia, andrea.hrckova@kinit.sk; Juraj Podrouzek, Kempelen Institute of Intelligent Technologies, Bratislava,
Slovakia, juraj.podrouzek@kinit.sk; Adrian Gavornik, Kempelen Institute of Intelligent Technologies, Bratislava, Slovakia,
adrian.gavornik@intern.kinit.sk; Maria Bielikova, Kempelen Institute of Intelligent Technologies, Bratislava, Slovakia,
maria.bielikova@kinit.sk.
©2022 Copyright held by the owner/author(s). Publication rights licensed to ACM.
This is the author's version of the work. It is posted here for your personal use. Not for redistribution. This work has just been accepted to ACM Transactions on Recommender Systems (ACM TORS), https://doi.org/10.1145/3568392.
arXiv:2210.10085v1 [cs.IR] 18 Oct 2022
CCS Concepts: • Social and professional topics → Technology audits; • Information systems → Personalization; Content ranking; • Human-centered computing → Human computer interaction (HCI).
Additional Key Words and Phrases: audit, recommender systems, filter bubble, misinformation, personalization, automatic labeling, ethics, YouTube
ACM Reference Format:
Ivan Srba, Robert Moro, Matus Tomlein, Branislav Pecher, Jakub Simko, Elena Stefancova, Michal Kompan, Andrea Hrckova, Juraj Podrouzek, Adrian Gavornik, and Maria Bielikova. 2022. Auditing YouTube's Recommendation Algorithm for Misinformation Filter Bubbles. ACM Transactions on Recommender Systems (ACM TORS) 0, 0, Article 0 (2022), 34 pages. https://doi.org/10.1145/3568392
1 INTRODUCTION
In this paper, we investigate the misinformation filter bubble creation and bursting on YouTube¹. The role of very large online platforms (especially social networking sites, such as Facebook, Twitter, or YouTube) in dissemination and amplification of misinformation has been widely discussed and recognized in recent years by researchers, journalists, policymakers, and representatives of the platforms alike [11, 18, 19, 22, 30, 41, 48]. The platforms are blamed for promoting sensational, attention-grabbing, or polarizing content through the use of personalized recommendation algorithms (resulting from their mode of operation based on monetizing users' attention [53, 60]). To tackle this issue, the platforms have (on the European level) committed to implement a range of measures stipulated in the Code of Practice on Disinformation [17]. However, the monitoring of the platforms' compliance and the progress made in this regard has proved difficult [16]. One of the problems is a lack of effective public oversight in the form of internal audits of the platforms' personalized algorithms that could directly quantify the impact of disinformation as well as the measures taken by the platforms.

¹This paper is an extended version of a paper entitled "An Audit of Misinformation Filter Bubbles on YouTube: Bubble Bursting and Recent Behavior Changes" [51], which has been awarded the Best Paper Award at the Fifteenth ACM Conference on Recommender Systems (RecSys '21).
This lack has been partially compensated by external black-box auditing studies performed by researchers, such as [1, 4, 27, 38, 41, 48], that aimed to quantify the portion of misinformative content being recommended on social media platforms. With respect to YouTube, which is the subject of the audit presented in this paper, previous works investigated how a user can enter a filter bubble. Multiple studies demonstrated that watching a series of misinformative videos strengthens the further presence of such content in recommendations [1, 27, 38], or that following a path of the "up next" videos can bring the user to very dubious content [48]. However, no studies have covered if, how, or with what effort the user can "burst" (lessen) the bubble. More specifically, they have not investigated what type of user's watching behavior (e.g., switching to credible news videos or conspiracy debunking videos) would be needed to lessen the amount of misinformative content recommended to the user. Such knowledge would be valuable not just for the sake of having a better understanding of the inner workings of YouTube's personalization, but also to improve the social, educational, or psychological strategies for building up resilience against misinformation.
Our work extends the prior works by researching this important aspect. To do so, we employ a sock puppet auditing methodology [5, 43]. We simulate user behavior on the YouTube platform, record platform responses (search results, home page results, recommendations), and manually annotate their sample for the presence of misinformative content. Using the manual annotations, we train a machine learning model to predict labels for the remaining recommended videos that would be impractical to annotate manually due to their large volume. Then, we quantify the dynamics of misinformation filter bubble creation and also of bubble bursting, which is the novel aspect of the study.
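To make the sock puppet setup more concrete, a minimal sketch of such an agent loop is shown below. This is not the implementation used in our study (that implementation is available in the repository referenced later in this section); the Selenium selectors, the fixed watch duration, and the placeholder video IDs are illustrative assumptions only.

```python
# Minimal sketch of a sock puppet agent loop (illustrative only; selectors,
# watch durations, and video IDs are assumptions, not the study's actual code).
import time
from selenium import webdriver
from selenium.webdriver.common.by import By

SEED_PROMOTING = ["PROMOTING_VIDEO_ID_1", "PROMOTING_VIDEO_ID_2"]  # placeholders
SEED_DEBUNKING = ["DEBUNKING_VIDEO_ID_1", "DEBUNKING_VIDEO_ID_2"]  # placeholders

def collect_recommendations(driver):
    """Collect links of videos currently shown next to the player (assumed selector)."""
    links = driver.find_elements(By.CSS_SELECTOR, "ytd-compact-video-renderer a#thumbnail")
    return [link.get_attribute("href") for link in links]

def watch_and_record(driver, video_id, watch_seconds=30):
    """Open a video, simulate watching it, and record the recommendations shown."""
    driver.get(f"https://www.youtube.com/watch?v={video_id}")
    time.sleep(watch_seconds)
    return collect_recommendations(driver)

def run_agent():
    driver = webdriver.Chrome()  # a fresh browser profile acts as a fresh sock puppet
    log = []
    # Phase 1: build a misinformation-promoting watch history (enter the bubble).
    for vid in SEED_PROMOTING:
        log.append(("promoting", vid, watch_and_record(driver, vid)))
    # Phase 2: watch debunking content (attempt to burst the bubble).
    for vid in SEED_DEBUNKING:
        log.append(("debunking", vid, watch_and_record(driver, vid)))
    driver.quit()
    return log
```

In the actual study, the agents additionally record search results and home page results and follow the scenarios described in Section 4.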
The main contributions of this work are threefold. As the first contribution, this paper reports on the behavior of YouTube's personalization in a situation when a user with a misinformation promoting watch history (i.e., with a developed misinformation filter bubble) starts to watch content debunking the misinformation (in an attempt to burst that misinformation filter bubble). The key finding is that watching misinformation debunking videos (such as credible news or scientific content) generally improves the situation (in terms of recommended items or search result personalization), albeit with varying effects and forms, mainly depending on the particular misinformation topic.
Complementing manual labels with automatically predicted ones (using our trained machine learning model) allowed us to inspect not only the differences at specific points in time (the state at the beginning of the study vs. the state after obtaining a watch history of misinformation promoting videos vs. the state after watching the misinformation debunking content), but also the dynamics of misinformation filter bubble creation and bursting throughout the whole duration of the study. Thus, as the second contribution, we provide a so-far unexplored deeper insight into misinformation filter bubble dynamics, since a continuous evaluation of the proportions of misinformation promoting and debunking videos has not been covered, to the best of our knowledge, by any of the existing auditing studies yet. The key finding is that there is a sudden increase in the number of debunking videos after the first watched debunking video, suggesting a strong contextuality of YouTube's personalization algorithms. We observe this consistently for both the home page results and the recommendations for most examined misinformation topics.
Lastly, part of this work is a replication of the prior works, most notably the work of Hussein et al. [27], who also investigated the creation of misinformation filter bubbles using user simulation. We aligned our methodology with Hussein's study: we re-used Hussein's seed data (topics, queries, and also videos, except those which have been removed in the meantime), used similar scenarios and the same data annotation scheme. As a result, we were able to directly compare the outcomes of both studies, Hussein's and ours, on the number of observed misinformative videos present in recommendations or search results. As the third contribution, we report changes in misinformation video occurrences on YouTube, which took place since the study of Hussein et al. [27] (mid-2019). Due to YouTube's ongoing efforts to improve their recommender systems and policies (e.g., by removing misinformative content or preferring credible sources) [54, 55], we expected to see less filter bubble creation behavior than Hussein et al. While we in general observe a low overall prevalence of misinformation in several topics, there is still room for improvement. More specifically, we observe a worse situation regarding the topics of vaccination and (partially) 9/11 conspiracies, and some improvements (less misinformation) for moon landing or chemtrails conspiracies. In addition, we replicated, to a lesser extent, the works of Hou et al. [26] and Papadamou et al. [38]. We reused their classification models, which we adapted and applied on our collected data. Finally, we used the better performing model of Papadamou et al. [38] to predict the labels for the videos that were not labeled manually.
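For illustration only, the sketch below shows a strongly simplified stand-in for this three-class annotation step (promoting, debunking, neutral): a TF-IDF and logistic regression baseline trained on video titles and descriptions. It is not the adapted model of Papadamou et al. [38] that we actually used; the feature choice and hyperparameters are assumptions of this example.

```python
# Simplified stand-in for three-class video annotation (promoting / debunking /
# neutral). Illustrative baseline only, not the model used in the study.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

def train_video_classifier(texts, labels):
    """texts: one string (title + description) per video;
    labels: 'promoting', 'debunking', or 'neutral'."""
    X_train, X_test, y_train, y_test = train_test_split(
        texts, labels, test_size=0.2, stratify=labels, random_state=42)
    model = make_pipeline(
        TfidfVectorizer(max_features=20000, ngram_range=(1, 2)),
        LogisticRegression(max_iter=1000, class_weight="balanced"))
    model.fit(X_train, y_train)
    print("held-out accuracy:", accuracy_score(y_test, model.predict(X_test)))
    return model

# Usage: predict labels for the videos that were not annotated manually.
# model = train_video_classifier(annotated_texts, annotated_labels)
# predicted_labels = model.predict(unlabeled_texts)
```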
To ensure future replicability of our study and reproducibility of the presented results, the implementation of the experimental infrastructure, the collected data, the manual annotations, as well as the analytical notebooks are all publicly available on GitHub². However, to mitigate the identified ethical risks associated with a possibility of mislabeling a video as promoting misinformation (cf. Section 4.7), we do not publish the labels predicted by our trained machine learning model, only their aggregated numbers necessary to reproduce the results. Nevertheless, since we train and use models published in prior works [26, 38], it is also possible to replicate this part of our study. In addition, we do not publish copyrighted data (video titles, descriptions, etc.). However, this metadata can be downloaded through the YouTube API.

²https://github.com/kinit-sk/yaudit-recsys-2021
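As an illustration of how this metadata can be retrieved, the sketch below queries the YouTube Data API v3 (via the google-api-python-client package) for the title and description of a list of video IDs. The API key handling and the selected fields are assumptions of this example, not part of our published code.

```python
# Hedged sketch: fetching video metadata via the YouTube Data API v3,
# since the copyrighted metadata itself is not republished with the dataset.
from googleapiclient.discovery import build

def fetch_metadata(video_ids, api_key):
    youtube = build("youtube", "v3", developerKey=api_key)
    items = []
    # The videos.list endpoint accepts up to 50 comma-separated IDs per request.
    for i in range(0, len(video_ids), 50):
        batch = ",".join(video_ids[i:i + 50])
        response = youtube.videos().list(part="snippet", id=batch).execute()
        for item in response.get("items", []):
            items.append({
                "id": item["id"],
                "title": item["snippet"]["title"],
                "description": item["snippet"]["description"],
            })
    return items
```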
The rest of the paper is structured as follows. Section 2 looks at the definitions of a filter bubble, echo chambers, and misinformation and uses them to define a misinformation filter bubble, which is the focus of this work. In Section 3, we analyze different types of auditing studies and what the previous works examined with respect to social media (YouTube in particular) and misinformation. In Section 4, we describe in detail our research questions and study methodology, including manual as well as automatic annotation of the collected data. We present the results of training a machine learning model for automatic annotation of videos and compare these results with the relevant related works. We also discuss the identified ethical issues and how we addressed them in our research. The results of the audit itself are presented in Section 5. Finally, the implications of the results and the conclusions are discussed in Section 6.
2 BACKGROUND: FILTER BUBBLES AND MISINFORMATION
In online social media, a phenomenon denoted as virtual echo chambers refers to situations in which the same ideas are repeated, mutually confirmed, and amplified in relatively closed homogeneous groups. Echo chambers may ultimately contribute to increased polarization and fragmentation of society [50, 59]. The tendency of users to join and remain in echo chambers can be intentional (also denoted as self-selected personalization [7]) and explained by their selective exposure (focusing on information that is in accordance with one's worldview) or confirmation bias (reinforcing one's pre-existing beliefs) [6]. To some extent, such users' behavior is a natural human defense against information overload [36]. However, the enclosure of users into echo chambers can also be caused and amplified by adaptive systems (even without users' intention), resulting in the so-called filter bubble effect (also denoted as pre-selected personalization [7]). This effect has serious ethical implications: users are often unaware of the existence of filter bubbles, as well as of the information that was filtered out.
Filter bubbles were first recognized by Pariser [39] in 2011 as a state of intellectual isolation caused by algorithms that personalize users' online experiences, hence exposing users to information and opinions that conform to and reinforce their own beliefs. Filter bubbles quickly became the target of theoretical as well as empirical research efforts from multiple perspectives, such as: 1) exploring the characteristics of filter bubbles and identifying the circumstances of their creation [32]; 2) modelling/quantifying the filter bubble effect [2, 28]; and 3) discovering strategies to prevent or "burst" filter bubbles [10].
The ambiguity and difficult operationalization of Pariser's original definition of filter bubbles led to its different interpretations; inconsistent or even contrasting findings; and finally, also to low generalizability across studies [35]. For these reasons, an operationalized, systematically and empirically verifiable definition of the filter bubble has recently been proposed in [35] as follows: "A technological filter bubble is a decrease in the diversity of a user's recommendations over time, in any dimension of diversity, resulting from the choices made by different recommendation stakeholders." Based on this definition, the authors also stress the criteria for studies addressing the filter bubble effect: they must consider the diversity of recommendations and measure a decrease in diversity over time. In this work, we proceed from this definition and also meet the stated criteria.
We are interested specifically in filter bubbles that are determined by the presence of content spreading disinformation or misinformation. Disinformation is "false, inaccurate, or misleading information designed, presented and promoted to intentionally cause public harm or for profit" [19], while misinformation is false or inaccurate information that is spread regardless of an intention to deceive. In this work, we use the broader term misinformation since we are interested in any kind of false or inaccurate information regardless of the intention behind its creation and spreading (in contrast to most current reports and EU legislation that opt for the term disinformation and thus emphasize the necessary presence of intention). Due to the significant negative consequences of misinformation for our society (especially during the ongoing COVID-19 pandemic), tackling misinformation has also attracted a plethora of research works (see [9, 56, 58] for recent surveys). The majority of such research focuses on various characterization studies [46] or detection methods [40, 49].
We denote filter bubbles that are characterized by the increased prevalence of such misinformative content as misinformation filter bubbles. They are states of intellectual isolation in false beliefs or manipulated perceptions of reality. Following the adopted definition, filter bubbles in general are characterized by a decrease in any dimension of diversity. We can broadly distinguish three types of diversity [35]: structural, topical, and viewpoint diversity. Misinformation filter bubbles can be considered a special case of a decrease of viewpoint diversity in which the viewpoints represented are provably false. Analogously to topical filter bubbles, misinformation filter bubbles can be characterized by a high homogeneity of recommendations/search results that share the same positive stance towards misinformation. In other words, the content adaptively presented to a user in a misinformation filter bubble supports one or several false claims/narratives. While topical filter bubbles are not necessarily undesirable (they may be intended and even positively perceived by users [8, 30]), misinformation filter bubbles are by definition more problematic and cause indisputable negative effects [12, 21, 30].
Reecting the adopted denition of the lter bubble [
35
], we do not consider misinformation
lter bubble as a binary state at a single moment (i.e., following the current recommended items, a
user is/is not in the misinformation lter bubble), but as the interval measure reecting how deep
inside the bubble the user is. Such a measure is determined by the proportion of misinformative
content and calculated at dierent points in time. This denition and operationalization of lter
bubbles emphasizes our second contribution in so-far unexplored deeper insight into the dynamics
of lter bubbles since we calculated the proportion of misinformative content not only in the
selected time points but continuously over the whole duration of the study.
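To illustrate this operationalization (and the continuous evaluation behind our second contribution), the proportion of misinformation promoting and debunking items can be computed for every recorded snapshot of recommendations, yielding a trajectory of how deep inside the bubble an agent is over time. The snapshot structure assumed below is an illustration, not our actual data format.

```python
# Illustrative computation of the interval measure: the share of promoting and
# debunking items among the recommendations observed at each point in time.
from collections import Counter

def bubble_depth_over_time(snapshots):
    """snapshots: a list of label lists ('promoting'/'debunking'/'neutral'),
    one list per point in time (e.g., per watched video)."""
    trajectory = []
    for labels in snapshots:
        counts = Counter(labels)
        total = max(len(labels), 1)  # guard against empty snapshots
        trajectory.append({
            "promoting": counts["promoting"] / total,
            "debunking": counts["debunking"] / total,
        })
    return trajectory

# A drop in the 'promoting' share after debunking videos are watched would
# correspond to the bubble-bursting effect discussed in this paper.
```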
To prevent misinformation and misinformation filter bubbles, social media platforms take various countermeasures. These are usually reactions to public outcry or are required by legislation, e.g., the EU's Code of Practice on Disinformation [17]. Currently, the effectiveness of such countermeasures is evaluated mainly by self-evaluation reports. This approach has, however, already been recognized as insufficient due to a lack of evidence and objectivity, since social media are reluctant to provide access to their data for independent research [16]. In addition, the commercial aims of social media may contradict pro-social interests, as also revealed by the recent whistleblowing case Facebook Files³ [25]. The verification of countermeasures is further complicated by the interference of psychological factors. For example, some researchers argue that users' intentional self-selected personalization is more influential than algorithms' pre-selected personalization when it comes to intellectual isolation [3, 15].
An alternative solution towards responsible and governed AI in social media and eliminating its negative social impact is the employment of independent audits. Such audits, which are carried out by an external auditor independent from the company developing the audited AI algorithm, are envisaged also in the proposal of upcoming EU legislation [20]. Nevertheless, the auditing

³Facebook Files denotes a leak of internal documents revealing that the company was aware of the negative societal impact caused by the platform, including its algorithms, like spreading and preferring harmful or controversial content. At the same time, the documents showed that the company's reactions to the real-world harms thus caused were not sufficient.