Auditing YouTube’s Recommendation Algorithm for
Misinformation Filter Bubbles
IVAN SRBA, Kempelen Institute of Intelligent Technologies, Slovakia
ROBERT MORO, Kempelen Institute of Intelligent Technologies, Slovakia
MATUS TOMLEIN, Kempelen Institute of Intelligent Technologies, Slovakia
BRANISLAV PECHER, Faculty of Information Technology, Brno University of Technology, Czechia
JAKUB SIMKO, Kempelen Institute of Intelligent Technologies, Slovakia
ELENA STEFANCOVA, Kempelen Institute of Intelligent Technologies, Slovakia
MICHAL KOMPAN, Kempelen Institute of Intelligent Technologies, Slovakia
ANDREA HRCKOVA, Kempelen Institute of Intelligent Technologies, Slovakia
JURAJ PODROUZEK, Kempelen Institute of Intelligent Technologies, Slovakia
ADRIAN GAVORNIK, Kempelen Institute of Intelligent Technologies, Slovakia
MARIA BIELIKOVA, Kempelen Institute of Intelligent Technologies, Slovakia
In this paper, we present results of an auditing study performed over YouTube aimed at investigating how fast a user can get into a misinformation filter bubble, but also what it takes to "burst the bubble", i.e., revert the bubble enclosure. We employ a sock puppet audit methodology, in which pre-programmed agents (acting as YouTube users) delve into misinformation filter bubbles by watching misinformation promoting content. Then they try to burst the bubbles and reach more balanced recommendations by watching misinformation debunking content. We record search results, home page results, and recommendations for the watched videos. Overall, we recorded 17,405 unique videos, out of which we manually annotated 2,914 for the presence of misinformation. The labeled data was used to train a machine learning model classifying videos into three classes (promoting, debunking, neutral) with an accuracy of 0.82. We use the trained model to classify the remaining videos that would not be feasible to annotate manually.
Using both the manually and automatically annotated data, we observe the misinformation bubble dynamics for a range of audited topics. Our key finding is that even though filter bubbles do not appear in some situations, when they do, it is possible to burst them by watching misinformation debunking content (albeit it manifests differently from topic to topic). We also observe a sudden decrease of the misinformation filter bubble effect when misinformation debunking videos are watched after misinformation promoting videos, suggesting a strong contextuality of recommendations. Finally, when comparing our results with a previous similar study, we do not observe significant improvements in the overall quantity of recommended misinformation content.
Also with Kempelen Institute of Intelligent Technologies.
Also with slovak.AI.
Also with slovak.AI.
Authors’ addresses: Ivan Srba, Kempelen Institute of Intelligent Technologies, Bratislava, Slovakia, ivan.srba@kinit.sk;
Robert Moro, Kempelen Institute of Intelligent Technologies, Bratislava, Slovakia, robert.moro@kinit.sk; Matus Tomlein,
Kempelen Institute of Intelligent Technologies, Bratislava, Slovakia, matus.tomlein@kinit.sk; Branislav Pecher, Faculty of
Information Technology, Brno University of Technology, Brno, Czechia, branislav.pecher@kinit.sk; Jakub Simko, Kempelen
Institute of Intelligent Technologies, Bratislava, Slovakia, jakub.simko@kinit.sk; Elena Stefancova, Kempelen Institute of
Intelligent Technologies, Bratislava, Slovakia, elena.stefancova@kinit.sk; Michal Kompan, Kempelen Institute of Intelligent
Technologies, Bratislava, Slovakia, michal.kompan@kinit.sk; Andrea Hrckova, Kempelen Institute of Intelligent Technologies,
Bratislava, Slovakia, andrea.hrckova@kinit.sk; Juraj Podrouzek, Kempelen Institute of Intelligent Technologies, Bratislava,
Slovakia, juraj.podrouzek@kinit.sk; Adrian Gavornik, Kempelen Institute of Intelligent Technologies, Bratislava, Slovakia,
adrian.gavornik@intern.kinit.sk; Maria Bielikova, Kempelen Institute of Intelligent Technologies, Bratislava, Slovakia,
maria.bielikova@kinit.sk.
©2022 Copyright held by the owner/author(s). Publication rights licensed to ACM.
This is the author's version of the work. It is posted here for your personal use. Not for redistribution. This work has just been accepted to ACM Transactions on Recommender Systems (ACM TORS), https://doi.org/10.1145/3568392.
arXiv:2210.10085v1 [cs.IR] 18 Oct 2022
CCS Concepts: • Social and professional topics → Technology audits; • Information systems → Personalization; Content ranking; • Human-centered computing → Human computer interaction (HCI).
Additional Key Words and Phrases: audit, recommender systems, filter bubble, misinformation, personalization, automatic labeling, ethics, YouTube
ACM Reference Format:
Ivan Srba, Robert Moro, Matus Tomlein, Branislav Pecher, Jakub Simko, Elena Stefancova, Michal Kompan, Andrea Hrckova, Juraj Podrouzek, Adrian Gavornik, and Maria Bielikova. 2022. Auditing YouTube's Recommendation Algorithm for Misinformation Filter Bubbles. ACM Transactions on Recommender Systems (ACM TORS) 0, 0, Article 0 (2022), 34 pages. https://doi.org/10.1145/3568392
1 INTRODUCTION
In this paper, we investigate the misinformation filter bubble creation and bursting on YouTube¹. The role of very large online platforms (especially social networking sites, such as Facebook, Twitter, or YouTube) in dissemination and amplification of misinformation has been widely discussed and recognized in recent years by researchers, journalists, policymakers, and representatives of the platforms alike [11, 18, 19, 22, 30, 41, 48]. The platforms are blamed for promoting sensational, attention-grabbing, or polarizing content through the use of personalized recommendation algorithms (resulting from their mode of operation based on monetizing users' attention [53, 60]). To tackle this issue, the platforms have (on the European level) committed to implement a range of measures stipulated in the Code of Practice on Disinformation [17]. However, the monitoring of the platforms' compliance and the progress made in this regard has proved difficult [16]. One of the problems is a lack of effective public oversight in the form of internal audits of the platforms' personalized algorithms that could directly quantify the impact of disinformation as well as the measures taken by the platforms.

¹This paper is an extended version of a paper entitled "An Audit of Misinformation Filter Bubbles on YouTube: Bubble Bursting and Recent Behavior Changes" [51], which has been awarded the Best Paper Award at the Fifteenth ACM Conference on Recommender Systems (RecSys '21).
This lack has been partially compensated by external black-box auditing studies performed by researchers, such as [1, 4, 27, 38, 41, 48], that aimed to quantify the portion of misinformative content being recommended on social media platforms. With respect to YouTube, which is the subject of the audit presented in this paper, previous works investigated how a user can enter a filter bubble. Multiple studies demonstrated that watching a series of misinformative videos strengthens the further presence of such content in recommendations [1, 27, 38], or that following a path of the "up next" videos can bring the user to very dubious content [48]. However, no studies have covered if, how, or with what effort the user can "burst" (lessen) the bubble. More specifically, they have not investigated what type of user's watching behavior (e.g., switching to credible news videos or conspiracy debunking videos) would be needed to lessen the amount of misinformative content recommended to the user. Such knowledge would be valuable not just for the sake of having a better understanding of the inner workings of YouTube's personalization, but also to improve the social, educational, or psychological strategies for building up resilience against misinformation.
Our work extends the prior works by researching this important aspect. To do so, we employ a sock puppet auditing methodology [5, 43]. We simulate user behavior on the YouTube platform, record platform responses (search results, home page results, recommendations), and manually annotate their sample for the presence of misinformative content. Using the manual annotations, we train a machine learning model to predict labels for the remaining recommended videos that would be impractical to annotate manually due to their large volume. Then, we quantify the dynamics of misinformation filter bubble creation and also of bubble bursting, which is the novel aspect of the study.
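To make the sock puppet setup more concrete, a minimal sketch of such an agent loop is shown below. This is not the implementation used in our study (that implementation is available in the repository referenced later in this section); the Selenium selectors, the fixed watch duration, and the placeholder video IDs are illustrative assumptions only.

```python
# Minimal sketch of a sock puppet agent loop (illustrative only; selectors,
# watch durations, and video IDs are assumptions, not the study's actual code).
import time
from selenium import webdriver
from selenium.webdriver.common.by import By

SEED_PROMOTING = ["PROMOTING_VIDEO_ID_1", "PROMOTING_VIDEO_ID_2"]  # placeholders
SEED_DEBUNKING = ["DEBUNKING_VIDEO_ID_1", "DEBUNKING_VIDEO_ID_2"]  # placeholders

def collect_recommendations(driver):
    """Collect links of videos currently shown next to the player (assumed selector)."""
    links = driver.find_elements(By.CSS_SELECTOR, "ytd-compact-video-renderer a#thumbnail")
    return [link.get_attribute("href") for link in links]

def watch_and_record(driver, video_id, watch_seconds=30):
    """Open a video, simulate watching it, and record the recommendations shown."""
    driver.get(f"https://www.youtube.com/watch?v={video_id}")
    time.sleep(watch_seconds)
    return collect_recommendations(driver)

def run_agent():
    driver = webdriver.Chrome()  # a fresh browser profile acts as a fresh sock puppet
    log = []
    # Phase 1: build a misinformation-promoting watch history (enter the bubble).
    for vid in SEED_PROMOTING:
        log.append(("promoting", vid, watch_and_record(driver, vid)))
    # Phase 2: watch debunking content (attempt to burst the bubble).
    for vid in SEED_DEBUNKING:
        log.append(("debunking", vid, watch_and_record(driver, vid)))
    driver.quit()
    return log
```

In the actual study, the agents additionally record search results and home page results and follow the scenarios described in Section 4.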
The main contributions of this work are threefold. As the first contribution, this paper reports on the behavior of YouTube's personalization in a situation when a user with a misinformation promoting watch history (i.e., with a developed misinformation filter bubble) starts to watch content debunking the misinformation (in an attempt to burst that misinformation filter bubble). The key finding is that watching misinformation debunking videos (such as credible news or scientific content) generally improves the situation (in terms of recommended items or search result personalization), albeit with varying effects and forms, mainly depending on the particular misinformation topic.
Complementing manual labels with automatically predicted ones (using our trained machine learning model) allowed us to inspect not only the differences at specific points in time (the state at the beginning of the study vs. the state after obtaining a watch history of misinformation promoting videos vs. the state after watching the misinformation debunking content), but also the dynamics of misinformation filter bubble creation and bursting throughout the whole duration of the study. Thus, as the second contribution, we provide a so-far unexplored deeper insight into misinformation filter bubble dynamics, since a continuous evaluation of the proportions of misinformation promoting and debunking videos has not been covered, to the best of our knowledge, by any of the existing auditing studies yet. The key finding is that there is a sudden increase in the number of debunking videos after the first watched debunking video, suggesting a strong contextuality of YouTube's personalization algorithms. We observe this consistently for both the home page results and the recommendations for most examined misinformation topics.
Lastly, part of this work is a replication of the prior works, most notably the work of Hussein et al. [27], who also investigated the creation of misinformation filter bubbles using user simulation. We aligned our methodology with Hussein's study: we re-used Hussein's seed data (topics, queries, and also videos, except those which have been removed in the meantime), used similar scenarios and the same data annotation scheme. As a result, we were able to directly compare the outcomes of both studies, Hussein's and ours, on the number of observed misinformative videos present in recommendations or search results. As the third contribution, we report changes in misinformation video occurrences on YouTube, which took place since the study of Hussein et al. [27] (mid-2019). Due to YouTube's ongoing efforts to improve their recommender systems and policies (e.g., by removing misinformative content or preferring credible sources) [54, 55], we expected to see less filter bubble creation behavior than Hussein et al. While we in general observe a low overall prevalence of misinformation in several topics, there is still room for improvement. More specifically, we observe a worse situation regarding the topics of vaccination and (partially) 9/11 conspiracies, and some improvements (less misinformation) for moon landing or chemtrails conspiracies. In addition, we replicated, to a lesser extent, the works of Hou et al. [26] and Papadamou et al. [38]. We reused their classification models, which we adapted and applied on our collected data. Finally, we used the better performing model of Papadamou et al. [38] to predict the labels for the videos that were not labeled manually.
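For illustration only, the sketch below shows a strongly simplified stand-in for this three-class annotation step (promoting, debunking, neutral): a TF-IDF and logistic regression baseline trained on video titles and descriptions. It is not the adapted model of Papadamou et al. [38] that we actually used; the feature choice and hyperparameters are assumptions of this example.

```python
# Simplified stand-in for three-class video annotation (promoting / debunking /
# neutral). Illustrative baseline only, not the model used in the study.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

def train_video_classifier(texts, labels):
    """texts: one string (title + description) per video;
    labels: 'promoting', 'debunking', or 'neutral'."""
    X_train, X_test, y_train, y_test = train_test_split(
        texts, labels, test_size=0.2, stratify=labels, random_state=42)
    model = make_pipeline(
        TfidfVectorizer(max_features=20000, ngram_range=(1, 2)),
        LogisticRegression(max_iter=1000, class_weight="balanced"))
    model.fit(X_train, y_train)
    print("held-out accuracy:", accuracy_score(y_test, model.predict(X_test)))
    return model

# Usage: predict labels for the videos that were not annotated manually.
# model = train_video_classifier(annotated_texts, annotated_labels)
# predicted_labels = model.predict(unlabeled_texts)
```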
To ensure future replicability of our study and reproducibility of the presented results, the implementation of the experimental infrastructure, the collected data, the manual annotations, as well as the analytical notebooks are all publicly available on GitHub². However, to mitigate the identified ethical risks associated with a possibility of mislabeling a video as promoting misinformation (cf. Section 4.7), we do not publish the labels predicted by our trained machine learning model, only their aggregated numbers necessary to reproduce the results. Nevertheless, since we train and use models published in prior works [26, 38], it is also possible to replicate this part of our study. In addition, we do not publish copyrighted data (video titles, descriptions, etc.). However, this metadata can be downloaded through the YouTube API.

²https://github.com/kinit-sk/yaudit-recsys-2021
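As an illustration of how this metadata can be retrieved, the sketch below queries the YouTube Data API v3 (via the google-api-python-client package) for the title and description of a list of video IDs. The API key handling and the selected fields are assumptions of this example, not part of our published code.

```python
# Hedged sketch: fetching video metadata via the YouTube Data API v3,
# since the copyrighted metadata itself is not republished with the dataset.
from googleapiclient.discovery import build

def fetch_metadata(video_ids, api_key):
    youtube = build("youtube", "v3", developerKey=api_key)
    items = []
    # The videos.list endpoint accepts up to 50 comma-separated IDs per request.
    for i in range(0, len(video_ids), 50):
        batch = ",".join(video_ids[i:i + 50])
        response = youtube.videos().list(part="snippet", id=batch).execute()
        for item in response.get("items", []):
            items.append({
                "id": item["id"],
                "title": item["snippet"]["title"],
                "description": item["snippet"]["description"],
            })
    return items
```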
The rest of the paper is structured as follows. Section 2 looks at the definitions of a filter bubble, echo chambers, and misinformation and uses them to define a misinformation filter bubble, which is the focus of this work. In Section 3, we analyze different types of auditing studies and what the previous works examined with respect to social media (YouTube in particular) and misinformation. In Section 4, we describe in detail our research questions and study methodology, including manual as well as automatic annotation of the collected data. We present the results of training a machine learning model for automatic annotation of videos and compare these results with the relevant related works. We also discuss the identified ethical issues and how we addressed them in our research. The results of the audit itself are presented in Section 5. Finally, the implications of the results and the conclusions are discussed in Section 6.
2 BACKGROUND: FILTER BUBBLES AND MISINFORMATION
In online social media, a phenomenon denoted as virtual echo chambers refers to situations in which the same ideas are repeated, mutually confirmed, and amplified in relatively closed homogeneous groups. Echo chambers may ultimately contribute to increased polarization and fragmentation of society [50, 59]. The tendency of users to join and remain in echo chambers can be intentional (also denoted as self-selected personalization [7]) and explained by their selective exposure (focusing on information that is in accordance with one's worldview) or confirmation bias (reinforcing one's pre-existing beliefs) [6]. To some extent, such users' behavior is a natural human defense against information overload [36]. However, the enclosure of users into echo chambers can also be caused and amplified by adaptive systems (even without users' intention), resulting in the so-called filter bubble effect (also denoted as pre-selected personalization [7]). This effect has serious ethical implications: users are often unaware of the existence of filter bubbles, as well as of the information that was filtered out.
Filter bubbles were first recognized by Pariser [39] in 2011 as a state of intellectual isolation caused by algorithms that personalize users' online experiences, hence exposing users to information and opinions that conform to and reinforce their own beliefs. Filter bubbles quickly became the target of theoretical as well as empirical research efforts from multiple perspectives, such as: 1) exploring the characteristics of filter bubbles and identifying the circumstances of their creation [32]; 2) modelling/quantifying the filter bubble effect [2, 28]; and 3) discovering strategies to prevent or "burst" filter bubbles [10].
The ambiguity and difficult operationalization of Pariser's original definition of filter bubbles led to its different interpretations; inconsistent or even contrasting findings; and finally, also to low generalizability across studies [35]. For these reasons, an operationalized, systematically and empirically verifiable definition of the filter bubble has recently been proposed in [35] as follows: "A technological filter bubble is a decrease in the diversity of a user's recommendations over time, in any dimension of diversity, resulting from the choices made by different recommendation stakeholders." Based on this definition, the authors also stress the criteria for studies addressing the filter bubble effect: they must consider the diversity of recommendations and measure a decrease in diversity over time. In this work, we proceed from this definition and also meet the stated criteria.
We are interested specifically in filter bubbles that are determined by the presence of content spreading disinformation or misinformation. Disinformation is "false, inaccurate, or misleading information designed, presented and promoted to intentionally cause public harm or for profit" [19], while misinformation is false or inaccurate information that is spread regardless of an intention to deceive. In this work, we use the broader term misinformation since we are interested in any kind of false or inaccurate information regardless of the intention behind its creation and spreading (in contrast to most current reports and EU legislation that opt for the term disinformation and thus emphasize the necessary presence of intention). Due to the significant negative consequences of misinformation for our society (especially during the ongoing COVID-19 pandemic), tackling misinformation has also attracted a plethora of research works (see [9, 56, 58] for recent surveys). The majority of such research focuses on various characterization studies [46] or detection methods [40, 49].
We denote filter bubbles that are characterized by the increased prevalence of such misinformative content as misinformation filter bubbles. They are states of intellectual isolation in false beliefs or manipulated perceptions of reality. Following the adopted definition, filter bubbles in general are characterized by a decrease in any dimension of diversity. We can broadly distinguish three types of diversity [35]: structural, topical, and viewpoint diversity. Misinformation filter bubbles can be considered a special case of a decrease of viewpoint diversity in which the viewpoints represented are provably false. Analogously to topical filter bubbles, misinformation filter bubbles can be characterized by a high homogeneity of recommendations/search results that share the same positive stance towards misinformation. In other words, the content adaptively presented to a user in a misinformation filter bubble supports one or several false claims/narratives. While topical filter bubbles are not necessarily undesirable (they may be intended and even positively perceived by users [8, 30]), misinformation filter bubbles are by definition more problematic and cause indisputable negative effects [12, 21, 30].
Reecting the adopted denition of the lter bubble [
35
], we do not consider misinformation
lter bubble as a binary state at a single moment (i.e., following the current recommended items, a
user is/is not in the misinformation lter bubble), but as the interval measure reecting how deep
inside the bubble the user is. Such a measure is determined by the proportion of misinformative
content and calculated at dierent points in time. This denition and operationalization of lter
bubbles emphasizes our second contribution in so-far unexplored deeper insight into the dynamics
of lter bubbles since we calculated the proportion of misinformative content not only in the
selected time points but continuously over the whole duration of the study.
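To illustrate this operationalization (and the continuous evaluation behind our second contribution), the proportion of misinformation promoting and debunking items can be computed for every recorded snapshot of recommendations, yielding a trajectory of how deep inside the bubble an agent is over time. The snapshot structure assumed below is an illustration, not our actual data format.

```python
# Illustrative computation of the interval measure: the share of promoting and
# debunking items among the recommendations observed at each point in time.
from collections import Counter

def bubble_depth_over_time(snapshots):
    """snapshots: a list of label lists ('promoting'/'debunking'/'neutral'),
    one list per point in time (e.g., per watched video)."""
    trajectory = []
    for labels in snapshots:
        counts = Counter(labels)
        total = max(len(labels), 1)  # guard against empty snapshots
        trajectory.append({
            "promoting": counts["promoting"] / total,
            "debunking": counts["debunking"] / total,
        })
    return trajectory

# A drop in the 'promoting' share after debunking videos are watched would
# correspond to the bubble-bursting effect discussed in this paper.
```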
To prevent misinformation and misinformation filter bubbles, social media platforms take various countermeasures. These are usually reactions to public outcry or are required by legislation, e.g., the EU's Code of Practice on Disinformation [17]. Currently, the effectiveness of such countermeasures is evaluated mainly by self-evaluation reports. This approach has, however, already been recognized as insufficient due to a lack of evidence and objectivity, since social media are reluctant to provide access to their data for independent research [16]. In addition, the commercial aims of social media may contradict pro-social interests, as also revealed by the recent whistleblowing case Facebook Files³ [25]. The verification of countermeasures is further complicated by the interference of psychological factors. For example, some researchers argue that users' intentional self-selected personalization is more influential than algorithms' pre-selected personalization when it comes to intellectual isolation [3, 15].
An alternative solution towards responsible and governed AI in social media and eliminating its negative social impact is the employment of independent audits. Such audits, which are carried out by an external auditor independent from the company developing the audited AI algorithm, are envisaged also in the proposal of upcoming EU legislation [20]. Nevertheless, the auditing

³Facebook Files denotes a leak of internal documents revealing that the company was aware of the negative societal impact caused by the platform, including its algorithms, like spreading and preferring harmful or controversial content. At the same time, the documents showed that the company's reactions to the real-world harms thus caused were not sufficient.