Detecting Propagators of Disinformation on Twitter Using

Quantitative Discursive Analysis

Mark M. Bailey, PhD¹

¹Cyber Intelligence and Data Science, Oettinger School of Science and Technology Intelligence,

National Intelligence University, Bethesda, MD

Disclaimer: All statements of fact, analysis, or opinion are the author’s and do not reflect the

official policy or position of the National Intelligence University, the Department of Defense or

any of its components, or the U.S. government.

Abstract

Efforts by foreign actors to influence public opinion have gained considerable attention because

of their potential to impact democratic elections. Thus, the ability to identify and counter

sources of disinformation is increasingly becoming a top priority for government entities in order

to protect the integrity of democratic processes. This study presents a method of identifying

Russian disinformation bots on Twitter using centering resonance analysis and Clauset-Newman-

Moore community detection. The data reflect a significant degree of discursive dissimilarity

between known Russian disinformation bots and a control set of Twitter users during the

timeframe of the 2016 U.S. Presidential Election. The data also demonstrate statistically

significant classification capabilities (MCC = 0.9070) based on community clustering. The

prediction algorithm is very effective at identifying true positives (bots), but is not able to resolve

true negatives (non-bots) because of the lack of discursive similarity between control users. This

leads to a highly sensitive means of identifying propagators of disinformation with a high degree

of discursive similarity on Twitter, with implications for limiting the spread of disinformation

that could impact democratic processes.

Background

Efforts by foreign actors to influence public opinion via the spread of disinformation have been

thrust into the spotlight recently because of their potential to disrupt democratic processes [1].

While the injection of propaganda and misleading information into public discourse by

adversaries is not a new phenomenon, the magnitude of its effects and reach can be greatly

enhanced by the connectedness of social media [2]. Thus, the ability to identify and counter

sources of disinformation is increasingly becoming a top priority for government entities in order

to protect election integrity.

Disinformation source detection and countermeasure development rely heavily on mathematical

representations of social interactions and textual information. There is a significant amount of

research in the area of social network analysis [3], and quantitative analysis of text [4], for user

classification and event detection. Additionally, several methods have already been developed

that can identify disinformation vectors, i.e., “bots,” using artificial intelligence [5], [6]. One

method of quantitative textual analysis – centering resonance analysis – was developed by

Corman et al. and is a form of network text analysis that relies on elements of graph theory to

build mathematical representations of object associations within bodies of text [7].

Building on work by Corman et al. [7], this study applies centering resonance analysis to a

body of aggregated tweets from a previously identified set of Russian bots [8], as well as a set

of randomly selected tweets as a control set [9]. By representing aggregated tweet text for each

Twitter user as graphs of noun phrases, one can represent discursive similarity (“resonance”) as

the dot product of the centralities of common vertices, where each vertex represents a noun from

a noun phrase (discursive object), and edges represent connections between nouns (object

associations). By analyzing a connected graph of all users, where edges are defined by resonance

between vertices (i.e., discursive similarity between users), the Clauset-Newman-Moore

hierarchical agglomeration algorithm can be applied to identify discursive communities [10].

Because they exist for a singular purpose, Russian bots are likely to be very limited in the topics

they discuss on Twitter relative to the general population of Twitter users. Thus, it is

hypothesized that they will show greater discursive similarity with each other than with the

general Twitter population. This study will demonstrate this effect and will show that Russian

bots aggregate within distinct graph communities. Identified communities can be used to

develop recursive algorithms for disinformation propagator bot detection.

Methods

Software

Python version 2.7/3.1 was used in this analysis. NetworkX was used for network analysis, and

the Natural Language Toolkit (NLTK) and TextBlob were used for natural language processing.

Approach

The method employed in this study leverages centering resonance analysis to quantify discursive

similarity between Twitter users – either known Russian bots, or randomly selected control users

[7]. A set of tweets from known Russian bots that were active during the 2016 U.S. presidential

election, as well as a body of control tweets from random users, were acquired [8], [9]. Tweet

text was aggregated by user, and all concatenated text aggregates were preprocessed.

Preprocessing included punctuation removal, lemmatization, and case normalization. After

preprocessing, noun phrases were extracted from each corpus, and noun edges and vertices were

enumerated. Graphical representations of noun phrases were then generated for each user as a

discursive graph – a mathematical representation of the user’s discursive object associations.

Graph resonances, which measure the discursive similarity between users, were then calculated for

each user pair, and an association matrix was constructed. A graphical representation of the

overall network, in which edges between users represent discursive similarity, was then constructed

and optimized for community detection. The entire process is outlined in Figure 1:

Figure 1: Process diagram.
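The last step of this process, community detection on the user network, can be sketched in Python with NetworkX, whose greedy_modularity_communities function implements Clauset-Newman-Moore greedy modularity maximization. This is a minimal illustration rather than the study's implementation: the account names and similarity scores are invented toy values, and using the resonance directly as an edge weight (and dropping zero-resonance pairs) is an assumption, since the text does not specify how the network was optimized.

```python
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

# Hypothetical pairwise similarity scores (toy values, not study data).
similarity = {
    ("a1", "a2"): 0.90,
    ("a1", "a3"): 0.80,
    ("a2", "a3"): 0.85,
    ("b1", "b2"): 0.70,
    ("b1", "b3"): 0.60,
    ("b2", "b3"): 0.65,
    ("a3", "b1"): 0.05,
}

# Build the user network: one vertex per account, one weighted edge per
# pair with non-zero similarity (an assumed construction rule).
network = nx.Graph()
for (user_u, user_v), score in similarity.items():
    if score > 0:
        network.add_edge(user_u, user_v, weight=score)

# Clauset-Newman-Moore greedy modularity maximization on the weighted graph.
communities = greedy_modularity_communities(network, weight="weight")

for index, members in enumerate(communities):
    print(index, sorted(members))
```

On this toy input the two tightly connected triads end up in separate communities, because the single weak edge between them contributes too little to justify a merge.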

In any graph, the betweenness centrality of a vertex (v) is defined as the sum of the fraction of

all-pairs shortest paths that pass through the vertex, and is given in Equation 1:

$$c_B(v) = \sum_{s,t \in V} \frac{\sigma(s,t \mid v)}{\sigma(s,t)} \tag{1}$$

In this equation, V is the set of vertices, σ(s,t) is the number of shortest (s,t) paths, and σ(s,t|v) is

the number of those paths passing through some vertex v other than s,t [11]. Betweenness

centralities were calculated for each user graph.
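As a minimal sketch of this step, the following Python code builds a small discursive graph and computes betweenness centralities with NetworkX. It assumes the noun phrases have already been extracted (for example with TextBlob) and, as a simplifying assumption, links every pair of words within a phrase. The function name build_discursive_graph and the sample phrases are hypothetical, and lemmatization is omitted for brevity.

```python
import string
from itertools import combinations

import networkx as nx


def build_discursive_graph(noun_phrases):
    """Return an undirected graph whose vertices are the words of each phrase.

    Assumption: all words inside one phrase are pairwise linked, and
    repeated links collapse into a single edge.
    """
    strip_punctuation = str.maketrans("", "", string.punctuation)
    graph = nx.Graph()
    for phrase in noun_phrases:
        # Case normalization and punctuation removal (no lemmatization here).
        cleaned = phrase.lower().translate(strip_punctuation)
        # Drop duplicate words within a phrase to avoid self-loops.
        words = list(dict.fromkeys(cleaned.split()))
        graph.add_nodes_from(words)
        graph.add_edges_from(combinations(words, 2))
    return graph


# Hypothetical noun phrases from one user's aggregated tweets.
phrases = ["Election fraud", "fraud claims", "claims report"]
user_graph = build_discursive_graph(phrases)

# Betweenness centrality of every vertex, scaled to the range [0, 1].
centrality = nx.betweenness_centrality(user_graph, normalized=True)
print(centrality)
```

The sample phrases form the chain election, fraud, claims, report, so the two interior words carry all of the shortest paths and the two end words have a centrality of zero.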

Word resonance – a measure of the discursive similarity between bodies of text – was calculated

as the dot product of the betweenness centrality vectors of the common set of vertices between

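two user graphs (A and B), as follows [7]:

$$WR_{AB} = \sum_{v \in V_A \cap V_B} c^{(A)}(v)\, c^{(B)}(v) \tag{2}$$

Here V_A and V_B are the vertex sets of the two user graphs, and c^{(A)}(v) and c^{(B)}(v) are the

betweenness centralities of vertex v in graphs A and B.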

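To construct a standardized measure, the resonance is normalized:

$$WR'_{AB} = \frac{\sum_{v \in V_A \cap V_B} c^{(A)}(v)\, c^{(B)}(v)}{\sqrt{\sum_{v \in V_A} c^{(A)}(v)^2 \cdot \sum_{v \in V_B} c^{(B)}(v)^2}} \tag{3}$$

where the sums in the denominator run over every vertex of user graphs A and B, respectively, and

c^{(A)}(v) denotes the betweenness centrality of v in graph A.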
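The resonance calculation can be sketched in a few lines of Python. The functions below take two dictionaries that map each vertex to its betweenness centrality. The raw resonance follows the dot-product definition given in the text. The cosine-style normalization is an assumption modeled on the standardized resonance of Corman et al. [7]. The function names and centrality values are hypothetical.

```python
import math


def word_resonance(cent_a, cent_b):
    """Dot product of centralities over the vertices shared by both graphs."""
    common = cent_a.keys() & cent_b.keys()
    return sum(cent_a[v] * cent_b[v] for v in common)


def normalized_resonance(cent_a, cent_b):
    """Resonance divided by the magnitudes of both centrality vectors.

    Assumed normalization; returns 0.0 when either vector is all zeros.
    """
    norm_a = math.sqrt(sum(c * c for c in cent_a.values()))
    norm_b = math.sqrt(sum(c * c for c in cent_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return word_resonance(cent_a, cent_b) / (norm_a * norm_b)


# Toy centrality vectors for two users (invented values).
user_a = {"election": 0.5, "fraud": 0.5, "vote": 0.0}
user_b = {"election": 0.5, "recipe": 0.5}

print(word_resonance(user_a, user_b))
print(normalized_resonance(user_a, user_b))
```

Under this normalization a user compared with itself scores 1, and two users with no shared vertices score 0, which makes the values comparable across user pairs with very different vocabulary sizes.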
