1 Detecting Propagators of Disinformation on Twitter Using Quantitative Discursive Analysis Mark M. Bailey PhD1

Detecting Propagators of Disinformation on Twitter Using
Quantitative Discursive Analysis
Mark M. Bailey, PhD1
1Cyber Intelligence and Data Science, Oettinger School of Science and Technology Intelligence,
National Intelligence University, Bethesda, MD
Disclaimer: All statements of fact, analysis, or opinion are the author’s and do not reflect the
official policy or position of the National Intelligence University, the Department of Defense or
any of its components, or the U.S. government.
Abstract
Efforts by foreign actors to influence public opinion have gained considerable attention because
of their potential to impact democratic elections. Thus, the ability to identify and counter
sources of disinformation is increasingly becoming a top priority for government entities in order
to protect the integrity of democratic processes. This study presents a method of identifying
Russian disinformation bots on Twitter using centering resonance analysis and Clauset-Newman-
Moore community detection. The data reflect a significant degree of discursive dissimilarity
between known Russian disinformation bots and a control set of Twitter users during the
timeframe of the 2016 U.S. Presidential Election. The data also demonstrate statistically
significant classification capabilities (MCC = 0.9070) based on community clustering. The
prediction algorithm is very effective at identifying true positives (bots), but is not able to resolve
true negatives (non-bots) because of the lack of discursive similarity between control users. This
leads to a highly sensitive means of identifying propagators of disinformation with a high degree
of discursive similarity on Twitter, with implications for limiting the spread of disinformation
that could impact democratic processes.
Background
Efforts by foreign actors to influence public opinion via the spread of disinformation have been
thrust into the spotlight recently because of their potential to disrupt democratic processes [1].
While the injection of propaganda and misleading information into public discourse by
adversaries is not a new phenomenon, the magnitude of its effects and reach can be greatly
enhanced by the connectedness of social media [2]. Thus, the ability to identify and counter
sources of disinformation is increasingly becoming a top priority for government entities in order
to protect election integrity.
Disinformation source detection and countermeasure development rely heavily on mathematical
representations of social interactions and textual information. There is a significant amount of
research in the area of social network analysis [3], and quantitative analysis of text [4], for user
classification and event detection. Additionally, several methods have already been developed
that can identify disinformation vectors, i.e., “bots,” using artificial intelligence [5], [6]. One
method of quantitative textual analysis, centering resonance analysis, was developed by
Corman et al. and is a form of network text analysis that relies on elements of graph theory to
build mathematical representations of object associations within bodies of text [7].
Building on work by Corman et al. [7], this study applies centering resonance analysis to a
body of aggregated tweets from a previously identified set of Russian bots [8], as well as a set
of randomly selected tweets as a control set [9]. By representing aggregated tweet text for each
Twitter user as graphs of noun phrases, one can represent discursive similarity (“resonance”) as
the dot product of the centralities of common vertices, where each vertex represents a noun from
a noun phrase (discursive object), and edges represent connections between nouns (object
associations). By analyzing a connected graph of all users, where edges are defined by resonance
between vertices (i.e., discursive similarity between users), the Clauset-Newman-Moore
hierarchical agglomeration algorithm can be applied to identify discursive communities [10].
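As a rough illustration of the community detection step, the sketch below performs greedy modularity agglomeration in the spirit of Clauset-Newman-Moore: every user starts in its own community, and the pair of communities whose merge most increases modularity is joined until no merge helps. It is a naive version that recomputes all quantities on every pass and omits the heap structures that make the published algorithm efficient. The function name `greedy_modularity_merge`, the edge-dictionary input format, and the toy weights are assumptions made for this example, not details taken from the study.

```python
from collections import defaultdict


def greedy_modularity_merge(edges):
    """Naive greedy modularity agglomeration (illustrative sketch).

    edges: dict mapping (u, v) tuples to positive weights for an undirected
    graph without self-loops (here, users linked by resonance).
    Returns a list of sets of nodes, one set per community.
    """
    two_m = 2.0 * sum(edges.values())
    community = {}                       # node -> community label
    for u, v in edges:
        community.setdefault(u, u)
        community.setdefault(v, v)
    members = {label: {label} for label in community}

    while True:
        a = defaultdict(float)           # fraction of edge ends per community
        e = defaultdict(float)           # fraction of weight between two communities
        for (u, v), w in edges.items():
            cu, cv = community[u], community[v]
            a[cu] += w / two_m
            a[cv] += w / two_m
            if cu != cv:
                e[frozenset((cu, cv))] += w / two_m

        best_gain, best_pair = 0.0, None
        for pair, e_ij in e.items():
            ci, cj = tuple(pair)
            gain = 2.0 * (e_ij - a[ci] * a[cj])   # modularity change if merged
            if gain > best_gain:
                best_gain, best_pair = gain, (ci, cj)

        if best_pair is None:            # no merge increases modularity
            break
        ci, cj = best_pair
        for node in members[cj]:
            community[node] = ci
        members[ci] |= members.pop(cj)

    return list(members.values())


# Hypothetical toy network: two tightly knit groups joined by one weak link.
toy_edges = {
    ("a", "b"): 1.0, ("b", "c"): 1.0, ("a", "c"): 1.0,
    ("d", "e"): 1.0, ("e", "f"): 1.0, ("d", "f"): 1.0,
    ("c", "d"): 1.0,
}
communities = greedy_modularity_merge(toy_edges)
```

For real data, NetworkX provides `greedy_modularity_communities`, which implements the same greedy strategy far more efficiently.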
Because they exist for a singular purpose, Russian bots are likely to be very limited in the topics
they discuss on Twitter relative to the general population of Twitter users. Thus, it is
hypothesized that they will show greater discursive similarity with each other than with the
general Twitter population. This study will demonstrate this effect and will show that Russian
bots aggregate within distinct graph communities. Identified communities can be used to
develop recursive algorithms for disinformation propagator bot detection.
Methods
Software
Python version 2.7/3.1 was used in this analysis. The NetworkX library was used for network
analysis, and the Natural Language Toolkit (NLTK) and TextBlob libraries were used for natural
language processing.
Approach
The method employed in this study leverages centering resonance analysis to quantify discursive
similarity between Twitter users, either known Russian bots or randomly selected control users
[7]. A set of tweets from known Russian bots that were active during the 2016 Presidential
election, as well as a body of control tweets from random users, were acquired [8], [9]. Tweet
text was aggregated by user, and all concatenated text aggregates were preprocessed.
Preprocessing included punctuation removal, lemmatization, and case normalization. After
preprocessing, noun phrases were extracted from each corpus, and noun edges and vertices were
enumerated. Graphical representations of noun phrases were then generated for each user as a
discursive graph, a mathematical representation of the user’s discursive object associations.
Graph resonances, a measure of user interaction, were then calculated for each user pair, and
an association matrix was constructed. A graphical representation of the overall network, where
edges between users represent discursive similarity, was then constructed and optimized for
community detection. The entire process is outlined in Figure 1:
Figure 1: Process diagram.
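The graph-construction step of this process can be sketched as follows. The example assumes noun phrases have already been extracted and preprocessed (for instance with TextBlob's `noun_phrases` property) and are supplied as lists of nouns. The function name `build_discursive_graph` and the rule of linking every pair of nouns that share a phrase are assumptions for illustration; the text does not specify the exact edge rule.

```python
from itertools import combinations


def build_discursive_graph(noun_phrases):
    """Build an undirected noun graph as an adjacency dict {noun: set(neighbors)}.

    noun_phrases: list of lists of lemmatized, lower-cased nouns, one inner
    list per noun phrase found in a user's aggregated tweet text.
    Assumed edge rule: nouns appearing in the same phrase are linked.
    """
    adjacency = {}
    for phrase in noun_phrases:
        nouns = sorted(set(phrase))          # drop repeats within a phrase
        for noun in nouns:
            adjacency.setdefault(noun, set())
        for first, second in combinations(nouns, 2):
            adjacency[first].add(second)
            adjacency[second].add(first)
    return adjacency


# Hypothetical phrases for a single user.
example_phrases = [
    ["election", "fraud"],
    ["election", "vote", "fraud"],
    ["media"],
]
example_graph = build_discursive_graph(example_phrases)
```

An adjacency dictionary keeps the sketch dependency-free; the same vertices and edges could equally be loaded into a NetworkX `Graph`.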
In any graph, the betweenness centrality of a vertex (v) is defined as the sum of the fraction of
all-pairs shortest paths that pass through the vertex, and is given in Equation 1:

$$C(v) = \sum_{s,t \in V} \frac{\sigma(s,t \mid v)}{\sigma(s,t)} \tag{1}$$
In this equation, V is the set of vertices, σ(s,t) is the number of shortest (s,t) paths, and σ(s,t|v) is
the number of those paths passing through some vertex v other than s,t [11]. Betweenness
centralities were calculated for each user graph.
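To make the definition concrete, the sketch below computes betweenness centrality directly from shortest-path counts on a small unweighted, undirected graph. It is a brute-force illustration rather than the routine used in the study, and it assumes each unordered pair (s,t) is counted once and that the result is left unnormalized. The function names and the adjacency-dictionary input are assumptions for this example.

```python
from collections import deque


def shortest_path_counts(adjacency, source):
    """Breadth-first search from source.

    Returns (dist, sigma): dist[v] is the shortest-path length from source
    to v, and sigma[v] is the number of distinct shortest paths.
    """
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adjacency[u]:
            if w not in dist:                # first time w is reached
                dist[w] = dist[u] + 1
                sigma[w] = 0
                queue.append(w)
            if dist[w] == dist[u] + 1:       # u precedes w on a shortest path
                sigma[w] += sigma[u]
    return dist, sigma


def betweenness(adjacency):
    """Unnormalized betweenness centrality for an undirected graph."""
    nodes = list(adjacency)
    info = {s: shortest_path_counts(adjacency, s) for s in nodes}
    result = {v: 0.0 for v in nodes}
    for i, s in enumerate(nodes):
        dist_s, sigma_s = info[s]
        for t in nodes[i + 1:]:              # each unordered pair once
            if t not in dist_s:              # s and t are disconnected
                continue
            for v in nodes:
                if v == s or v == t or v not in dist_s:
                    continue
                dist_v, sigma_v = info[v]
                # v lies on a shortest s-t path when the distances add up
                if t in dist_v and dist_s[v] + dist_v[t] == dist_s[t]:
                    result[v] += sigma_s[v] * sigma_v[t] / sigma_s[t]
    return result


# Hypothetical three-noun chain: only the middle noun lies between others.
chain = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
chain_centrality = betweenness(chain)
```

For graphs of realistic size, NetworkX's `betweenness_centrality` is the practical choice.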
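Word resonance, a measure of the discursive similarity between bodies of text, was calculated
as the dot product of the betweenness centrality vectors of the common set of vertices between
two user graphs (A and B), as follows [7]:

$$WR_{AB} = \sum_{v \in V_A \cap V_B} C_A(v)\, C_B(v) \tag{2}$$

Here, V_A and V_B are the vertex sets of graphs A and B, and C_A(v) and C_B(v) are the
betweenness centralities of vertex v in each graph.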
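To construct a standardized measure, the resonance is normalized by the magnitudes of the two
betweenness centrality vectors:

$$WR'_{AB} = \frac{WR_{AB}}{\sqrt{\sum_{v \in V_A} C_A(v)^2 \, \sum_{v \in V_B} C_B(v)^2}} \tag{3}$$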
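The resonance calculation can be illustrated with a short sketch that takes each user's betweenness centralities as a dictionary keyed by noun. The cosine-style normalization shown here is an assumption based on the standardized resonance described by Corman et al. [7]; the function names and the toy centrality values are likewise hypothetical.

```python
import math
from itertools import combinations


def word_resonance(cent_a, cent_b):
    """Dot product of centralities over the nouns common to both graphs."""
    common = set(cent_a) & set(cent_b)
    return sum(cent_a[v] * cent_b[v] for v in common)


def normalized_resonance(cent_a, cent_b):
    """Resonance scaled by the magnitudes of both centrality vectors.

    Assumed normalization; returns 0.0 when either vector is all zeros.
    """
    norm = math.sqrt(
        sum(x * x for x in cent_a.values()) * sum(x * x for x in cent_b.values())
    )
    return word_resonance(cent_a, cent_b) / norm if norm else 0.0


def association_matrix(centralities_by_user):
    """Normalized resonance for every unordered pair of users."""
    return {
        (u, w): normalized_resonance(centralities_by_user[u], centralities_by_user[w])
        for u, w in combinations(sorted(centralities_by_user), 2)
    }


# Hypothetical centralities for three users.
users = {
    "user_1": {"election": 0.5, "vote": 0.5},
    "user_2": {"vote": 1.0, "fraud": 0.0},
    "user_3": {"recipe": 1.0},
}
pairwise = association_matrix(users)
```

Pairs with no nouns in common receive a resonance of zero, so they would contribute no edge to the user network on which community detection is run.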