“A Special Operation”: A Quantitative Approach to Dissecting and Comparing
Different Media Ecosystems’ Coverage of the Russo-Ukrainian War
Hans W.A. Hanley, Deepak Kumar, Zakir Durumeric
Stanford University
hhanley@stanford.edu, kumarde@stanford.edu, zakird@stanford.edu
Abstract
The coverage of the Russian invasion of Ukraine has var-
ied widely between Western, Russian, and Chinese media
ecosystems with propaganda, disinformation, and narrative
spins present in all three. By utilizing the normalized point-
wise mutual information metric, differential sentiment analy-
sis, word2vec models, and partially labeled Dirichlet alloca-
tion, we present a quantitative analysis of the differences in
coverage amongst these three news ecosystems. We find that
while the Western press outlets have focused on the military
and humanitarian aspects of the war, Russian media have fo-
cused on the purported justifications for the “special military
operation” such as the presence in Ukraine of “bio-weapons”
and “neo-nazis”, and Chinese news media have concentrated
on the conflict’s diplomatic and economic consequences. De-
tecting the presence of several Russian disinformation nar-
ratives in the articles of several Chinese media outlets, we
finally measure the degree to which Russian media has influ-
enced Chinese coverage across Chinese outlets’ news articles,
Weibo accounts, and Twitter accounts. Our analysis indicates
that since the Russian invasion of Ukraine, Chinese state me-
dia outlets have increasingly cited Russian outlets as news
sources and spread Russian disinformation narratives.
1 Introduction
On February 24, 2022, Russian Federation President
Vladimir Putin announced a “special military operation to
demilitarize and denazify” Ukraine (Thompson and Myers
2022). Following the initial invasion, media outlets in var-
ious parts of the world covered the war in drastically dif-
ferent lights. For instance, the Western press (e.g., CNN,
Fox News, New York Times) labeled the “special operation”
a “war crime” laden “unprovoked invasion” perpetrated by
the Russian government (Phillip 2022). Russian outlets (e.g.,
Sputnik News, Russia Today), in turn, have largely denied
any war crimes, placing fault for the necessity of the “spe-
cial operation” on Western countries (Thompson and My-
ers 2022). Chinese state media outlets (e.g., China Today,
People’s Daily) meanwhile have advocated for diplomacy
while simultaneously blaming Western powers for sparking
the conflict. However, despite the evident differences and the
large impact these differences have had on different
populations’ perceptions of the war, there has not been a
systematic analysis of the narratives present in these three
media ecosystems (McCarthy and Xiong 2022). In this work,
we present one of the first quantitative analyses of the key
differences and similarities between the narratives touted
by Chinese, Russian, and Western outlets about the
Russo-Ukrainian War. Specifically, we propose an approach to
determine and compare which topics media ecosystems report
on, how each ecosystem reported on those topics, and
finally whether there was influence between different media
ecosystems (e.g., Russian to Chinese).
Copyright © 2023, Association for the Advancement of Artificial
Intelligence (www.aaai.org). All rights reserved.
To perform our analysis, we first curate a dataset of 11,359
different articles about Ukraine from Western (4,536 arti-
cles from eight outlets), Russian (3,572 articles from ten
outlets), and Chinese (3,251 articles from seven outlets)
news ecosystems. Utilizing differential sentiment analysis
and word2vec models, we then detail how each ecosys-
tem has largely covered aspects of the Russo-Ukrainian
War. Using a normalized scaled pointwise mutual informa-
tion metric (NPMI) and partially labeled Dirichlet alloca-
tion (PLDA), we then extract the most characteristic words
and topics from each ecosystem’s articles. We quantitatively
show that while Western outlets have persistently labeled the
“invasion” as a “war” and described the “crimes” committed
throughout Ukraine, both Russian and Chinese news out-
lets have largely characterized the invasion as a “crisis” or a
“conflict.” Similarly, while the Western press has focused on
the humanitarian and day-to-day military aspects of the war,
Russian outlets have focused on justifications for the
“special military operation”, such as the presence of
“bio-weapons” and “neo-nazis” in Ukraine, and Chinese news
outlets have concentrated on the diplomatic and economic
fallout of the invasion.
After performing our comparative topic analysis, we iden-
tified the repeated presence of several Russian disinfor-
mation narratives (Price 2022) within the Chinese news
ecosystem, particularly about US-funded Ukrainian biologi-
cal weapons facilities. We thus measure the degree to which
Russian news outlets have influenced Chinese news outlets’
coverage. Specifically, we document how often seven Chinese
outlets cite Russian news outlets as sources and reuse
Russian-sourced images in their coverage of the war on their
websites, Twitter accounts, and Weibo accounts (a Chinese
version of Twitter).
arXiv:2210.03016v4 [cs.CY] 31 May 2023
Western                       Russian                         Chinese
Domain              Articles  Domain                Articles  Domain            Articles
cnn.com                  477  tass.com                   861  chinadaily.com.cn     1044
nytimes.com              571  sputniknews.com            743  cgtn.com               966
washingtonpost.com       414  news-front.info            508  globaltimes.cn         367
theguardian.com          636  geopolitica.ru              80  ecns.cn                702
yahoo.com                774  southfront.org             241  xinhuanet.com          430
reuters.com              716  katehon.com                 65  pdnews.cn              286
foxnews.com              566  journal-neo.org             90  english.cctv.com       107
nbcnews.com              382  rt.com                     689  –                        –
–                          –  strategic-culture.org      102  –                        –
–                          –  waronfakes.com             193  –                        –
Total                   4536                            3572                        3251
Table 1: We gather a set of English-language articles about Ukraine from Western, Russian, and Chinese media ecosystems.
Observing a marked increase in Chinese state media citations of
Russian sources beginning in early February 2022, we fi-
nally measure how an extended group of 39 Chinese me-
dia outlets interacted with and promoted Russian disinfor-
mation narratives on Weibo. Looking at the popularity of
Chinese news outlets’ posts about Russian disinformation
topics on Weibo, we find that these posts enjoyed higher
levels of popularity compared to posts that do not reference
these disinformation stories. Finally, fitting a negative bino-
mial regression to model the number of Weibo posts from
Chinese news outlets about different Russian disinformation
campaigns, we find that as Chinese news outlets cite more
Russian outlets as news sources, they are more likely to post
disinformation.
Our work underscores the importance of performing anal-
yses across multiple platforms and media ecosystems in un-
derstanding the nuances of how global events are framed,
how different populations interpret and digest world events,
and how disinformation originates and spreads. The
Russo-Ukrainian War is a global event whose implications every
country must consider, ranging from skyrocketing international
food prices and the resettlement of refugees to threats of
nuclear fallout (Treisman 2022; Thompson and Myers 2022);
focusing only on news and campaigns targeted at Western
audiences to understand how populations are processing these
implications can only go so far. We hope our quantitative
approach can serve as the basis for future studies.
2 Methodology
To perform our comparative analysis of the attitudes, nar-
ratives, and topics discussed by the Western press, Chinese
state media, and Russian propaganda websites, we collect a
total of 11,359 unique news articles published between Jan-
uary 1, 2022, and April 15, 2022 (Table 1). To later under-
stand the degree of Russian influence on Chinese media, we
collect the social media feeds of major Chinese state media
outlets and Russian state actors on both Weibo and Twitter.
News Articles. Our news article dataset consists of
published pieces from Western news websites, English-
language Russian websites, and English-language Chinese
websites (Table 1). For lack of a better term, we use the term
“Western” to describe press widely circulated in the global
“West” (e.g., US, UK) (Wes 2022). We refer to websites as
“Russian” if they are Russian state media, were identified as
“proxies” for the Russian government, or are Russian propa-
ganda (Rus 2020). Lastly, we refer to websites as “Chinese”
if they are Chinese state media outlets.
For our list of Western outlets, we manually selected eight
highly popular mainstream news websites from across the
political spectrum (Zannettou et al. 2017). In addition to a
set of nine Russian websites identified by the US State De-
partment (Rus 2020), for our Russian dataset, we include
the recently launched waronfakes.com. Since its initial ap-
pearance in March 2022, the New York Times and others
have investigated the site as a hub of Russian disinforma-
tion (Thompson and Myers 2022; Hanley, Kumar, and Du-
rumeric 2023). For our list of Chinese media news web-
sites, we utilize seven English-language news websites iden-
tified by the US State Department as Chinese “foreign mis-
sions” (Ortagus 2020). We recognize that these lists do not
incorporate all articles circulated in each media ecosystem
and thus are naturally biased. However, our selection of
websites does represent a cross-section of some of the most
widely circulated news sources in each ecosystem and thus
provides indications of reporting for Western (You 2022),
Russian (Rus 2020), and Chinese (Ortagus 2020) news media.
We utilize a breadth-first scraping algorithm and the
Python Selenium package to collect the set of English-
language articles that each of our websites published about
Ukraine. Specifically, for each website, we scrape 5 hops
from the root page (i.e., we collect all URLs linked from the
homepage [1st hop], then all URLs linked from those pages
[2nd hop], and so forth). To get Ukraine-related articles, for
each website page, we use the Python newspaper3k library
to collect article contents and to determine if the article
mentions “Ukraine”. We further supplement this corpus by
using Google’s API to find and add articles indexed in 2022
that mention Ukraine. Because newspaper3k does not reliably
extract publication dates, we utilize the Python library
htmldate to extract each article’s publication date.
Altogether, between January 1, 2022, and April 15, 2022, we
collect 11,359 articles about Ukraine: 4,536 from Western
press outlets, 3,572 from Russian propaganda websites, and
3,251 from Chinese state media (Table 1).
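The hop-limited crawl described above can be sketched as a breadth-first search. This is a minimal illustration, not the authors' scraper: `crawl_breadth_first`, `get_links`, and `mentions_ukraine` are hypothetical names, and `get_links` stands in for whatever fetches a page and returns its outgoing URLs (Selenium in the paper).

```python
from collections import deque
from typing import Callable, Dict, Iterable


def crawl_breadth_first(
    root: str,
    get_links: Callable[[str], Iterable[str]],
    max_hops: int = 5,
) -> Dict[str, int]:
    """Map every URL reachable within max_hops of root to its hop count."""
    hops = {root: 0}
    queue = deque([root])
    while queue:
        url = queue.popleft()
        if hops[url] == max_hops:
            continue  # pages at the hop limit are kept but not expanded
        for link in get_links(url):
            if link not in hops:  # in BFS the first visit is the shortest path
                hops[link] = hops[url] + 1
                queue.append(link)
    return hops


def mentions_ukraine(text: str) -> bool:
    """Hypothetical keyword filter applied to extracted article text."""
    return "ukraine" in text.lower()
```

In practice `get_links` would also need to restrict links to the outlet's own domain; that filtering is omitted here because the paper does not describe it.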
Weibo Dataset. To understand the degree of Russian influence
on Chinese media reports and discussions surrounding the
Russo-Ukrainian War, we also collect posts from Weibo, a
Chinese Mandarin-language version of Twitter (McCarthy and
Xiong 2022). We collect the posts of the accounts of the
seven Chinese state media organizations from our news
article dataset (for the CGTN news organization, we collect
the Weibo posts of its CGTN and CGTN journalist group
accounts). To help quantify the connection of each of these
media organizations to the Russian government and Russian
state media, we further scrape the Weibo accounts of the
Russian Embassy, Russia Today, and Sputnik News. Lastly, in
addition to our Chinese news organizations’ Weibo accounts
and Russian state-sponsored Weibo accounts, we collect the
Weibo posts of the 200 users who most prominently discussed
the Russo-Ukrainian conflict at the end of February, as
labeled by Fung and Ji (2022). This list was manually
created from users who “actively posted about and ranked
among the top posts of trending hashtags related to the
Russo-Ukrainian war.” After combining our lists of Weibo
users and removing inactive and duplicate accounts, we had a
total of 191 distinct accounts. To ensure our dataset was
comprehensive, we scraped each account on four occasions
(March 14, March 28, April 6, and April 16) using the Python
weibo-scraper tool.1 Ultimately, our dataset consists of 191
different accounts and 343,435 distinct Weibo posts from
between January 1 and April 15, 2022.
Twitter Dataset. In addition to our Weibo dataset, we further
collect the tweets of the seven Chinese news outlets within
our news article dataset (China Daily, CGTN, Global Times,
China News Service, Xinhua, People’s Daily, and CCTV).
Unlike for our Weibo dataset, we do not collect the set of
Chinese users who most prominently discussed the
Russo-Ukrainian conflict on Twitter (Twitter has been banned
in China since 2009 (Barry 2022)), limiting our Twitter
analysis to these seven major state-sponsored Chinese
outlets, which also regularly tweet. To investigate these
accounts’ connection to the Russian government and Russian
news media, we again collect the tweets of the Russian
Embassy/@RussianEmbassy, Russia Today/@RT_com, and Sputnik
News/@SputnikInt. We collect the tweets of each account
using the Tweepy API (Roesslein 2009) on four different
occasions (March 6, March 13, April 2, and April 16).
Ultimately, our Twitter dataset consists of 62,717 unique
tweets from 10 different accounts posted between January 1
and April 15, 2022.
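Both the Weibo and Twitter datasets are assembled from four separate scrapes of each account. A minimal sketch of how such snapshots could be merged into a set of distinct posts within the study window follows. The post fields `id` and `created` and the function `merge_snapshots` are assumptions for illustration; the paper does not describe its storage format.

```python
from datetime import date
from typing import Dict, Iterable, List


def merge_snapshots(
    snapshots: Iterable[Iterable[dict]],
    start: date,
    end: date,
) -> List[dict]:
    """Combine repeated scrapes, keeping one copy of each post in [start, end]."""
    merged: Dict[str, dict] = {}
    for snapshot in snapshots:
        for post in snapshot:
            if start <= post["created"] <= end:
                # Keep the first copy seen; later scrapes only add new posts.
                merged.setdefault(post["id"], post)
    return list(merged.values())
```

Keying on a platform-assigned post identifier (rather than post text) avoids collapsing distinct posts that happen to share wording.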
1 https://github.com/Xarrow/weibo-scraper
Pointwise Mutual Information. To determine different news
ecosystems’ associations with distinct words, we utilize the
normalized pointwise mutual information metric. Pointwise
mutual information (PMI) is an information-theoretic measure
for discovering associations amongst words (Bouma 2009).
However, as in Kessler (2017), rather than finding the
pointwise mutual information between different words, we
utilize this measure to understand words’ association with
different categories. In this way, we seek to identify the
characteristic words of each ecosystem’s coverage of the
Russo-Ukrainian War (i.e., Western, Chinese, and Russian
media). We utilize the normalized and scaled version of PMI
to prevent our metric from being biased towards rarely
occurring words and to increase interpretability. Scaled
normalized PMI (NPMI) for a word word_i and each category
C_j is calculated as follows:
\[ PMI(word_i, C_j) = \log_2 \frac{P(word_i, C_j)}{P(word_i)\,P(C_j)} \]
\[ NPMI(word_i, C_j) = \frac{PMI(word_i, C_j)}{-\log_2 P(word_i, C_j)} \]
where P is the probability of occurrence and a scaling
parameter α is added to the counts of each word. NPMI ranges
between -1 and 1. We choose α = 50 given the size of our
dataset (Turney 2001). An NPMI value of -1 represents that
the word and the category never occur together (given that
we utilize the scaled version, this never occurs), 0
represents independence, and +1 represents perfect
co-occurrence (Bouma 2009). Finally, before computing NPMI
on our dataset, we first lemmatize and remove stop words as
in prior work (Zannettou et al. 2020).
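A minimal sketch of the word-to-category NPMI computation may help make the definitions concrete. It assumes the input is already lemmatized and stop-word filtered, and it assumes that α is added to every (word, category) count before probabilities are estimated; the paper states only that α is added to word counts, so this smoothing detail and the name `scaled_npmi` are illustrative.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple


def scaled_npmi(
    docs_by_category: Dict[str, List[List[str]]],
    alpha: float = 50.0,
) -> Dict[Tuple[str, str], float]:
    """Return NPMI(word, category) for every word and category pair."""
    counts = {
        cat: Counter(tok for doc in docs for tok in doc)
        for cat, docs in docs_by_category.items()
    }
    vocab = set().union(*counts.values())
    # Assumed smoothing: add alpha to every (word, category) cell.
    smoothed = {c: {w: counts[c][w] + alpha for w in vocab} for c in counts}
    total = sum(sum(cells.values()) for cells in smoothed.values())
    cat_total = {c: sum(cells.values()) for c, cells in smoothed.items()}
    word_total = {w: sum(smoothed[c][w] for c in smoothed) for w in vocab}

    scores = {}
    for c in smoothed:
        for w in vocab:
            p_joint = smoothed[c][w] / total
            p_word = word_total[w] / total
            p_cat = cat_total[c] / total
            pmi = math.log2(p_joint / (p_word * p_cat))
            # Normalize by -log2 of the joint probability (Bouma 2009).
            scores[(w, c)] = pmi / -math.log2(p_joint)
    return scores
```

Because every cell receives a positive smoothed count, the joint probability is never zero and the score never reaches -1, which matches the remark that the lower bound does not occur in the scaled version. The sketch requires at least two (word, category) cells, since a single cell would give a joint probability of 1 and a zero denominator.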
Partially Labeled Dirichlet Allocation. In addition to
identifying words characteristic of each news ecosystem, we
also extract the set of topics that are distinctive to each
ecosystem. To do this, we utilize Partially Labeled Dirichlet
Allocation (PLDA). PLDA is an extension of the widely-used
topic analysis algorithm Latent Dirichlet Allocation (LDA)
(Ramage, Manning, and Dumais 2011). PLDA, like LDA, assumes
that each document is composed of a distribution of
different topics (which themselves are composed of a
distribution of different words). However, unlike LDA,
each document can form topics from a pool associated with
one or more of its specific labels. For example, a newspa-
per article from nytimes.com, which is labeled as “Western”,
can draw from a set of labeled topics associated with “West-
ern” (as opposed to an article from chinadaily.com.cn which
can draw from a set of labeled topics associated with “Chi-
nese”). In addition to drawing from the distribution of top-
ics associated with its labels, documents also further draw
from a pool of latent topics that are associated with every
document in the dataset. PLDA can thus model the topics
that are common to every document while also identifying
discriminating topics for each label (i.e., topics specific to
“Western”, “Russian”, “Chinese”).
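The topic-pool structure described above can be illustrated with a small sketch. It shows only which topic indices a document is allowed to draw from given its labels, not PLDA inference itself; `build_topic_pools`, `allowed_topics`, and the `"latent"` key are hypothetical names chosen for this example.

```python
from typing import Dict, Iterable, List


def build_topic_pools(
    labels: Iterable[str],
    latent_topics: int,
    topics_per_label: int,
) -> Dict[str, List[int]]:
    """Assign topic ids: a shared latent pool, then a block per label."""
    pools = {"latent": list(range(latent_topics))}
    next_id = latent_topics
    for label in labels:
        pools[label] = list(range(next_id, next_id + topics_per_label))
        next_id += topics_per_label
    return pools


def allowed_topics(doc_labels: Iterable[str], pools: Dict[str, List[int]]) -> List[int]:
    """Topics a document may use: all latent topics plus its labels' topics."""
    allowed = list(pools["latent"])
    for label in doc_labels:
        allowed.extend(pools[label])
    return allowed
```

For instance, with four latent topics and two topics per label, an article labeled "Russian" could draw from the four shared topics and the two "Russian" topics, but never from the "Western" or "Chinese" blocks.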
Again, when fitting our PLDA model, we first lemmatize and
remove stop words. When computing topics, we further weight
words using term-frequency inverse document frequency
(TF-IDF). Previous work has shown that this weighting leads
to more accurate topics (Zannettou et al. 2020). To find the
appropriate number of topics, we optimize the word2vec topic
coherence score c_v, which measures the semantic similarity
among extracted topic words (Zannettou et al. 2020). We
utilize a baseline number of 300 latent topics, varying the
number of topics per label from 1 to 20. We achieve the best
coherence score of 0.46 with 15 topics associated with each
label (345 total topics).
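The search over the number of topics per label can be sketched as a simple grid search. This is an assumption-laden outline: `coherence_for` is a hypothetical callback that would fit a PLDA model with the given number of topics per label and return its c_v coherence, which is not implemented here.

```python
from typing import Callable, Iterable, Tuple


def select_topics_per_label(
    coherence_for: Callable[[int], float],
    candidates: Iterable[int] = range(1, 21),
) -> Tuple[int, float]:
    """Return the candidate with the highest coherence and that coherence."""
    scored = [(k, coherence_for(k)) for k in candidates]  # one model fit per k
    return max(scored, key=lambda pair: pair[1])


def total_topics(latent_topics: int, n_labels: int, topics_per_label: int) -> int:
    """Total topic count: shared latent topics plus one block per label."""
    return latent_topics + n_labels * topics_per_label
```

With the values reported above, `total_topics(300, 3, 15)` gives 300 + 3 × 15 = 345, matching the stated total.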