

Trust and Believe – Should We?

Evaluating the Trustworthiness of Twitter Users

1st Tanveer Khan

Network and Information Security Group

Tampere University

Tampere, Finland

tanveer.khan@tuni.fi

2nd Antonis Michalas

Network and Information Security Group

Tampere University

Tampere, Finland

antonios.michalas@tuni.fi

Abstract—Social networking and micro-blogging services, such

as Twitter, play an important role in sharing digital information.

Despite the popularity and usefulness of social media, they

are regularly abused by corrupt users. One of these nefarious

activities is so-called fake news – a “virus” that has been

spreading rapidly thanks to the hospitable environment provided

by social media platforms. The extensive spread of fake news

is now becoming a major problem with far-reaching negative

repercussions on both individuals and society. Hence, the

identification of fake news on social media is a problem of utmost

importance that has attracted the interest not only of the research

community but most of the big players on both sides - such

as Facebook, on the industry side, and political parties on the

societal one. In this work, we create a model through which

we hope to be able to offer a solution that will instill trust in

social network communities. Our model analyses the behaviour

of 50,000 politicians on Twitter and assigns an inﬂuence score

for each evaluated user based on several collected and analysed

features and attributes. Next, we classify political Twitter users

as either trustworthy or untrustworthy using random forest and

support vector machine classiﬁers. An active learning model has

been used to classify any unlabeled ambiguous records from our

dataset. Finally, to measure the performance of the proposed

model, we used accuracy as the main evaluation metric.

Index Terms—Credibility, Fake News, Influence Score, Sentiment Analysis, Trust, Twitter, Active Learning

I. INTRODUCTION

With one-third of the world’s population using some form

of social media [61], it is evident that the popularity of social

networking sites has rapidly increased in recent years. This has

signiﬁcantly changed the dynamics of communication across

all age groups; the way we work, the way we live, the way we

interact with other people and the way we share information

have already changed drastically. Furthermore, social media

enables sharing of important information with many people

simultaneously, allowing users to reach a bigger audience.

While social media has its positive sides, it is also important to consider the flip side and
properly evaluate its negative impacts. One of the latest negative effects of social media is
the so-called fake news phenomenon. It has been proven that the massive distribution of fake
news plays an important role in the success or failure of important events and causes [10],
[11]. Apart from the dissemination and circulation of false information, social networks
provide the ideal toolkit for corrupt users to perform a wide range of illegitimate actions
such as spamming and political Astroturfing [7], [9].

This research has received funding from the EU research projects ASCLEPIOS (No. 826093) and
CYBELE (No. 825355).

Twitter, with around half a billion users, is one of the three most popular social media
platforms. It generates on average 10,000 tweets per second (approximately 500 million
tweets per day¹) [47]. It is considered a valuable resource for government agencies,
businesses, political parties, financial institutions, fundraising, and many other actors as
it enables uncomplicated extraction and dissemination of important information.

A recent study [1] examined 10 million tweets generated by 700,000 different Twitter accounts
and linked to 600 fake and conspiracy news sites. It identified clusters of Twitter accounts
that linked back to these sites repeatedly, often in ways that seemed coordinated or even
automated. Another study found that 6.6 million tweets containing fake news were distributed
before the 2016 US elections. Social and political events such as the 2016 US presidential
election [15] were tainted by a growing volume of fake news.

Global concern about the impact of fake news on our societies is on the rise. Hence, there is
an immediate need for the design, implementation, and adoption of new systems and algorithms
that are able to identify and differentiate between fake and real news. However, with the
increase in the number of social media users², the quantity of generated content is increasing
rapidly, which hinders the identification of fabricated stories [16] and prevents the
identification of a significant amount of information that can potentially give rise to false
rumours. Therefore, verifying the credibility of a tweet or assigning a score to users based
on the information they have been sharing is a problem that has caught the interest of many
academic and industrial researchers [17], [18], [20]–[25].

¹ https://www.omnicoreagency.com/twitter-statistics/

² In 2018, an estimated 2.65 billion people were using social media worldwide, a number
projected to increase to almost 3.1 billion in 2021 [61].

arXiv:2210.15214v1 [cs.SI] 27 Oct 2022

A. Our Contribution

In this work, we present a model for analysing Twitter users that assigns a score calculated
based on their social profiles, tweet credibility and h-index score (i.e. retweets and likes).
Users with a higher score are not only considered to be more influential but their tweets are
also given greater credibility.
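As an illustration of the h-index component mentioned above, the following minimal sketch applies the standard Hirsch definition to per-tweet engagement counts: a user has index h if h of their tweets each received at least h retweets (or likes). That this is exactly how the score is computed is an assumption here, since the formal definition is deferred to Section III; the function name `h_index` and the sample counts are hypothetical.

```python
def h_index(counts):
    """Return the largest h such that at least h items have a count >= h.

    `counts` holds one engagement count per tweet (e.g. retweets or likes).
    Assumption: the standard Hirsch h-index, applied to tweets instead of papers.
    """
    ranked = sorted(counts, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank          # the top `rank` tweets all have at least `rank` engagements
        else:
            break             # counts only get smaller from here on
    return h


# Hypothetical per-tweet counts for one user.
retweets_per_tweet = [25, 8, 5, 3, 3, 0]
likes_per_tweet = [10, 9, 2, 1]

retweet_h = h_index(retweets_per_tweet)
like_h = h_index(likes_per_tweet)
```

Unlike a plain average, this index rewards users whose tweets are engaged with consistently, so a single viral tweet does not dominate the score.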

Our main contributions can be summarised as follows:

• First, we generated a dataset of 50,000 Twitter users. For each user, we created a unique
  profile containing 19 features (discussed in Section III). Our dataset contained only users
  whose tweets are public and who have friends and followers.

• For each of the analysed users, we calculated their Social Reputation Score (Section III-B),
  an h-index Score (Section III-B), a Sentiment Score (Section III-B), Tweet Credibility
  (Section III-B) and an Influence Score (Section III-C).

• Furthermore, we classified each Twitter user account as either trustworthy or untrustworthy.
  The flag was assigned to each user based on their social reputation, tweet credibility, the
  sentiment score of their tweets, the h-index score of retweets and likes, and the influence
  score.

• To classify a large pool of unlabeled data, we used an active learning model (a
  semi-supervised learning algorithm), a technique ideal for situations in which unlabeled
  data is abundant but manual labeling is expensive [63], [67].

• We measured the performance of our model using the accuracy metric, which measures the
  percentage of correctly predicted Twitter users (trustworthy and untrustworthy).
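The last two items in the list above can be sketched in a few lines. The accuracy function follows directly from the definition given (the share of users whose predicted label matches the true label). The query-selection function uses uncertainty sampling, one common active learning strategy; the text above does not say which strategy the model uses, so this choice, the function names and the sample values are all assumptions made for illustration.

```python
def accuracy(y_true, y_pred):
    """Fraction of users whose predicted label equals the true label."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def most_uncertain(probabilities, k):
    """Indices of the k unlabeled records the classifier is least sure about.

    `probabilities[i]` is the predicted probability that record i is
    trustworthy; values closest to 0.5 are the most ambiguous and would be
    sent to a human annotator first (uncertainty sampling, an assumed strategy).
    """
    ranked = sorted(range(len(probabilities)),
                    key=lambda i: abs(probabilities[i] - 0.5))
    return ranked[:k]


# Hypothetical labels: "T" = trustworthy, "U" = untrustworthy.
true_labels = ["T", "U", "T", "T"]
predicted = ["T", "U", "U", "T"]
score = accuracy(true_labels, predicted)

# Hypothetical classifier outputs for five unlabeled users.
pool_probabilities = [0.95, 0.52, 0.10, 0.45, 0.70]
to_label_next = most_uncertain(pool_probabilities, k=2)
```

In a full loop, the records returned by `most_uncertain` would be labeled manually, added to the training set, and the classifier retrained before the next query round.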

We hope that this work will inspire others to perform further

research on this emerging problem while at the same time

kick-starting a period of greater trust on social media through

sustained collaboration between humans and machines.

B. Organisation

The rest of this paper is organised as follows: Section II discusses related work, and
Section III describes our proposed approach in detail. The active learning approach and the
types of classifiers used are discussed in Section IV. Section V presents the data collection,
the experimental results and the evaluation of our model. Finally, Section VI concludes the
paper.

II. RELATED WORK

Twitter is considered one of the top Online Social Networks (OSNs) that provide a fertile
environment for a variety of research purposes. Compared to other popular OSNs, Twitter gains
significantly more attention in the research community due to its open policy on data sharing
and distinctive features [4]. In 2011, the network had about 175 million unique accounts [27],
a figure that has grown to an estimated 1.3 billion³, making it one of the most popular social
media platforms.

³ https://www.brandwatch.com/blog/twitter-stats-and-statistics/

Even though openness and vulnerability are two separate

issues, there have been many cases where malicious users

have taken advantage of Twitter’s openness and managed to

exploit the service in several ways (e.g. political Astroturﬁng,

spammers sending unsolicited messages, posting malicious

links, etc.).

Despite the important negative impact that the distribution of fake news has on our society,
only a handful of techniques for identifying fake news on social media have been proposed [4],
[7], [9], [30], [31]. One of the most popular and promising ideas is to evaluate Twitter users
and assign them a credit/reputation score.

The authors of [7] elaborated on the idea that posting duplicate tweets should affect the
reputation score of a user, since this is a behaviour that legitimate users typically do not
engage in. Therefore, posting the same tweet several times would have a negative effect on the
user’s overall credit score. The authors calculated the edit distance to detect duplication
between two tweets posted from the same account. Furthermore, the staggering quantities of
messages and information exchanged on Twitter have been exploited by users to hijack trending
topics [8], a technique used to send unsolicited messages to legitimate users. Additionally,
there are Twitter accounts whose only purpose is to artificially boost the popularity of a
hashtag, with the aim of ultimately making the underlying topic a trend. One BBC report
mentioned that £150 was paid to Twitter users to increase the popularity of a hashtag and make
it a trend⁴.
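The duplicate-tweet check described above relies on edit distance. A minimal sketch of the classic Levenshtein distance, together with a simple near-duplicate test, is shown here. The normalisation by tweet length and the `max_ratio` threshold of 0.1 are hypothetical choices for illustration; the text does not state which variant or threshold [7] uses.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum number of single-character insertions,
    deletions and substitutions needed to turn string `a` into string `b`."""
    previous = list(range(len(b) + 1))        # distances from "" to each prefix of b
    for i, ch_a in enumerate(a, start=1):
        current = [i]                         # distance from a[:i] to ""
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # delete ch_a
                               current[j - 1] + 1,       # insert ch_b
                               previous[j - 1] + cost))  # substitute or match
        previous = current
    return previous[-1]


def is_near_duplicate(tweet_a, tweet_b, max_ratio=0.1):
    """Treat two tweets as duplicates when the edit distance is small relative
    to the longer tweet. `max_ratio` is an assumed, tunable threshold."""
    longest = max(len(tweet_a), len(tweet_b))
    if longest == 0:
        return True
    return edit_distance(tweet_a, tweet_b) / longest <= max_ratio


same_account_tweets = ("Win a free phone now!", "Win a free phone now!!")
flagged = is_near_duplicate(*same_account_tweets)
```

Using a ratio rather than an absolute distance keeps the test comparable across short and long tweets, so a one-character change in a long tweet still counts as a repeat.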

To tackle these problems, researchers have used different ways to assess the trustworthiness
of tweets and assign an overall rank to users [31]. Castillo et al. [35] measured the
credibility of tweets (news topics) based on Twitter features. More precisely, an automated
classification technique was used to distinguish news from conversational topics. Alex Hai
Wang [7] used followers and friends parameters to calculate the reputation score, which
further aided user classification (i.e. to detect spammers). Additionally, Saito and
Masuda [60] considered these metrics while assigning a rank to Twitter users. In [36], the
authors analysed tweets relevant to the Mumbai attacks⁵. Their analysis showed that most
information providers were unknown while the reputation of the others (based on number of
followers) was very low. In another study [37] that looked at the same event, an information
retrieval technique and machine learning algorithm found that only 17% of the tweets related
to the underlying attacks were credible.
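A reputation score built from followers and friends, as mentioned above for [7], is often expressed as the share of a user's connections that are followers. The sketch below uses that ratio; the exact formula is an assumption, since this section only names the two parameters, and the function name and sample accounts are hypothetical.

```python
def reputation(followers, friends):
    """Followers as a share of all connections (assumed formula).

    A value near 1 means many more accounts follow the user than the user
    follows; a value near 0 is typical of accounts that follow aggressively
    but attract few followers, a pattern often associated with spammers.
    """
    total = followers + friends
    if total == 0:
        return 0.0            # no connections at all: no evidence of reputation
    return followers / total


# Hypothetical accounts.
established_account = reputation(followers=900, friends=100)
follow_spammer = reputation(followers=10, friends=1990)
```

Because the ratio is bounded between 0 and 1, it can be combined with other normalised features when ranking or classifying users.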

Gilani et al. [43] found that compared to normal users, bots

and fake accounts use a large number of external links in

their tweets. Hence, analysing other Twitter features such as

URLs is of paramount importance for correctly evaluating the

overall credibility of a user. While Twitter has built tools to

ﬁlter out such URLs, there are several masking techniques that

can effectively bypass Twitter’s safeguards.

⁴ https://www.bbc.com/news/blogs-trending-43218939

⁵ https://www.theguardian.com/world/blog/2011/jul/13/mumbai-blasts
