tweet credibility and h-index score (i.e. retweets and likes).
Users with a higher score are not only considered to be more
influential but their tweets are also given greater credibility.
Our main contribution can be summarised as follows:
• First, we generated a dataset of 50,000 Twitter users.
For each user, we created a unique profile containing 19
features (discussed in Section III). Our dataset contained
only users whose tweets are public and who have friends
and followers.
• For each of the analysed users, we calculated their
Social Reputation score (Section III-B), an h-index Score
(Section III-B), a Sentiment Score (Section III-B), a Tweet
Credibility score (Section III-B) and an Influence Score
(Section III-C).
• Furthermore, we classified each Twitter user account as
either trustworthy or untrustworthy. The flag was assigned
to each user based on their social reputation, tweet
credibility, tweet sentiment score, the h-index score of
their retweets and likes, and their influence score.
• To classify a large pool of unlabeled data, we used
an active learning model (a semi-supervised learning
algorithm), a technique suited to situations in which
unlabeled data is abundant but manual labeling is
expensive [63], [67].
• We measured the performance of our model using the
accuracy metric, i.e. the percentage of Twitter users
correctly predicted as trustworthy or untrustworthy.
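As a rough illustration of the last two items, the sketch below shows a least-confidence query step, one common way of choosing which unlabeled users to send for manual labeling in pool-based active learning, together with the accuracy metric. The function names, the probability inputs and the label strings are assumptions made for illustration only; the approach used in this paper is discussed in Section IV.

```python
from typing import List, Sequence


def least_confident(probabilities: Sequence[float], k: int) -> List[int]:
    """Return the indices of the k pool items the model is least sure about.

    probabilities[i] is a hypothetical model estimate of the probability
    that user i is trustworthy. Values close to 0.5 are the least confident.
    """
    ranked = sorted(range(len(probabilities)),
                    key=lambda i: abs(probabilities[i] - 0.5))
    return ranked[:k]


def accuracy(predicted: Sequence[str], actual: Sequence[str]) -> float:
    """Fraction of users whose predicted label matches the true label."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)
```

In such a loop, the users returned by `least_confident` would be labeled manually, added to the training set, and the model retrained before the next query round.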
We hope that this work will inspire others to perform further
research on this emerging problem while at the same time
kick-starting a period of greater trust on social media through
sustained collaboration between humans and machines.
B. Organisation
The rest of this paper is organised as follows. Section II
discusses related work, and Section III describes our proposed
approach in detail. The active learning approach and the types
of classifiers used are discussed in Section IV. Section V
presents the data collection, the experimental results and the
evaluation of our model. Finally, Section VI concludes the
paper.
II. RELATED WORK
Twitter is considered one of the top Online Social Networks
(OSNs) that provide a fertile environment for a variety of
research purposes. Compared to other popular OSNs, Twitter
attracts significantly more attention in the research community
due to its open policy on data sharing and its distinctive
features [4]. In 2011, the network had about 175 million unique
accounts [27], a figure that has grown to an estimated 1.3
billion³, making it one of the most popular social media
platforms.
³ https://www.brandwatch.com/blog/twitter-stats-and-statistics/
Even though openness and vulnerability are two separate
issues, there have been many cases where malicious users
have taken advantage of Twitter’s openness to exploit the
service in several ways (e.g. political astroturfing, sending
unsolicited spam messages and posting malicious links).
Despite the significant negative impact that the distribution
of fake news has on our society, only a handful of techniques
for identifying fake news on social media have been
proposed [4], [7], [9], [30], [31]. One of the most popular and
promising ideas is to evaluate Twitter users and assign them
a credit/reputation score.
The authors of [7] elaborated on the idea that posting duplicate
tweets should affect the reputation score of a user, since this
is a behaviour that legitimate users typically do not engage in.
Posting the same tweet several times would therefore have a
negative effect on the user’s overall credit score. The authors
calculated the edit distance to detect duplication between
two tweets posted from the same account. Furthermore, the
staggering quantity of messages and information exchanged
on Twitter has been exploited by users to hijack trending
topics [8], a technique used to send unsolicited messages
to legitimate users. Additionally, there are Twitter accounts
whose only purpose is to artificially boost the popularity of a
hashtag and ultimately make the underlying topic a trend.
One BBC report mentioned that £150 was paid to Twitter users
to increase the popularity of a hashtag and make it a trend⁴.
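The edit-distance check for duplicate tweets mentioned above can be sketched as follows. This is a minimal illustration using the standard Levenshtein distance; the helper `is_near_duplicate` and its `max_ratio` threshold are hypothetical and are not taken from [7].

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum number of single-character
    insertions, deletions and substitutions turning a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def is_near_duplicate(tweet_a: str, tweet_b: str,
                      max_ratio: float = 0.1) -> bool:
    """Flag two tweets as duplicates when the edit distance is small
    relative to the longer tweet. max_ratio is an assumed threshold."""
    longest = max(len(tweet_a), len(tweet_b))
    if longest == 0:
        return True
    return edit_distance(tweet_a, tweet_b) / longest <= max_ratio
```

Normalising by the length of the longer tweet is one possible design choice; it keeps the threshold comparable between short and long tweets.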
To tackle these problems, researchers have proposed different
ways to assess the trustworthiness of tweets and assign an
overall rank to users [31]. Castillo et al. [35] measured the
credibility of tweets (news topics) based on Twitter features.
More precisely, they used an automated classification technique
to separate news from conversational topics. Wang [7]
used the followers and friends parameters to calculate a
reputation score, which further aided user classification (i.e.
the detection of spammers). Additionally, Saito and Masuda [60]
considered these metrics while assigning a rank to Twitter
users. In [36], the authors analysed tweets relevant to the
Mumbai attacks⁵. Their analysis showed that most information
providers were unknown, while the reputation of the others
(based on number of followers) was very low. Another study [37]
that looked at the same event used an information retrieval
technique and a machine learning algorithm, and found that only
17% of the tweets related to the underlying attacks were credible.
Gilani et al. [43] found that, compared to normal users, bots
and fake accounts use a larger number of external links in
their tweets. Hence, analysing other Twitter features such as
URLs is of paramount importance for correctly evaluating the
overall credibility of a user. While Twitter has built tools to
filter out such URLs, there are several masking techniques that
can effectively bypass Twitter’s safeguards.
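A URL-based feature of the kind discussed above could be computed along the following lines. This is only a sketch: the regular expression, the function name `url_tweet_ratio` and the choice of a simple ratio are assumptions for illustration, not the feature definition used in [43] or in this paper.

```python
import re
from typing import Iterable

# Assumed pattern: any token starting with http:// or https://.
URL_PATTERN = re.compile(r"https?://\S+", re.IGNORECASE)


def url_tweet_ratio(tweets: Iterable[str]) -> float:
    """Fraction of a user's tweets that contain at least one URL."""
    tweets = list(tweets)
    if not tweets:
        return 0.0
    with_url = sum(1 for text in tweets if URL_PATTERN.search(text))
    return with_url / len(tweets)
```

A pattern match of this kind only detects that a link is present; it does not resolve shortened or otherwise masked URLs, so it would not address the bypass techniques mentioned above.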
⁴ https://www.bbc.com/news/blogs-trending-43218939
⁵ https://www.theguardian.com/world/blog/2011/jul/13/mumbai-blasts