Follower–Followee Ratio Category and User Vector

for Analyzing Following Behavior

Hayato Oshimo∗, Shiori Hironaka∗, Mitsuo Yoshida†, and Kyoji Umemura∗

∗Department of Computer Science and Engineering

Toyohashi University of Technology

Aichi, Japan

Email: oshimo.hayato.zk@tut.jp, hironaka.shiori.ru@tut.jp, umemura@tut.jp

†Faculty of Business Sciences

University of Tsukuba

Tokyo, Japan

Email: mitsuo@gssm.otsuka.tsukuba.ac.jp

Abstract—Analyzing following behavior is important in many

applications. Following behavior may depend on the main inten-

tion of the follower. Users may either follow their friends or they

may follow celebrities to know more about them. It is difﬁcult

to estimate users’ intention from their following relationships.

In this paper, we propose an approach to analyze following

relationships. First, we investigated the similarity between users.

Similar followers and followees are likely to be friends. However,

when the follower and followee are not similar, it is likely

that the follower seeks to obtain more information on the followee.

Second, we categorized users by the network structure. We

then proposed an analysis of following behavior based on similarity

and category of users estimated from tweets and user data.

We conﬁrmed the feasibility of the proposed method through

experiments. Finally, we examined users in different categories

and analyzed their following behavior.

Keywords—User analysis, User embeddings, Network science,

Twitter, Following behavior

I. INTRODUCTION

Twitter is a social media platform where people post short

messages called tweets and communicate with each other.

Twitter users follow other users by subscribing to their tweets.

Twitter users can follow without the permission of the targeted

user; thus, the following relationship is directed.

Analyzing following behavior is important in many appli-

cations, such as friend recommendations [1] or information

diffusion analysis [2]. Users’ following behavior depends on

their intention, and following links can reflect a variety of

intentions [3]. It is difficult to classify these links because it is hard

to collect data indicating the intentions of the links.

We assume that different categories of users have different

preferences for whom to follow. For example, users that

are willing to learn more about celebrities follow them. We

classiﬁed users by the follower–followee ratio, which is the

ratio of the number of followees to the number of followers.

The follower–followee ratio has been used to analyze social

media users [4].

We analyzed users’ following preferences based on the

user category and topical similarity. Topical similarity reﬂects

the similarity between the users’ tweets. The user category

was deﬁned using the follower–followee ratio, which reﬂects

the user’s characteristics. First, we conﬁrmed the feasibility

of the computed topical similarity. Then, we conﬁrmed the

feasibility of the category using topical similarity. We found

that the following behavior described based on the topical

similarity between the follower and followee provided a rea-

sonable explanation for the following relation among users

in different categories. This suggests that both category and

topical similarity are useful for analyzing following behavior.

II. RELATED WORK

A. User Categories on Social Media

Java et al. [5] considered that Twitter users can mainly be

categorized as Information Source, Friends, and Information

Seeker, based on their link structure. Yan et al. [4] used

the follower–followee ratio to determine user characteristics

on ResearchGate, a social media platform for scientists and

researchers. ResearchGate users can share their research pa-

pers and follow other researchers. Yan et al. adopted the user

categories proposed by Java et al. and classiﬁed users based

on the follower–followee ratio.

Other researchers have performed classiﬁcation without

using link structures, using other methods such as classifying

users into ﬁve types based on social theory [6] and estimating

Big Five personalities from user proﬁles [7]. These classi-

ﬁcations require training data collected through surveys or

crowdsourcing.

We classiﬁed Twitter users into four categories based on

their follower–followee ratio. The categories of Information

Source and Information Seeker were the same as in previous

studies [4], [5]. In addition, we divided the Friends category

into two groups, according to whether the follower–followee

ratio was greater than 1. We assumed that more general users

would have a smaller number of followers than followees and

would exhibit different characteristics.

B. Purpose and Intention of Following

Following behavior depends on the purpose of following, which relates to edge types. Barbieri et al. [1] proposed a user recommendation method based on whether an edge is topical or social. Komori et al. [8] analyzed following relationships by classifying them into virtual and real friendships. Takemura et al. [3] classified following relationships into eight types by combining three axes: user-orientation, content-orientation, and mutuality. These researchers used surveys to collect data on each following relationship in order to build a classification model. However, it is difficult to collect training data on individual following relationships automatically. Yamaguchi et al. [9] proposed a method that explains the reason for following through coupled tensor analysis of tagging actions (adding users to lists). However, only a few users use the list feature on Twitter.

arXiv:2210.13874v1 [cs.SI] 25 Oct 2022

We consider that different categories of users tend to have

different main purposes for following. We simply classified

users into categories by follower–followee ratio and analyzed the

following behavior according to user category.

C. Homophily on Social Media

Homophily is a phenomenon where users tend to be friends

with similar people [10]. Various types of homophily have

been observed on the online social graph [11]–[13]. In a

previous study [14], topical homophily was reported based on

the user’s topics of interest recognized from tweets using latent

Dirichlet allocation (LDA) and the following relationship, and

the authors concluded that the topics of users with following

relationships are similar. We also focused on the topical

homophily of the users’ tweet content.

Homophily relates to network structure. The follower–

followee ratio and homophily of various attributes have been

investigated [15]. Homophily is an important assumption in

network-based user attribute estimation. Hironaka et al. [16]

examined the relationship between the follower–followee ratio

and location homophily using home location estimation. Using

data from the ten countries with the most Twitter users,

they reported that the follower–followee ratio contributes to

the estimation performance. In this study, we examined the

relationship between topical homophily and follower–followee

ratio.

III. DATA COLLECTION

First, we randomly extracted users for analysis using the Twitter

API. Then, we collected data on their followees and followers.

In addition, we collected their tweets to calculate topical

homophily.

We collected English tweets from July 11 to July 17, 2021, using the Twitter Streaming API¹. We randomly selected 50,000 unique users who tweeted at least once in this period.

Next, we collected followee and follower data using the API². We also collected the latest 3,200 tweets of each user using the API³. If a user had posted fewer than 3,200 tweets, we collected as many as possible. As a result, 48,881 user timelines were collected. In the analysis, we used the data of 48,829 users, that is, the users whose tweet and follower data we could successfully collect. We detected 59,778 following relationships among them.

¹ https://developer.twitter.com/en/docs/twitter-api/v1/tweets/filter-realtime/api-reference/post-statuses-filter (viewed 2022-06-10)

² https://developer.twitter.com/en/docs/twitter-api/v1/accounts-and-users/follow-search-get-users/api-reference/get-followers-ids and https://developer.twitter.com/en/docs/twitter-api/v1/accounts-and-users/follow-search-get-users/api-reference/get-friends-ids (viewed 2022-06-10)

³ https://developer.twitter.com/en/docs/twitter-api/v1/tweets/timelines/api-reference/get-statuses-user_timeline (viewed 2022-06-10)

[Figure 1: workflow diagram with the steps: generate user vectors from tweets; categorize users by follower–followee ratio; investigate topical homophily; investigate preferred user category for each user category; investigate similarities of topics for each user category.]

Fig. 1. Research workflow
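As a rough illustration of the data preparation described in this section, the following Python sketch samples unique users from the authors of collected tweets and then keeps only the following relationships whose two endpoints are both in the collected set. The function names, the fixed random seed, and the toy data are assumptions made for illustration; the sketch does not call the Twitter API and is not the authors' actual collection code.

```python
import random
from typing import Dict, Iterable, List, Set, Tuple


def sample_unique_users(tweet_author_ids: Iterable[int], k: int, seed: int = 0) -> Set[int]:
    """Pick up to k distinct users among the authors of the collected tweets."""
    unique_ids = sorted(set(tweet_author_ids))  # deduplicate authors first
    rng = random.Random(seed)                   # fixed seed (assumption) for repeatability
    return set(rng.sample(unique_ids, min(k, len(unique_ids))))


def internal_following_edges(followees_by_user: Dict[int, List[int]]) -> Set[Tuple[int, int]]:
    """Return directed (follower, followee) pairs where both users were collected."""
    collected = set(followees_by_user)
    return {
        (follower, followee)
        for follower, followees in followees_by_user.items()
        for followee in followees
        if followee in collected and followee != follower
    }


# Toy data (hypothetical): user 1 follows user 2 and an uncollected user 99.
toy_followees = {1: [2, 99], 2: [1], 3: []}
toy_edges = internal_following_edges(toy_followees)
toy_sample = sample_unique_users([1, 1, 2, 3, 3], k=2)
```

In this toy example, the edge from user 1 to user 99 is dropped because user 99 is outside the collected set, which mirrors counting following relationships only among the analyzed users.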

IV. USER CLASSIFICATION AND USER VECTOR

In this study, we analyzed users’ following behavior based

on user category and topical homophily. The analysis followed the

workflow shown in Figure 1.

First, we explain the follower–followee ratio to classify

users and then describe the classiﬁcation method. Second, we

deﬁne the user vector for calculating the topical homophily and

then describe the calculation method of topical homophily.

A. Follower–Followee Ratio

The follower–followee ratio is the ratio of the number of followees $N_{\mathrm{followee}}$ to the number of followers $N_{\mathrm{follower}}$, as defined in Equation (1).

\[
\text{follower–followee ratio} = \frac{N_{\mathrm{followee}} + 1}{N_{\mathrm{follower}} + 1} \tag{1}
\]

In Equation (1), we add 1 to the denominator to avoid division by zero and to the numerator to guarantee that the ratio of a user with an equal number of followees and followers becomes 1.
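Equation (1) can be written as a short Python function. This is a minimal sketch; the function name and the input validation are assumptions added for illustration.

```python
def follower_followee_ratio(n_followee: int, n_follower: int) -> float:
    """Compute Equation (1): (N_followee + 1) / (N_follower + 1)."""
    if n_followee < 0 or n_follower < 0:
        raise ValueError("counts must be non-negative")
    # Adding 1 to both terms avoids division by zero and keeps the
    # ratio at exactly 1 when the two counts are equal.
    return (n_followee + 1) / (n_follower + 1)


# A user with no followees and no followers gets a ratio of 1.
ratio_empty = follower_followee_ratio(0, 0)
# A user who follows 199 accounts and has 99 followers: 200 / 100.
ratio_many_followees = follower_followee_ratio(199, 99)
# A user who follows 9 accounts and has 19 followers: 10 / 20.
ratio_many_followers = follower_followee_ratio(9, 19)
```

With these inputs the three ratios are 1.0, 2.0, and 0.5, respectively.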

Figures 2a and 2b show examples of users with high and low follower–followee ratios, respectively. By Equation (1), users with a high follower–followee ratio are those whose followees $N_{\mathrm{followee}}$ significantly outnumber their followers $N_{\mathrm{follower}}$. The reverse is the case for users with a low follower–followee ratio.

B. User Classiﬁcation Using Follower–Followee Ratio

In this study, we classify users into four categories, A

through D, according to the follower–followee ratio. Category

A (Information Seeker) represents users with a follower–

followee ratio of 2.0 or higher, B (Friend) represents users

with a ratio between 1.0 and 1.25, C (Friend Hub) represents

users with a ratio between 0.8 and 1.0, and D (Information

Source) represents users with a ratio of 0.5 or lower. In this
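The four ranges stated above can be sketched as a small Python function. Two points are assumptions rather than statements from the text: a ratio of exactly 1.0 is assigned to category B, and ratios that fall in the gaps between the stated ranges (between 0.5 and 0.8, or between 1.25 and 2.0) are returned as `None`. The function name is hypothetical.

```python
from typing import Optional


def classify_user(ratio: float) -> Optional[str]:
    """Map a follower-followee ratio to category A, B, C, or D."""
    if ratio >= 2.0:
        return "A"  # Information Seeker
    if 1.0 <= ratio <= 1.25:
        return "B"  # Friend (boundary 1.0 placed here by assumption)
    if 0.8 <= ratio < 1.0:
        return "C"  # Friend Hub
    if ratio <= 0.5:
        return "D"  # Information Source
    return None     # outside the stated ranges (assumption)


labels = [classify_user(r) for r in (2.0, 1.1, 0.9, 0.5, 1.5, 0.6)]
```

Here `labels` evaluates to `["A", "B", "C", "D", None, None]`.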
