2 N. Muttakin et al.
community detection is an essential tool to analyze different kinds of networks
such as social networks, research topics networks, protein-protein interaction
networks, ecological networks, financial networks, etc., and to understand their
structure.
The communities in most real-world networks, especially social networks,
overlap [
70
]. The same participant participates in different communities at the
same time. Sometimes, a participant in a community might simultaneously be
a different community member. A person in a social network is in his or her
family’s community while he/she is also in the community of his or her coworkers.
In reality, a node may belong to an unlimited number of communities in a social
network. This overlapping node is critical for various reasons, as it might be
used positively and negatively. For instance, in a social network context, a node
belonging to two disputing communities might be potentially used to resolve the
dispute. On the contrary, this node may disclose sensitive information about a
community to other communities violating privacy ethics. Indeed, the overlap
is an inherent characteristic of many real-world social networks [
29
]. Moreover,
disjoint community detection induces a partition in the network that eventually
breaks the communities [
44
]. Therefore, detecting overlapping communities in
different networks is of utmost importance, and thus, it has attracted significant
attention in the research community [66].
Many works in the literature are related to non-overlapping community
detection [
1
,
58
]. In the case of non-overlapping community detection, node
embedding approaches perform well [
55
,
8
]. However, most of these approaches
are not well suited for overlapping community detection due to the lack of scalable
and reliable clustering methods for high-dimensional vectors. A viable approach
to resolve this issue is to generate the node embedding as community affiliation,
which can be done effectively and efficiently using GCNs [50].
Among the classical approaches for overlapping community detection, label
propagation algorithms are prominent [
37
]. For instance, SLPA is an overlapping
community detection algorithm based on the idea of label propagation [
67
]. DE-
MON [
15
], a node-centric bottom-up overlapping community detection algorithm,
also leverages ego network structure with overlapping label propagation. Other
algorithmic approaches include seed-set expansion [
3
,
18
], evolutionary algorithms
[
36
], and heuristics-based methods [
47
,
17
,
21
,
34
]. Some heuristics-based works
considered game-theoretic concepts [62].
Many studies consider some kinds of matrix factorization along with proba-
bilistic inference [
69
,
74
,
54
,
31
,
30
,
33
,
59
]. One of the earliest works in this regime
is BigClam [
69
]. BigClam was designed for overlapping community detection
in large networks [
69
]. BigClam uses gradient ascent to create an embedding,
which is later used to determine nodes’ community affiliation. MNMF [
60
] uses
modularity-based regularization with non-negative matrix factorization. DANMF
uses a telescopic non-negative matrix factorization approach based on a deep
auto-encoder to learn membership distribution over nodes [
73
]. Some works [
7
,
72
]
performed factorization of modularity matrices using neural nets, while other
works considered adversarial learning [
28
,
11
] or deep belief networks [
26
] to learn