A spectral algorithm for nding maximum cliques in dense random intersection graphs Filippos Christodoulou10000000331763759 Sotiris

2025-05-01 0 0 4.29MB 21 页 10玖币
侵权投诉
A spectral algorithm for finding maximum
cliques in dense random intersection graphs ?
Filippos Christodoulou1[0000000331763759], Sotiris
Nikoletseas2,3[0000000337655636], Christoforos
Raptopoulos3[0000000298372632], and Paul Spirakis4,2[0000000153963749]
1Gran Sasso Science Institute, L’Aquila, Italy
filippos.christodoulou@gssi.it
2Computer Engineering & Informatics Department, University of Patras, Greece
{nikole,raptopox}@ceid.upatras.gr
3Computer Technology Institute & Press “Diophantus” (CTI), Patras, Greece
4Department of Computer Science, University of Liverpool, UK
P.Spirakis@liverpool.ac.uk
Abstract. In a random intersection graph Gn,m,p , each of nvertices
selects a random subset of a set of mlabels by including each label
independently with probability pand edges are drawn between vertices
that have at least one label in common. Among other applications, such
graphs have been used to model social networks, in which individuals
correspond to vertices and various features (e.g. ideas, interests)
correspond to labels; individuals sharing at least one common feature are
connected and this is abstracted by edges in random intersection graphs.
In this paper, we consider the problem of finding maximum cliques
when the input graph is Gn,m,p. Current algorithms for this problem
are successful with high probability only for relatively sparse instances,
leaving the dense case mostly unexplored. We present a spectral algorithm
for finding large cliques that processes vertices according to respective
values in the second largest eigenvector of the adjacency matrix of induced
subgraphs of the input graph corresponding to common neighbors of
small cliques. Leveraging on the Single Label Clique Theorem from [15],
we were able to construct random instances, without the need to externally
plant a large clique in the input graph. In particular, we used label choices
to determine the maximum clique and then concealed label information
by just giving the adjacency matrix of Gn,m,p as input to the algorithm.
Our experimental evaluation showed that our spectral algorithm clearly
outperforms existing polynomial time algorithms, both with respect to
the failure probability and the approximation guarantee metrics, especially
in the dense regime, thus suggesting that spectral properties of random
intersection graphs may be also used to construct efficient algorithms for
other NP-hard graph theoretical problems as well.
?Christoforos Raptopoulos was supported by the Hellenic Foundation for Research
and Innovation (H.F.R.I.) under the “2nd Call for H.F.R.I. Research Projects to
support Post-Doctoral Researchers” (Project Number:704).
Paul Spirakis was supported by the NeST initiative of the EEE and CS of the U. of
Liverpool and by the EPSRC grant EP/P02002X/1.
arXiv:2210.02121v1 [cs.DM] 5 Oct 2022
2 F. Christodoulou, S. Nikoletseas, C. Raptopoulos, P. Spirakis
Keywords: Random Intersection Graphs ·Maximum Cliques ·Heuristics
1 Introduction
Aclique in an undirected graph Gis a subset of vertices any two of which
are connected by an edge. The problem of finding the maximum clique in an
arbitrary graph is fundamental in Theoretical Computer Science and appears
in many different settings. As an example, consider a social network where
vertices represent people and edges represent mutual acquaintance. Finding a
maximum clique in this network corresponds to finding the largest subset of
people who all know each other. More generally, the analysis of large networks
in order to identify communities, clusters, and other latent structure has come
to the forefront of much research. The Internet, social networks, bibliographic
databases, energy distribution networks, and global networks of economies are
some of the examples motivating the development of the field.
From a computational complexity point of view, it is well known that
determining the size of the largest clique of an arbitrary graph of nvertices
is NP-complete [12]. This fact is further strengthened in [10], showing that, if k
is the size of the maximum clique, then the clique problem cannot be solved in
time no(k), unless the exponential time hypothesis fails. Additionally, there are
several results on hardness of approximation which suggest that there can be
no approximation algorithm with an approximation ratio significantly less than
linear (see e.g. [9]).
The intractability of the maximum clique problem for arbitrary graphs lead
researchers to the study of the problem for appropriately generated random
graphs. In particular, for Erd˝os-R´enyi random graphs Gn, 1
2(i.e. random graphs
of nvertices, in which each edge appears independently with probability 1
2), there
are several greedy algorithms that find a clique of size about log2nwith high
probability (whp, i.e. with probability that tends to 1 as ngoes to infinity), see
e.g. [8,13]. Since the clique number of Gn, 1
2is asymptotically equal to 2 log2n
with high probability, these algorithms approximate the clique number by a
factor of 2. It has been conjectured that finding a clique of size (1+ Θ(1)) log2n,
in a random graph instance Gn, 1
2, in which we have planted a randomly chosen
clique of size n0.49, with at least constant probability, would require techniques
beyond the current limits of complexity theory. This conjecture seems to identify
a certain bottleneck for the problem; finding the maximum clique in the case
where the planted clique has size at least ncan be done in polynomial time
by using spectral properties of the adjacency matrix of the graph (see [1]).
In this paper, we consider random instances of the random intersection graphs
model (introduced in [11,18]) as input graphs. In this model, denoted by Gn,m,p,
each one of mlabels is chosen independently with probability pby each one of
nvertices, and there are edges between any vertices with overlaps in the labels
chosen. One of the most interesting results regarding this model is that, when
the number of labels is sufficiently large (in particular, when m=nα, α 3)
the random intersection graphs model is equivalent to the Erd˝os-R´enyi random
A spectral algorithm for maximum cliques in RIGs 3
graphs model (in the sense that the total variation distance between the two
spaces tends to 0; see [6,17]). Random intersection graphs are relevant to and
capture quite nicely social networking. Indeed, a social network is a structure
made of nodes (individuals or organizations) tied by one or more specific types
of interdependency, such as values, visions, financial exchange, friends, conflicts,
web links etc. Social network analysis views social relationships in terms of nodes
and ties. Nodes are the individual actors within the networks and ties are the
relationships between the actors. Other applications include oblivious resource
sharing in a (general) distributed setting, efficient and secure communication in
sensor networks [14], interactions of mobile agents traversing the web etc. For
recent research related to random intersection graphs we refer the interested
reader to the surveys [4,3] and references therein.
1.1 Previous work on maximum cliques in random intersection
graphs.
In [18], the authors used the first moment probabilistic method to provide a lower
bound on the clique number of random instance of Gn,m,p in the case where mp2
tends to a constant as n→ ∞. In [15,16] this range of values was considerably
extended and a precise characterization of maximum cliques was given in the case
where m=nα, α < 1 and p=O(m1/2). In particular, the Single Label Clique
Theorem was proved, indicating that, with high probability any clique Qof size
|Q| ∼ np in a random instance of Gn,m,p (and thus also the maximum clique)
is formed by a single label. However, these structural results are existential and
thus do not lead to algorithms for finding the maximum clique. It is worth noting
that, the equivalence results between the random intersection graphs model and
the Erd˝os-R´enyi random graphs model for large number of vertices suggest that
the problem of finding a maximum clique in a random instance of Gn,m,p in this
range of values should not be any easier in the former that it is in the latter. On
the other hand, in the range of values m=nα, α < 1 and p=O(m2/3), greedy
algorithms for finding large cliques in random intersection graphs were presented
in the work [5]. The first algorithm in that paper, referred as GREEDY-CLIQUE,
finds a clique of the optimal order in a random instance of Gn,m,p with high
probability, in the case where the asymptotic degree distribution is a power-
law with exponent within (1,2). The algorithm considers vertices in decreasing
order of degree and greedily constructs a clique by extending by a vertex if
and only if the latter is connected to all other vertices already included; it can
be implemented to run in expected time O(n2). In the same paper [5], in the
case where the input graph is a random instance of Gn,m,p with bounded degree
variance, a second greedy algorithm, named MONO-CLIQUE, was suggested,
which can be implemented to run in expected time O(n). The main idea of
this algorithm is to try and construct a large clique directly by considering
common neighbours of endpoints of every edge in the graph. The pseudocodes
for GREEDY-CLIQUE and MONO-CLIQUE can be found in Appendixes 7.1
and 7.2 respectively.
4 F. Christodoulou, S. Nikoletseas, C. Raptopoulos, P. Spirakis
In [2] a more general greedy algorithm was presented, namely the Maximum-
Clique Algorithm, which constructs a large clique by considering the common
neighborhood of vertex subsets of fixed size k(i.e. independent of n) and checking
whether it forms a clique. From the cliques found in this way, it takes the largest
ones in order to cover the graph. This algorithm finds maximum cliques whp for
a wider range of parameters of the model (but still within the sparse regime)
than both algorithms GREEDY-CLIQUE and MONO-CLIQUE, at the cost of
larger running time. In particular, the Maximum-Clique Algorithm outputs a
maximum clique in a random instance of Gn,m,p with m=nα, α < 1 and
ln2n/n p=O(m2/3), with high probability. Since in this paper we consider
metrics regarding the ability of an algorithm to find large cliques (namely failure
probability and approximation guarantee), we use the Maximum-Clique Algorithm
as a benchmark in relation to which we evaluate our spectral algorithm. In fact,
we use a slightly more efficient version where we directly exclude k-subsets of
vertices that are not complete, in order to significantly reduce the n
kfactor
corresponding to the number of all k-sets in the running time. The pseudocode
of the benchmark algorithm is shown in Appendix 7.3. Different pruning ideas for
reducing the running time of greedy algorithms for finding large cliques through
the reduction of the size of the input graph, have been considered in [7].
2 Our contribution
In this paper we consider the problem of finding maximum cliques when the input
graph is Gn,m,p. We present a spectral algorithm for finding large cliques that
processes vertices according to respective values in the second largest eigenvector
of the adjacency matrix of carefully selected induced subgraphs of the input
graph created by common neighborhoods of small (constant size) k-cliques.
Because of the computation of the spectral decomposition, the running time
of our algorithm is larger than greedy algorithms in the relevant literature, but
it succeeds with higher probability in finding large cliques. In particular, we
compared our algorithm with the most efficient version of the Maximum-Clique
Algorithm from [2]. Leveraging on the Single Label Clique Theorem from [15],
we were able to avoid the construction of artificial input graph instances with
known planted large cliques. In particular, we used label choices to determine
the maximum clique and then concealed label information by just giving the
adjacency matrix of Gn,m,p as input to the algorithm. Our experimental evaluation
showed that, as we move from sparse instances to denser ones, both metrics
regarding the failure probability of our algorithm as well as the approximation
guarantee (when the maximum clique is not found) are much better than the
corresponding values for the Maximum-Clique algorithm. This difference is
especially highlighted as we move from sparser instances to denser ones, in which
there is no guarantee that greedy algorithms will succeed (but the Single Label
Clique Theorem still holds) and also as the size of the k-cliques used for creating
induced subgraphs increases (yet remains a relatively small constant, e.g. k=
6,7,8). We believe that our current paper suggests that spectral properties of
A spectral algorithm for maximum cliques in RIGs 5
random intersection graphs may be used to construct efficient algorithms for
other NP-hard graph theoretical problems as well.
3 Definitions, notation and useful results
Given an undirected graph G, we denote by V(G) and E(G) the set of vertices
and the set of edges respectively. Edges of Gwill be denoted as 2-sets; two vertices
v, u are connected in Gif and only if {u, v} ∈ E(G). For any vertex vV(G),
we denote by N(v) = NG(v) the set of neighbours of vin G, namely N(v)def
=
{uV(G) : {u, v} ∈ E(G)}. In addition, for any subset of vertices SV, we
denote by N(S) the set of vertices having at least one neighbor in S. We denote
by deg(v) = |N(v)|the degree of v. For any subset SV, we denote by G[S]
the induced subgraph of Gon S, namely G[S] = (S, {{u, v} ∈ E(G) : v, u S}).
Given an arbitrary ordering of the vertices, say v1, v2, . . . , v|V|, the adjacency
matrix AGof Gis an |V|×|V|matrix where AG[i, j] = 1 if {vi, vj} ∈ E(G)
and AG[i, j] = 0 otherwise. An eigenvector of AGwith corresponding eigenvalue
λis a vector xfor which AGx=λx. Since by definition AGis symmetric, it
has |V|real eigenvalues λ1λ2≥ ··· ≥ λ|V|, with orthogonal corresponding
eigenvectors x(1),x(2),...,x(|V|).
The formal definition of the random intersection graphs model is as follows:
Definition 1 (Random Intersection Graph - Gn,m,p [11,18]). Consider a
universe M={1,2, . . . , m}of labels and a set of nvertices V. Assign
independently to each vertex vVa subset Svof M, choosing each element
`∈ M independently with probability pand draw an edge between two vertices
v6=uif and only if SvSu6=. The resulting graph is an instance Gn,m,p of
the random intersection graphs model.
In this model we also denote by L`the set of vertices that have chosen
label `M. Given Gn,m,p, we refer to {L`, ` ∈ M} as its label representation.
Furthermore, the bipartite graph with vertex set V∪ M and edge set {(v, `) :
`Sv}={(v, `) : vL`}is the bipartite random graph Bn,m,p associated to
Gn,m,p. Notice that the associated bipartite graph is uniquely defined by the
label representation.
Given a graph G, a clique is a set of vertices every two of which are connected
by an edge; the size of the maximum clique in Gis its clique number. Notice that,
by definition, for any `the set of vertices within L`forms a clique in Gn,m,p.
Furthermore, the expected size of such a clique is E[|L`|] = np. Observe that
edges of cliques in Gn,m,p may be formed by different labels. However, when the
number of labels is smaller than the number of vertices, the following theorem
states that, under mild conditions, with high probability, in any large enough
clique of Gn,m,p, edges are formed by a single label.
Theorem 1 (Single Label Clique Theorem [16]). Let Gn,m,p be a random
instance of the random intersection graphs model with m=nα,0< α < 1and
mp2=O(1). Then whp, any clique Qof size |Q| ∼ np in Gn,m,p is formed by a
single label. In particular, the maximum clique is formed by a single label.
摘要:

Aspectralalgorithmfor ndingmaximumcliquesindenserandomintersectiongraphs?FilipposChristodoulou1[0000000331763759],SotirisNikoletseas2;3[0000000337655636],ChristoforosRaptopoulos3[0000000298372632],andPaulSpirakis4;2[0000000153963749]1GranSassoScienceInstitute,L'Aquila,Italyfilippos.christodoulou@gss...

展开>> 收起<<
A spectral algorithm for nding maximum cliques in dense random intersection graphs Filippos Christodoulou10000000331763759 Sotiris.pdf

共21页,预览5页

还剩页未读, 继续阅读

声明:本站为文档C2C交易模式,即用户上传的文档直接被用户下载,本站只是中间服务平台,本站所有文档下载所得的收益归上传人(含作者)所有。玖贝云文库仅提供信息存储空间,仅对用户上传内容的表现方式做保护处理,对上载内容本身不做任何修改或编辑。若文档所含内容侵犯了您的版权或隐私,请立即通知玖贝云文库,我们立即给予删除!
分类:图书资源 价格:10玖币 属性:21 页 大小:4.29MB 格式:PDF 时间:2025-05-01

开通VIP享超值会员特权

  • 多端同步记录
  • 高速下载文档
  • 免费文档工具
  • 分享文档赚钱
  • 每日登录抽奖
  • 优质衍生服务
/ 21
客服
关注