A spectral algorithm for nding maximum cliques in dense random intersection graphs Filippos Christodoulou10000000331763759 Sotiris

2025-05-01 0 0 4.29MB 21 页 10玖币

侵权投诉

A spectral algorithm for ﬁnding maximum

cliques in dense random intersection graphs ?

Filippos Christodoulou1[0000−0003−3176−3759], Sotiris

Nikoletseas2,3[0000−0003−3765−5636], Christoforos

Raptopoulos3[0000−0002−9837−2632], and Paul Spirakis4,2[0000−0001−5396−3749]

1Gran Sasso Science Institute, L’Aquila, Italy

filippos.christodoulou@gssi.it

2Computer Engineering & Informatics Department, University of Patras, Greece

{nikole,raptopox}@ceid.upatras.gr

3Computer Technology Institute & Press “Diophantus” (CTI), Patras, Greece

4Department of Computer Science, University of Liverpool, UK

P.Spirakis@liverpool.ac.uk

Abstract. In a random intersection graph Gn,m,p , each of nvertices

selects a random subset of a set of mlabels by including each label

independently with probability pand edges are drawn between vertices

that have at least one label in common. Among other applications, such

graphs have been used to model social networks, in which individuals

correspond to vertices and various features (e.g. ideas, interests)

correspond to labels; individuals sharing at least one common feature are

connected and this is abstracted by edges in random intersection graphs.

In this paper, we consider the problem of ﬁnding maximum cliques

when the input graph is Gn,m,p. Current algorithms for this problem

are successful with high probability only for relatively sparse instances,

leaving the dense case mostly unexplored. We present a spectral algorithm

for ﬁnding large cliques that processes vertices according to respective

values in the second largest eigenvector of the adjacency matrix of induced

subgraphs of the input graph corresponding to common neighbors of

small cliques. Leveraging on the Single Label Clique Theorem from [15],

we were able to construct random instances, without the need to externally

plant a large clique in the input graph. In particular, we used label choices

to determine the maximum clique and then concealed label information

by just giving the adjacency matrix of Gn,m,p as input to the algorithm.

Our experimental evaluation showed that our spectral algorithm clearly

outperforms existing polynomial time algorithms, both with respect to

the failure probability and the approximation guarantee metrics, especially

in the dense regime, thus suggesting that spectral properties of random

intersection graphs may be also used to construct eﬃcient algorithms for

other NP-hard graph theoretical problems as well.

?Christoforos Raptopoulos was supported by the Hellenic Foundation for Research

and Innovation (H.F.R.I.) under the “2nd Call for H.F.R.I. Research Projects to

support Post-Doctoral Researchers” (Project Number:704).

Paul Spirakis was supported by the NeST initiative of the EEE and CS of the U. of

Liverpool and by the EPSRC grant EP/P02002X/1.

arXiv:2210.02121v1 [cs.DM] 5 Oct 2022

2 F. Christodoulou, S. Nikoletseas, C. Raptopoulos, P. Spirakis

Keywords: Random Intersection Graphs ·Maximum Cliques ·Heuristics

1 Introduction

Aclique in an undirected graph Gis a subset of vertices any two of which

are connected by an edge. The problem of ﬁnding the maximum clique in an

arbitrary graph is fundamental in Theoretical Computer Science and appears

in many diﬀerent settings. As an example, consider a social network where

vertices represent people and edges represent mutual acquaintance. Finding a

maximum clique in this network corresponds to ﬁnding the largest subset of

people who all know each other. More generally, the analysis of large networks

in order to identify communities, clusters, and other latent structure has come

to the forefront of much research. The Internet, social networks, bibliographic

databases, energy distribution networks, and global networks of economies are

some of the examples motivating the development of the ﬁeld.

From a computational complexity point of view, it is well known that

determining the size of the largest clique of an arbitrary graph of nvertices

is NP-complete [12]. This fact is further strengthened in [10], showing that, if k

is the size of the maximum clique, then the clique problem cannot be solved in

time no(k), unless the exponential time hypothesis fails. Additionally, there are

several results on hardness of approximation which suggest that there can be

no approximation algorithm with an approximation ratio signiﬁcantly less than

linear (see e.g. [9]).

The intractability of the maximum clique problem for arbitrary graphs lead

researchers to the study of the problem for appropriately generated random

graphs. In particular, for Erd˝os-R´enyi random graphs Gn, 1

2(i.e. random graphs

of nvertices, in which each edge appears independently with probability 1

2), there

are several greedy algorithms that ﬁnd a clique of size about log2nwith high

probability (whp, i.e. with probability that tends to 1 as ngoes to inﬁnity), see

e.g. [8,13]. Since the clique number of Gn, 1

2is asymptotically equal to 2 log2n

with high probability, these algorithms approximate the clique number by a

factor of 2. It has been conjectured that ﬁnding a clique of size (1+ Θ(1)) log2n,

in a random graph instance Gn, 1

2, in which we have planted a randomly chosen

clique of size n0.49, with at least constant probability, would require techniques

beyond the current limits of complexity theory. This conjecture seems to identify

a certain bottleneck for the problem; ﬁnding the maximum clique in the case

where the planted clique has size at least √ncan be done in polynomial time

by using spectral properties of the adjacency matrix of the graph (see [1]).

In this paper, we consider random instances of the random intersection graphs

model (introduced in [11,18]) as input graphs. In this model, denoted by Gn,m,p,

each one of mlabels is chosen independently with probability pby each one of

nvertices, and there are edges between any vertices with overlaps in the labels

chosen. One of the most interesting results regarding this model is that, when

the number of labels is suﬃciently large (in particular, when m=nα, α ≥3)

the random intersection graphs model is equivalent to the Erd˝os-R´enyi random

A spectral algorithm for maximum cliques in RIGs 3

graphs model (in the sense that the total variation distance between the two

spaces tends to 0; see [6,17]). Random intersection graphs are relevant to and

capture quite nicely social networking. Indeed, a social network is a structure

made of nodes (individuals or organizations) tied by one or more speciﬁc types

of interdependency, such as values, visions, ﬁnancial exchange, friends, conﬂicts,

web links etc. Social network analysis views social relationships in terms of nodes

and ties. Nodes are the individual actors within the networks and ties are the

relationships between the actors. Other applications include oblivious resource

sharing in a (general) distributed setting, eﬃcient and secure communication in

sensor networks [14], interactions of mobile agents traversing the web etc. For

recent research related to random intersection graphs we refer the interested

reader to the surveys [4,3] and references therein.

1.1 Previous work on maximum cliques in random intersection

graphs.

In [18], the authors used the ﬁrst moment probabilistic method to provide a lower

bound on the clique number of random instance of Gn,m,p in the case where mp2

tends to a constant as n→ ∞. In [15,16] this range of values was considerably

extended and a precise characterization of maximum cliques was given in the case

where m=nα, α < 1 and p=O(m−1/2). In particular, the Single Label Clique

Theorem was proved, indicating that, with high probability any clique Qof size

|Q| ∼ np in a random instance of Gn,m,p (and thus also the maximum clique)

is formed by a single label. However, these structural results are existential and

thus do not lead to algorithms for ﬁnding the maximum clique. It is worth noting

that, the equivalence results between the random intersection graphs model and

the Erd˝os-R´enyi random graphs model for large number of vertices suggest that

the problem of ﬁnding a maximum clique in a random instance of Gn,m,p in this

range of values should not be any easier in the former that it is in the latter. On

the other hand, in the range of values m=nα, α < 1 and p=O(m−2/3), greedy

algorithms for ﬁnding large cliques in random intersection graphs were presented

in the work [5]. The ﬁrst algorithm in that paper, referred as GREEDY-CLIQUE,

ﬁnds a clique of the optimal order in a random instance of Gn,m,p with high

probability, in the case where the asymptotic degree distribution is a power-

law with exponent within (1,2). The algorithm considers vertices in decreasing

order of degree and greedily constructs a clique by extending by a vertex if

and only if the latter is connected to all other vertices already included; it can

be implemented to run in expected time O(n2). In the same paper [5], in the

case where the input graph is a random instance of Gn,m,p with bounded degree

variance, a second greedy algorithm, named MONO-CLIQUE, was suggested,

which can be implemented to run in expected time O(n). The main idea of

this algorithm is to try and construct a large clique directly by considering

common neighbours of endpoints of every edge in the graph. The pseudocodes

for GREEDY-CLIQUE and MONO-CLIQUE can be found in Appendixes 7.1

and 7.2 respectively.

4 F. Christodoulou, S. Nikoletseas, C. Raptopoulos, P. Spirakis

In [2] a more general greedy algorithm was presented, namely the Maximum-

Clique Algorithm, which constructs a large clique by considering the common

neighborhood of vertex subsets of ﬁxed size k(i.e. independent of n) and checking

whether it forms a clique. From the cliques found in this way, it takes the largest

ones in order to cover the graph. This algorithm ﬁnds maximum cliques whp for

a wider range of parameters of the model (but still within the sparse regime)

than both algorithms GREEDY-CLIQUE and MONO-CLIQUE, at the cost of

larger running time. In particular, the Maximum-Clique Algorithm outputs a

maximum clique in a random instance of Gn,m,p with m=nα, α < 1 and

ln2n/n ≤p=O(m−2/3), with high probability. Since in this paper we consider

metrics regarding the ability of an algorithm to ﬁnd large cliques (namely failure

probability and approximation guarantee), we use the Maximum-Clique Algorithm

as a benchmark in relation to which we evaluate our spectral algorithm. In fact,

we use a slightly more eﬃcient version where we directly exclude k-subsets of

vertices that are not complete, in order to signiﬁcantly reduce the n

kfactor

corresponding to the number of all k-sets in the running time. The pseudocode

of the benchmark algorithm is shown in Appendix 7.3. Diﬀerent pruning ideas for

reducing the running time of greedy algorithms for ﬁnding large cliques through

the reduction of the size of the input graph, have been considered in [7].

2 Our contribution

In this paper we consider the problem of ﬁnding maximum cliques when the input

graph is Gn,m,p. We present a spectral algorithm for ﬁnding large cliques that

processes vertices according to respective values in the second largest eigenvector

of the adjacency matrix of carefully selected induced subgraphs of the input

graph created by common neighborhoods of small (constant size) k-cliques.

Because of the computation of the spectral decomposition, the running time

of our algorithm is larger than greedy algorithms in the relevant literature, but

it succeeds with higher probability in ﬁnding large cliques. In particular, we

compared our algorithm with the most eﬃcient version of the Maximum-Clique

Algorithm from [2]. Leveraging on the Single Label Clique Theorem from [15],

we were able to avoid the construction of artiﬁcial input graph instances with

known planted large cliques. In particular, we used label choices to determine

the maximum clique and then concealed label information by just giving the

adjacency matrix of Gn,m,p as input to the algorithm. Our experimental evaluation

showed that, as we move from sparse instances to denser ones, both metrics

regarding the failure probability of our algorithm as well as the approximation

guarantee (when the maximum clique is not found) are much better than the

corresponding values for the Maximum-Clique algorithm. This diﬀerence is

especially highlighted as we move from sparser instances to denser ones, in which

there is no guarantee that greedy algorithms will succeed (but the Single Label

Clique Theorem still holds) and also as the size of the k-cliques used for creating

induced subgraphs increases (yet remains a relatively small constant, e.g. k=

6,7,8). We believe that our current paper suggests that spectral properties of

A spectral algorithm for maximum cliques in RIGs 5

random intersection graphs may be used to construct eﬃcient algorithms for

other NP-hard graph theoretical problems as well.

3 Deﬁnitions, notation and useful results

Given an undirected graph G, we denote by V(G) and E(G) the set of vertices

and the set of edges respectively. Edges of Gwill be denoted as 2-sets; two vertices

v, u are connected in Gif and only if {u, v} ∈ E(G). For any vertex v∈V(G),

we denote by N(v) = NG(v) the set of neighbours of vin G, namely N(v)def

{u∈V(G) : {u, v} ∈ E(G)}. In addition, for any subset of vertices S⊆V, we

denote by N(S) the set of vertices having at least one neighbor in S. We denote

by deg(v) = |N(v)|the degree of v. For any subset S⊆V, we denote by G[S]

the induced subgraph of Gon S, namely G[S] = (S, {{u, v} ∈ E(G) : v, u ∈S}).

Given an arbitrary ordering of the vertices, say v1, v2, . . . , v|V|, the adjacency

matrix AGof Gis an |V|×|V|matrix where AG[i, j] = 1 if {vi, vj} ∈ E(G)

and AG[i, j] = 0 otherwise. An eigenvector of AGwith corresponding eigenvalue

λis a vector xfor which AGx=λx. Since by deﬁnition AGis symmetric, it

has |V|real eigenvalues λ1≥λ2≥ ··· ≥ λ|V|, with orthogonal corresponding

eigenvectors x(1),x(2),...,x(|V|).

The formal deﬁnition of the random intersection graphs model is as follows:

Deﬁnition 1 (Random Intersection Graph - Gn,m,p [11,18]). Consider a

universe M={1,2, . . . , m}of labels and a set of nvertices V. Assign

independently to each vertex v∈Va subset Svof M, choosing each element

`∈ M independently with probability pand draw an edge between two vertices

v6=uif and only if Sv∩Su6=∅. The resulting graph is an instance Gn,m,p of

the random intersection graphs model.

In this model we also denote by L`the set of vertices that have chosen

label `∈M. Given Gn,m,p, we refer to {L`, ` ∈ M} as its label representation.

Furthermore, the bipartite graph with vertex set V∪ M and edge set {(v, `) :

`∈Sv}={(v, `) : v∈L`}is the bipartite random graph Bn,m,p associated to

Gn,m,p. Notice that the associated bipartite graph is uniquely deﬁned by the

label representation.

Given a graph G, a clique is a set of vertices every two of which are connected

by an edge; the size of the maximum clique in Gis its clique number. Notice that,

by deﬁnition, for any `the set of vertices within L`forms a clique in Gn,m,p.

Furthermore, the expected size of such a clique is E[|L`|] = np. Observe that

edges of cliques in Gn,m,p may be formed by diﬀerent labels. However, when the

number of labels is smaller than the number of vertices, the following theorem

states that, under mild conditions, with high probability, in any large enough

clique of Gn,m,p, edges are formed by a single label.

Theorem 1 (Single Label Clique Theorem [16]). Let Gn,m,p be a random

instance of the random intersection graphs model with m=nα,0< α < 1and

mp2=O(1). Then whp, any clique Qof size |Q| ∼ np in Gn,m,p is formed by a

single label. In particular, the maximum clique is formed by a single label.

文档加载中……请稍候！
如果长时间未打开，您也可以点击刷新试试。

下载文档到电脑，查找使用更方便

10 玖币 0人已下载

立即下载

摘要：

Aspectralalgorithmforndingmaximumcliquesindenserandomintersectiongraphs?FilipposChristodoulou1[0000000331763759],SotirisNikoletseas2;3[0000000337655636],ChristoforosRaptopoulos3[0000000298372632],andPaulSpirakis4;2[0000000153963749]1GranSassoScienceInstitute,L'Aquila,Italyfilippos.christodoulou@gss...

展开>> 收起<<

A spectral algorithm for nding maximum cliques in dense random intersection graphs Filippos Christodoulou10000000331763759 Sotiris.pdf

共21页,预览5页

还剩页未读，继续阅读

声明：本站为文档C2C交易模式，即用户上传的文档直接被用户下载，本站只是中间服务平台，本站所有文档下载所得的收益归上传人(含作者)所有。玖贝云文库仅提供信息存储空间，仅对用户上传内容的表现方式做保护处理，对上载内容本身不做任何修改或编辑。若文档所含内容侵犯了您的版权或隐私，请立即通知玖贝云文库，我们立即给予删除！

A spectral algorithm for nding maximum cliques in dense random intersection graphs Filippos Christodoulou10000000331763759 Sotiris

相关推荐

开通VIP享超值会员特权

作者详情

相关内容

热门标签

举报选择: