A Simplied Algorithm for Identifying Abnormal Changes in Dynamic Networks Bouchaib Azamir1a Driss Bennis1band Bertrand Michel2c

2025-04-30 0 0 2.95MB 27 页 10玖币

侵权投诉

A Simpliﬁed Algorithm for Identifying Abnormal

Changes in Dynamic Networks

Bouchaib Azamir1,a, Driss Bennis1,b and Bertrand Michel2,c

1Faculty of Sciences, Mohammed V University in Rabat, Morocco.

abouchaib azamir@um5.ac.ma

bdriss.bennis@um5.ac.ma; driss bennis@hotmail.com

2Nantes Universit´e, Ecole Centrale Nantes, Laboratoire de math´ematiques Jean Leray

UMR 6629.

1 Rue de La Noe, 44300 Nantes, France.

cbertrand.michel@ec-nantes.fr

Abstract. Topological data analysis has recently been applied to the study of dynamic

networks. In this context, an algorithm was introduced and helps, among other things,

to detect early warning signals of abnormal changes in the dynamic network under study.

However, the complexity of this algorithm increases signiﬁcantly once the database studied

grows. In this paper, we propose a simpliﬁcation of the algorithm without aﬀecting its

performance. We give various applications and simulations of the new algorithm on some

weighted networks. The obtained results show clearly the eﬃciency of the introduced ap-

proach. Moreover, in some cases, the proposed algorithm makes it possible to highlight local

information and sometimes early warning signals of local abnormal changes.

2010 Mathematics Subject Classiﬁcation: 55N35; 55U99; 91B84

Key Words. Persistent homology; closeness centrality in a network; central subnetwork; time

series.

1 Introduction

A network is an eﬃcient representation of a given set of entities and the existing relationships

between them. Actually a network is a graph which entities are the vertices and the relations

between them are presented as edges (see for instance [14] and [17]). Usually, these relations are

quantiﬁed by (mostly positive) real numbers and in this case we say that the graph (the network) is

weighted. In most real-life situations, (weighted) networks evolve over time resulting in a family of

(weighted) networks; we then speak of dynamic (weighted) networks (see the beginning of Section

3). In general, there are two kinds of dynamic networks :

•There are dynamic networks such that both the set of vertices and the set of edges are

changing over time. For instance, chat groups in social networks where the members of these

groups are undergoing changes from moment to moment.

arXiv:2210.08891v1 [math.AT] 17 Oct 2022

A Simpliﬁed Algorithm for Identifying Abnormal Changes in Dynamic Networks 2

•In some situations, the set of vertices remains unchanged over time as in the case of ﬁnancial

networks such as the weighted networks representing stock market transactions. So, the only

changes occur at the level of the edges and at the level of the weights.

Dynamic networks have been subject of several studies. See for instance the papers and the

references within [20, 21, 22, 23, 24] where dynamic networks have been studied following diﬀerent

approaches. In this paper, we are interested in studying the behaviour of dynamic weighted net-

works using topological data analysis (TDA). In [12], Gidea used TDA to detect the early signs of

a ﬁnancial market crisis. His method is based mainly on the fact that we can associate with any

ﬁnancial network of stocks a metric space whose distance is deﬁned by the correlation coeﬃcients

of stock market returns. This allows to compute the so-called persistent homology of this metric

space (see the preliminaries in Section 2) and, thanks to the concept of time series, the ﬁrst signs

of a critical transition in the ﬁnancial network are detected. It is clear that Gidea’s method can

be applied on any other dynamic network. However, for large networks, the execution time of

the algorithms used in Gidea’s method will be very high. Thus, one can ask whether the fact of

working on particular subgraphs rather than the whole network permits to achieve good results in

a reasonable amount of time. Indeed, the abundance of real-world data and its velocity usually call

for data summarization. Namely, graph summarization is a useful tool to help identify the struc-

ture and meaning of data. It has various advantages including reducing data volume and storage,

speeding up algorithms and graph queries, eliminating noise, among others. There are diﬀerent

types of graph summaries depending on whether the data is homogeneous or heterogeneous and

also whether the network is static or dynamic. In the literature, graph summarization methods

use a set of basic techniques, here we are interested in those that are based on simpliﬁcation or

parsimony: these methods rationalize an input graph by removing nodes or edges less “important”,

resulting in a sparse graph see [25]. Here, the summary graph consists of a subset of the original

nodes and/or edges. A representative work on node simpliﬁcation-based summarization techniques

is OntoVis [30], a visual analysis tool for exploring and understanding large heterogeneous social

networks. In [6], the authors propose a four-step unsupervised algorithm for egocentric information

abstraction of heterogeneous social networks using edge, instead of node, ﬁltering. In [31], a new

problem, to simplify weighted graphs by removing less important edges, has been proposed. Simpli-

ﬁed graphs can be used to improve the visualization of a network, to extract its main structure, or

as a preprocessing step for other data mining algorithms. Our approach is part of the simpliﬁcation

of graphs and aims to identify a subnetwork that contains a large part of the information embodied

by the original network.

Our aim in this paper is to show that there are some subnetworks that can be good candidates

for this study. Those subnetworks are extracted from the whole graph by eliminating some edges

at certain threshold which is determined by the use of the closeness centrality concept (see Section

4 for the deﬁnition of closeness centrality). The idea is based on the fact that a node having the

highest closeness centrality is able to spread information eﬃciently through the whole graph (see

[28] for more details). Then, based on this fact, we propose that the desired subnetwork would be

the one which contains this node called henceforth central node. After that, we need to set the

threshold where edges with higher weights will be eliminated. Also, following the closeness central-

ity notion, the threshold will be chosen from the list of weights of the incident edges to the central

node. However, it is not clear how to determine the optimal threshold. This is why we propose to

focus our study on the three quartiles, the minimum and the maximum of the above proposed list

of weights. The associated subnetworks will be called central subnetworks (see Section 4 for more

details about centrality). Our aim is to determine what threshold levels provides good results. We

will see throught several examples that when the threshold is greater than the third quartiles, the

adopted approach reduces considerably the execution time of Gidea’s method (to almost half in

A Simpliﬁed Algorithm for Identifying Abnormal Changes in Dynamic Networks 3

some cases) without compromising quality and results. Moreover, in some cases we get good results

even with thresholds greater than the median.

This paper is organized as follows:

Section 2 presents some preliminaries on persistent homology. We start this section with a part

(Subsection 2.1) in which we present a short mathematical background that we need to understand

the persistent homology associated to weighted graphs (given in Subsection 2.2), and we present also

how to implement data to get persistence diagrams associated to weighted graphs (see Subsection

2.3).

In Section 3, we present how to associate persistent homology to dynamic networks (see Subsection

3.1) and show how it was used by Gidea to detect abnormal changes in ﬁnancial networks (see

Subsection 3.2).

Section 4 represents our approach to reduce the computation cost of Gidea’s method (see Figure

5 for the ﬂowchart of the proposed method). It is mainly based on a simpliﬁcation of the whole

studied network by considering what we called central subnetworks determined using the notion

of closeness centrality (see Algorithm 1). We also describe how to get the persistence diagrams

associated to these central subnetworks (see Algorithm 2).

Section 5 concerns the application phase. Namely, we present some numerical experiments in

which we compare our method with existing ones. In each of the ﬁrst three subsections, we simulate a

dynamic network. Each subsection deals with a kind of a weighted network whose vertex point cloud

is simulated with a given probability distribution. Then the corresponding central subnetworks are

derived. A visual comparison between the time series associated to both the whole network and

the central subnetwork shows that are very similar (see Figures in this section). This observation is

also conﬁrmed by the calculated R-squared coeﬃcients (see Tables in this section). In parallel, we

show that using our method the execution time is reduced considerably (see Tables in this section).

We end Section 5 by a comparison a performance comparison between our approach and the edge

collapse method (Subsection 5.4). It is known that the edge collapse method provides theoretical

guarantees and allows to obtain, in output, the same starting persistence diagram (see for instance

[16]). From Table 4, we deduce that our method is more practical for large and dense networks.

We end this article with Section 6 which includes an application of the method on real weighted

networks. In Subsection 6.1, we apply on the ﬁnancial network used by Gidea [12], and in Subsection

7 we consider a cryptocurrency network. As in Section 5, we deduce an improvement of the execution

time using the new method.

2 Persistent homology computed on graphs

Persistent homology is an emergent method for detecting geometrical and topological properties of

a space endowed with a topological structure. The concept of persistence was ﬁrst introduced by

Edelsbrunner, Letscher and Zomorodian in [8] and then reﬁned by Carlsson and Zomorodian in [3].

Since then, persistent homology has become an essential tool in topological Data Analysis (TDA)

and has been applied in various scientiﬁc ﬁelds such as biology, image processing, sensor networks

etc. Nowadays, there are many references (surveys, books and notes) on TDA; for a recent one we

refer to [5].

In this section, we give a short presentation on some basic notions concerning particular simplicial

complexes as well as its corresponding persistent homology. We will focus, following our context, on

simplicial complexes associated to graphs investigated in [2]. For a general background on persistent

A Simpliﬁed Algorithm for Identifying Abnormal Changes in Dynamic Networks 4

homology one can see [3, 5, 7, 11].

2.1 Simplicial complex and simplicial homology groups

We start this subsection with a recall of a simplicial complex and then we give the examples of

interest.

Given a ﬁnite set V, a simplicial complex with the vertex set Vis a set e

Kof ﬁnite subsets of V

such that the elements of Vbelong to e

Kand for any σ∈e

K, any subset of σbelongs to e

K. The

elements of e

Kare called the faces or the simplices of e

K. The dimension of a simplex of e

Kis just

its cardinality minus 1 and the dimension of the simplicial complex e

Kis the largest dimension of

its simplices [5].

There are a lot of examples of simplicial complexes arising from various contexts. Here, we use

the following ones:

Example 2.1 (Simplicial clique complex, [2]) Let G= (V, E)be a graph, where Vis the set

of vertices and Ethe set of edges. For a positive integer k, a k-clique of Gis a set of kvertices

whose the induced subgraph is complete. Denote by Cl(G)the set of all the cliques of G. Notice

that the set Cl(G)satisﬁes the two conditions :

•Cl(G)contains all singletons {v}with v∈V.

•Cl(G)is closed under subsets: if τ⊆σand σ∈Cl(G), then τ∈Cl(G).

Then, Cl(G)is a simplicial complex which is called a simplicial clique complex of G.

The following (geometric) classical example of simplicial complexes can be found in any intro-

ductory books on TDA.

Example 2.2 (Vietoris-Rips complex) Given a ﬁnite set of points Xin a metric space (M, d)

and a real number α≥0. The Vietoris-Rips complex Ripsα(X)is a simplicial complex whose

simplices are sets {x0, . . . , xk}such that d(xi, xj)≤αfor all 0≤i, j ≤k.

Now we are ready to recall what are simplicial homology groups of a simplicial complex. To this

end, we deﬁne some vector spaces and linear maps. Here, we only deal with vector spaces on the

ﬁeld F2:= Z/2Z.

Given a simplicial complex K, the vector space generated by the p-simplices of Kis denoted by

Cp(K). It consists of all ﬁnite formal sums of p-simplices called p-chains; i.e., an element cbelongs

to Cp(K) if it can be written as follows: c=Pjγjσjfor some scalers γj∈F2and a family (σj)j

of p-simplices.

For a positive integer p, consider the linear map ∂p:Cp(K)→Cp−1(K), called boundary map,

deﬁned on p-simplices as follows: For every p-simplex σ,∂(σ) is the formal sum of the (p−1)-

dimensional faces (i.e., subsets of σof cardinal p). An element of the image of ∂pis called a

boundary. Thus, the boundary ∂p(c) of a chain c=Pjγjσjis obtained by extending ∂plinearly;

i.e., ∂p(c) = Pjγj∂p(σj). The p-chains that have boundary 0 are called p-cycles. They form a

subspace Zpof Cp. The p-chains which are the boundary of (p+ 1)-chains are called p-boundaries

and form a subspace Bpof Cp. It is important but not diﬃcult to show that ∂p◦∂p+1 = 0 which is

equivalent to Bp⊆Zp. Then, if we consider the quotient vector space Zp/Bp, that means we have

annihilated the p-boundaries which are not p-cycles. For instance, if we take Kto be Cl(G), the

simplicial clique complex of a graph G, the dimension of the vector space B1/Z1will be exactly the

A Simpliﬁed Algorithm for Identifying Abnormal Changes in Dynamic Networks 5

number of “holes” in Gand the dimension of the vector space B0/Z0is the number of the connected

components of Cl(G) (i.e., the connected components of the graph (network) G). For this reason

and of course other important ones, the vector space Bp/Zpis considered as one of the principal

notions in algebraic topology that helps in describing topological features. It is denoted by Hp(K)

and usually called the pth simplicial homology group of K. Its elements are called homology classes.

Example 2.3 The simplicial clique complex Cl(G)associated to the graph Gin Figure 1 has three

connected components, and so the dimension of H0(Cl(G)) is three. But it has only one hole because

A9A10+A10A11+A11A12 +A12A9is a cycle but not a boundary. Notice that A5A6+A6A7+A7A8+

A8A5is a boundary of the 3-chain A5A6A8+A6A7A8.

Figure 1: A graph Gwith three connected components.

2.2 Persistence diagrams

When a simplicial complex Khas a history in the sense that it is an union of a chain of simplicial

complexes, we can capture more information on K; especially, the behaviour of homology groups

of these complexes can be used as a signature of the complex Kthat can be used to distinguish it

from other complexes. This is one of many important things that persistent homology is performing.

Consider the following chain of simplicial subcomplexes of a complex K, called a ﬁltration of K:

∅ ⊆ K0⊆K1⊆ · · · ⊆ Kp=K

This ﬁltration provides more information on K in the following sense: for every couple i, j such that

0≤i≤j≤p, the injection Ki→Kjinduces a homomorphism on the nth simplicial homology

groups for each dimension:

fi,j

n:Hn(Ki)→Hn(Kj).

The nth persistent Betti number βi,j

nis the rank of the vector space Im(fi,j

n); that is, βi,j

dim(Im(fi,j

n)). Persistent Betti numbers count how many homology classes of dimension nthat

survive during the passage from Kito Kj. We say that a homology class α∈Hn(Ki) is born

entering in Ki, if αdoes not come from a previous subcomplex; that is, α /∈Im(fi−1,i

n). Similarly,

if αis born in Ki, it dies entering Kjif the image of the map induced by Ki−1⊆Kj−1does

not contain the image of αbut the image of the map induced by Ki−1⊆Kjdoes. In this case

the persistence of αis j−i. Finally, all the collected data of births and deaths are presented in

文档加载中……请稍候！
如果长时间未打开，您也可以点击刷新试试。

下载文档到电脑，查找使用更方便

10 玖币 0人已下载

立即下载

摘要：

ASimpliedAlgorithmforIdentifyingAbnormalChangesinDynamicNetworksBouchaibAzamir1;a,DrissBennis1;bandBertrandMichel2;c1FacultyofSciences,MohammedVUniversityinRabat,Morocco.abouchaibazamir@um5.ac.mabdriss.bennis@um5.ac.ma;drissbennis@hotmail.com2NantesUniversite,EcoleCentraleNantes,Laboratoiredemath...

展开>> 收起<<

A Simplied Algorithm for Identifying Abnormal Changes in Dynamic Networks Bouchaib Azamir1a Driss Bennis1band Bertrand Michel2c.pdf

共27页,预览5页

还剩页未读，继续阅读

声明：本站为文档C2C交易模式，即用户上传的文档直接被用户下载，本站只是中间服务平台，本站所有文档下载所得的收益归上传人(含作者)所有。玖贝云文库仅提供信息存储空间，仅对用户上传内容的表现方式做保护处理，对上载内容本身不做任何修改或编辑。若文档所含内容侵犯了您的版权或隐私，请立即通知玖贝云文库，我们立即给予删除！

A Simplied Algorithm for Identifying Abnormal Changes in Dynamic Networks Bouchaib Azamir1a Driss Bennis1band Bertrand Michel2c

相关推荐

开通VIP享超值会员特权

作者详情

相关内容

热门标签

举报选择: