Grounded persistent path homology a stable topological descriptor for weighted digraphs A P REPRINT October 21 2022

2025-05-06 0 0 698.07KB 51 页 10玖币
侵权投诉
Grounded persistent path homology: a stable, topological
descriptor for weighted digraphs
A PREPRINT October 21, 2022
Thomas Chaplin1,*, Heather A. Harrington1, and Ulrike Tillmann1
1Mathematical Institute, University of Oxford
*Corresponding author: thomas.chaplin@maths.ox.ac.uk
Abstract
Weighted digraphs are used to model a variety of natural systems and can exhibit interesting
structure across a range of scales. In order to understand and compare these systems, we require
stable, interpretable, multiscale descriptors. To this end, we propose grounded persistent path
homology (GRPPH) – a new, functorial, topological descriptor that describes the structure of
an edge-weighted digraph via a persistence barcode. We show there is a choice of circuit basis
for the graph which yields geometrically interpretable representatives for the features in the
barcode. Moreover, we show the barcode is stable, in bottleneck distance, to both numerical and
structural perturbations. GRPPH arises from a flexible framework, parametrised by a choice of
digraph chain complex and a choice of filtration; for completeness, we also investigate replacing
the path homology complex, used in GRPPH, by the directed flag complex.
1 Introduction
Directed graphs with positive edge-weights arise both as natural objects of mathematical study
and as useful models of real-world systems (e.g. [3, 4, 30, 37]). A common task is to distinguish
between weighted digraphs. Frequently, this is achieved by defining an invariant, i.e. a map
I:WDgr X
from weighted digraphs into some set
X
, together with a metric
d
on
X
. The
metric allows us to quantitatively measure to what extent a pair of weighted digraphs differ.
When
I
is well understood we may be able to explain why the weighted digraphs differ. In order
to determine desirable characteristics of such an
I
, consider the examples shown in Figure 1, in
which edge weights correspond to length as drawn.
G1G2G3G4
Figure 1: Four example weighted digraphs; weights correspond to length as drawn.
Consider each
Gi
from the perspective of a particle flowing through the digraph, such
that the particle may only traverse an edge in the direction specified and the time it takes
corresponds to the weight. To the particle, loops (or circuits) in the graph are significant features.
However, loops can vary greatly based on the orientation and weight of constituent edges.
Despite sharing the same underlying undirected graph,
G1
and
G2
support very different
flows since
G1
has a single source and a single sink whereas
G2
has 4 sources and 2 sinks. To
reflect this,
d(I(G1)
,
I(G2))
should be large. In contrast,
G3
has a different undirected graph
but can be obtained from
G1
by simply subdividing each edge. In applications, this may arise
arXiv:2210.11274v1 [math.AT] 20 Oct 2022
Grounded persistent path homology: a stable, topological descriptor for weighted digraphs
from a finer resolution image of the same system. A suitable invariant should be relatively
stable to such subdivisions, ideally converging to a limiting value upon iterated subdivision.
Finally,
G4
has a higher circuit rank but the new loops are on a small scale, whilst the large
scale organisation is mostly similar to
G1
. Therefore,
d(I(G1)
,
I(G4))
should be small and the
difference between I(G1)and I(G4)should reflect this multiscale comparison.
For successful application, any invariant should be stable to a reasonable noise model. A
typical requirement is that
I
is continuous (or better yet Lipschitz), with respect to a choice
of metric on
WDgr
. Designing metrics for graphs is an active area of research but a common
choice is the graph edit distance [18]. For this metric, costs are assigned to operations such as
deleting an edge or modifying a weight, then the distance between two graphs is the minimal
cumulative cost of modifying one into the other. Since assigning costs to graph operations is
somewhat arbitrary, it is reasonable instead to require a bound on
d(I(G)
,
(I(OG))
, over a
range of graph operations, O:WDgr WDgr.
Finally, in many applications (particularly in biology), it is important that any invariant
I
is interpretable. That is, one must be able to explain why the invariant has the value it does.
Typically this is achieved through the identification of key contributing subgraphs.
In summary, we seek an invariant for weighted digraphs which
(a) distinguishes graphs with different flow profiles due to directionality;
(b) can detect and describe features (i.e. loops) across a range of scales;
(c) is stable to reasonable perturbations and converges under iterated subdivision; and
(d) is interpretable, e.g. through the identification of important subgraphs.
Pursuant to these goals, we employ two tools from topological data analysis (TDA) – path
homology and persistent homology. A number of homology theories for digraphs have been
developed (see summary in Section 2 of [7]). Path homology is one such theory [23], which is
sensitive to directionality and has useful functorial properties [12, 24]. Persistent homology is a
tool for developing stable descriptors [1] that extract relevant information in multiscale scenarios.
As such, persistent homology has seen successful applications to fields including neuroscience [6,
20, 21, 38], vasculature [32, 36] and financial networks [25], to name but a few [19]. A theory
of persistent path homology (PPH) was proposed by Chowdhury and M
´
emoli [12] and is a
stable descriptor for directed networks. In a search to develop an interpretable invariant for
weighted digraphs, which respects the inert topology of the underlying digraph, we are lead to
an alteration of PPH which we prove meets goals (a)-(d).
1.1 Contributions and outline
In Section 2 we give an overview of path homology and persistent homology, and set up the
categorical framework for the rest of the paper. In particular, we define a category of weighted
digraphs
ContWDgr
in which morphisms are digraph maps of the underlying digraphs as well
as contractions of the natural, shortest-path quasimetric.
In Section 3.1 we review a standard pipeline for extracting a topological invariant of a
weighted digraph, via PPH. Evaluating this invariant against our stated goals, motivates an
alteration of this pipeline, which we call the ‘grounded pipeline’. We define this new pipeline
and describe categories upon which the resulting invariant is functorial in Section 3.2.
Arising from the grounded pipeline, our main contribution is Definition/Theorem 3.15,
wherein we define grounded persistent path homology (GRPPH). This new invariant is a
functor gH1:ContWDgr PersVec (1.1)
where
PersVec
is the category of persistent vector spaces. Couching this definition in category
theory yields a strong framework for comparing weighted digraphs through the invariant.
2
Grounded persistent path homology: a stable, topological descriptor for weighted digraphs
Indeed, we use this functoriality later in the paper to aid the proof of decomposition and
stability results.
Section 4 is devoted to developing an interpretation of GRPPH. The early subsections are
dedicated to understanding the features detected by
gH1(G)
; the following theorem summarises
our findings.
Theorem 1.1.
Given a weighted digraph
GWDgr
, denote the underlying undirected graph by
U(G)
.
(a) All features in gH1(G)are born at t =0;
(b) gH1(G)at time t =0is the (real) cycle space of U(G); and moreover
(c) there is a choice of circuits in U(G)whose homology classes generate gH1(G).
These results demonstrate how GRPPH is sensitive to circuits in the digraph at all scales,
meeting goal (b), and can be interpreted through a persistence basis of such circuits, meeting
goal (d). We also discuss how to use
gH1(G)
to assign a ‘scale’ to any circuit in
U(G)
in
Section 4.2. In Section 4.4, we prove that if
G
can be decomposed into smaller parts, then
GRPPH also decomposes.
Theorem 1.2.
Given a weighted digraph
GWDgr
, if
G
decomposes as a wedge decomposition
G=G1ˆ
vG2or a disjoint union G =G1tG2then
gH1(G)
=gH1(G1)gH1(G2). (1.2)
In Section 5 we investigate the stability of
gH1
. We prove a number of bounds on the
bottleneck distance of the barcode upon perturbing the input weighted digraph. We find local
stability to operations such as weight perturbation, edge subdivision and certain classes of edge
collapses and edge deletions. In particular, edge subdivision stability automatically implies that
our invariant converges under iterated subdivision (Corollary 5.14). In contrast, the descriptor
is unstable to generic edge collapses and edge deletions, which we demonstrate through a
number of counter-examples. We argue that these stability properties suffice to meet goal (c)
and indeed stability to larger classes of edge collapses and edge deletion would be undesirable.
For a summary of all stability results obtained, please consult Table 1.
In order to build intuition for what is measured by GRPPH, we compute a number of
illustrative examples in Section 6. In particular, in Section 6.1, we consider a simple, cycle graph
and determine the limiting value of GRPPH under iterated edge subdivision. In Section 6.2, we
compute the invariant for a number of small square digraphs with varying edge orientations,
illustrating sensitivity to directionality, as required by goal (a).
GRPPH arises from the grounded pipeline after fixing a number of choices, including
path homology as a functor from digraphs to chain complexes. Another popular method for
producing chain complexes from digraphs is the directed flag complex [29]. Replacing path ho-
mology with directed flag complex homology yields an alternative descriptor, called grounded
persistent directed flag complex homology (GRPDFLH). For completeness, in Appendix A, we
revisit each of the results obtained in the main paper, replacing GRPPH with GRPDFLH. Of
particular note is Example A.13, in which we show that GRPDFLH is sensitive to a particularly
simple class of edge deletions. This stands in contrast to GRPPH, which is unaffected by these
deletions.
1.2 Computations
The software package
Flagser
allows the user to flexibly define filtrations of directed flag
complexes and subsequently compute persistent homology [28]. We use an alteration of
Flagser
to compute grounded persistent directed flag complex homology (available at [8]).
An algorithm for computing PPH in arbitrary degrees was proposed by Chowdhury and
M
´
emoli [12]; a more efficient algorithm for computing PPH in degree 1 was later proposed by
3
Grounded persistent path homology: a stable, topological descriptor for weighted digraphs
Dey, Li, and Wang [16]. A slight modification of the latter algorithm can be used to compute
GRPPH. A software package for computing both grounded homologies, as well as persistence
bases, in collaboration with G. Henselman-Petrusek, is forthcoming.
1.3 Acknowledgements
The first author would like to thank H. Byrne, A. Goriely, A.
´
O hEachteirn and T. Thompson for
valuable discussions which motivated and aided the early stages of this work. HAH gratefully
acknowledges funding from a Royal Society University Research Fellowship. The authors are
members of the Centre for Topological Data Analysis, which is funded by the EPSRC grant ‘New
Approaches to Data Science: Application Driven Topological Data Analysis’
EP/R018472/1
. For
the purpose of Open Access, the authors have applied a CC BY public copyright licence to any
Author Accepted Manuscript (AAM) version arising from this submission.
2 Background
2.1 Basic notation and category theory
We introduce some basic language for graphs and categories, and then recall the definitions of
path homology and persistent homology, the two theories we combine later in a new way to
define our invariant.
Notation 2.1. For dN, the standard d-simplex is
d:=(x0, . . . xd)Rd+1xi=1, and xi0i©. (2.1)
Notation 2.2.
Given a category
C
, we denote the collection of objects
Obj(C)
and the collection
of morphisms
Mor(C)
. For two objects
X
,
YObj(C)
, we denote the collection of morphisms
XY
by
MorC(X
,
Y)
. Where it is clear from object whether
α
is an object or a morphism, we
simply write αC.
Definition 2.3.
Given two categories
C
and
D
we denote the category of functors
CD
,
where morphisms are natural transformations, by
[C
,
D]
. For a morphism
fMor[C,D](M
,
N)
,
we denote the components of the natural transformation by fx:M(x)N(x)for each xC.
Lemma 2.4.
Given categories
C
,
D
and
D0
and a functor
µ:DD0
, there is a functor
[C
,
µ]:
[C,D][C,D0].
Proof.
Given
FObj([C
,
D])
, we map
[C
,
µ](F):=µF
. Given a natural transformation
ν:FF0
between functors
F
,
F0[C
,
D]
,
µν
is a natural transformation
µFµF0
.
This construction satisfies the usual functorial axioms.
Notation 2.5. (a)
We let
R
denote the poset of
R
equipped with the
relation, viewed as a
category.
(b)
We let
Vec
denote the category of
R
-vector spaces and
vec
denote the subcategory of
finite-dimensional R-vector spaces.
(c) We let Ch denote the category of chain complexes over R.
Definition 2.6.
Given a chain complex
CCh
, we denote the
kth
chain group by
Ck
and the
boundary map by
k:CkCk1
. For each
kN
, the
kth
homology group,
Hk(C)
is the
quotient
Hk(C):=ker k
im k+1
. (2.2)
We can view Hkas a functor Ch Vec.
4
Grounded persistent path homology: a stable, topological descriptor for weighted digraphs
2.2 Directed graphs
Definition 2.7. (a)
A(simple) digraph is a tuple
G= (V
,
E)
where
V
, the set of vertices, is a
finite set and E, the set of edges, is a subset of V×V\Vwhere
V:={(i,i)V×V|iV}. (2.3)
We call elements of Vself-loops.
(b) A directed acyclic graph (DAG) is a simple digraph G= (V,E)such that there is a linear
ordering on the nodes V={v1, . . . , vn}such that (vi,vj)E=i<j.
(c)
An oriented graph is a simple digraph
G= (V
,
E)
with no double edges, i.e.
(i
,
j)E=
(j,i)6E.
(d)
Aweighted (digraph/DAG/oriented graph) is a triple
G= (V
,
E
,
w)
such that
(V
,
E)
is a
(simple digraph/DAG/oriented graph) and
w:ER>0
is a positively-valued function
on the edges.
(e)
An undirected graph is a tuple
G= (V
,
E)
where
E
is a multiset of 2-element subsets of
V
.
(f)
Given a digraph
G= (V
,
E)
, the underlying undirected graph is
U(G):= (V
,
U(E))
where
(i,j)E,(j,i)6E=⇒ {i,j} ∈ U (E)with multiplicity 1, (2.4)
(i,j),(j,i)E=⇒ {i,j} ∈ U (E)with multiplicity 2. (2.5)
(g)
Given two digraphs
G1= (V1
,
E1)
,
G2= (V2
,
E2)
with
V1
,
V2V
, their union is
G1G2:=
(V1V2,E1E2). If V1,V2are disjoint, then we denote this G1tG2.
Notation 2.8. Fix a weighted digraph G= (V,E,w).
(a) We denote V(G):=V,E(G):=E,w(G):=w.
(b)
For an edge
e= (i
,
j)E
, we write
st(e):=i
and
fn(e):=j
, and say that
i
and
j
are
incident to e.
(c) For an edge e= (i,j)E, we write w(i,j):=w(e).
(d) We write ijto mean there is an edge (i,j)E.
Definition 2.9. Fix a weighted digraph G= (V,E,w).
(a)
Given
V0V
, the induced subgraph on
V0
is
(V0
,
E0
,
w0)
, where
E0=E(V0×V0)
and
w0is wrestricted to E0.
(b)
Given
E0E
, the induced subgraph on
E0
is
(V0
,
E0
,
w0)
, where
V0
is the set of all vertices
incident to some edges in E0and w0is wrestricted to E0.
(c) For vV,
Nin(v;G):={aV|av}, (2.6)
Nout(v;G):={bV|vb}, (2.7)
N(v;G):=Nin(v)∪ Nout(v)(2.8)
and the neighbourhood graph,
NG(v
;
G)
,is the induced subgraph on
N(v)
. Where
G
is
clear from context, we omit it from notation.
(d) For FE,
N(F;G):={τE|eFsuch that τ,eare incident to a common vertex}(2.9)
and the neighbourhood graph,
NG(F
;
G)
,is the induced subgraph on
N(F
;
G)
. Where
G
is clear from context, we omit it from notation.
5
摘要:

Groundedpersistentpathhomology:astable,topologicaldescriptorforweighteddigraphsAPREPRINTOctober21,2022ThomasChaplin1,*,HeatherA.Harrington1,andUlrikeTillmann11MathematicalInstitute,UniversityofOxford*Correspondingauthor:thomas.chaplin@maths.ox.ac.ukAbstractWeighteddigraphsareusedtomodelavarietyofnat...

展开>> 收起<<
Grounded persistent path homology a stable topological descriptor for weighted digraphs A P REPRINT October 21 2022.pdf

共51页,预览5页

还剩页未读, 继续阅读

声明:本站为文档C2C交易模式,即用户上传的文档直接被用户下载,本站只是中间服务平台,本站所有文档下载所得的收益归上传人(含作者)所有。玖贝云文库仅提供信息存储空间,仅对用户上传内容的表现方式做保护处理,对上载内容本身不做任何修改或编辑。若文档所含内容侵犯了您的版权或隐私,请立即通知玖贝云文库,我们立即给予删除!
分类:图书资源 价格:10玖币 属性:51 页 大小:698.07KB 格式:PDF 时间:2025-05-06

开通VIP享超值会员特权

  • 多端同步记录
  • 高速下载文档
  • 免费文档工具
  • 分享文档赚钱
  • 每日登录抽奖
  • 优质衍生服务
/ 51
客服
关注