Computing the Matching Distance of 2-Parameter Persistence Modules from Critical Values Asilata Bapat

2025-04-29 1 0 1.34MB 30 页 10玖币

侵权投诉

Computing the Matching Distance of 2-Parameter Persistence

Modules from Critical Values

Asilata Bapat∗

asilata.bapat@anu.edu.au

Robyn Brooks †

robyn.brooks@utah.edu

Celia Hacker‡

celia.hacker@epfl.ch

Claudia Landi§

claudia.landi@unimore.it

Barbara I. Mahler ¶

bmahler@kth.se

Elizabeth R. Stephenson ‖

elizasteprene@gmail.com

September 19, 2024

Abstract

The exact computation of the matching distance for multi-parameter persistence modules is an

active area of research in computational topology. Achieving an easily obtainable exact computation

of this distance would allow multi-parameter persistent homology to be a viable option for data

analysis. In this paper, we provide theoretical results for the computation of the matching distance

in two dimensions along with a geometric interpretation of the lines through parameter space realizing

this distance. In particular, we show that it is possible to avoid considering vertical and horizontal

lines. To prove this, lines with slope 1 play an essential role.

1 Introduction

Persistent Homology is known as one of the most-often used tools in Topological Data Analysis, allowing

one to use topological invariants to understand the underlying shape of data. In particular, single pa-

rameter persistence yields a summary of data through a one-dimensional ﬁltration, allowing an overview

of the data at many diﬀerent scales. Single parameter persistent homology has been the subject of much

study and has proven to be useful in many applications [2, 3, 16, 24, 28].

However, some data requires ﬁltration along multiple parameters to fully capture its information:

this is the role of multi-parameter persistent homology [10, 9]. In some contexts, it can be helpful to use

multiple parameters to capture the details of the data [11, 19, 26, 17, 29]. Additionally, single parameter

persistent homology is not robust to outliers in a point cloud; these outliers can lead to a misinterpretation

of the persistent homology, with the unnecessary creation or destruction of homology classes. Multi-

parameter persistence can be a natural ﬁx to this problem by adding a dimension depending on the

density of the samples (see, e.g., [15, 6]). Unfortunately, understanding, visualizing, and computing

invariants in multi-parameter persistent homology remains a diﬃcult task both mathematically and

computationally. This diﬃculty holds as well when it comes to computing distances between such

invariants.

In the single parameter case, there are several ways to compare persistent homology modules — such

as the bottleneck distance and Wasserstein distances — which exhibit some stability properties with

respect to variations in the input [14]. For more than one parameter, there are also several deﬁnitions of

∗Australian National University, Canberra, Australia

†Boston College, Massachusetts, United States

‡EPFL, Lausanne, Switzerland

§Universit`a di Modena e Reggio Emilia, DISMI, Italy

¶KTH Royal Institute of Technology Stockholm, Sweden

‖Institute of Science and Technology Austria, Klosterneuburg, Austria

arXiv:2210.12868v2 [math.AT] 17 Sep 2024

distances between persistence modules. Amongst them, the matching distance [27] attracts the attention

of multi-parameter persistence practitioners.

To compute the matching distance between two n-parameter persistence modules, one uses the fact

that restricting an n-parameter ﬁltration to any aﬃne line of positive slope through the parameter

space yields a 1-parameter ﬁltration. There is therefore a corresponding restriction of the n-parameter

persistence module to a 1-parameter module along that line. This construction allows for the knowledge

and computational methods from the 1-dimensional case to be applied to the n-dimensional case.

Indeed, following this idea, the matching distance is deﬁned as a supremum of the one-dimensional

bottleneck distance, over the collection of all lines of positive slope in the parameter space, i.e.,

dmatch(M, N) := sup

L:u=s ⃗m+b

ˆmL·dB(dgm ML,dgm NL),

where ˆmLis a normalization term explained in Section 2. However, exactly computing this distance is

not an easy task given the nature of its deﬁnition. As a ﬁrst step towards an exact computation, several

approximations of this distance have been provided for 2-dimensional persistence modules [4, 22]. The

exact computation of the matching distance is currently only possible for 2-dimensional modules [20],

with recent computational improvements in terms of time complexity [5]. These methods exploit the

duality of points and lines in the plane.

While approximated computations can be very useful in practice, computing the exact value of the

matching distance has its own merits. One of them is that it can distinguish modules for which the rank

invariant is a complete invariant, such as rectangle decomposable modules, as the matching distance is

a metric for the rank invariant. [7]

In this paper, we propose a method to compute the matching distance based on a reﬁnement of

the framework developed in [1], in which we compute the rank invariant of a multi-persistence module

from a ﬁnite set of critical parameter values. These critical parameter values capture all the changes

in homology occurring throughout the multi-ﬁltration. They may also be used to partition the set of

positive lines in the parameter space into equivalence classes, where each equivalence class maintains the

same birth and death ordering within the restricted module for all possible homology classes.

Based on these results, we formulate the idea that the critical values must be relevant to the choice

of lines for the computation of the matching distance. Leveraging the framework of [1] to compute the

matching distance allows us to reduce the number of lines necessary to compute the matching distance

to a ﬁnite set, thus reducing the computation to a maximum rather than a supremum without exploiting

the point-line duality used in [20]. Through some examples, we show that considering only lines passing

through pairs of critical values is not suﬃcient. This is because the deﬁnition of matching distance

depends on the bottleneck distance, and lines in the same equivalence class might not be such that the

bottleneck distance is always achieved by the same pairing of births/deaths, as will be further explained

in Section 3.

To overcome this problem, we analyze where switches might happen in the matching which achieves

the bottleneck distance, identifying a set of points in the parameter space which we later on refer to

as the set of switch points. This set allows us to reﬁne our equivalence relation on the set of positive

lines (see Section 4) by partitioning lines through the parameter space using the union of the critical

parameter values with the switch points. Using this union, it is possible to identify all the lines at which

the matching distance can be realised. We explain in detail how to compute all relevant parameter values

and show that the matching distance is attained either at: a line through a pair of points in this union,

or a line of diagonal slope through exactly one of the points in the union (see Theorem 3.1).

The contribution of this paper is to advance the state of the art, although still restricted to two dimen-

sions, in two ways: computing the matching distance in a way which is both geometrically interpretable

and implementable. In contrast to other methods such as [20, 5], we provide a geometric understanding

of diﬀerent lines, including horizontal, vertical, and diagonal lines, as well as lines through critical values,

and their contribution to the matching distance. In addition, the method we propose leads to explicit

computations of important lines, which will lead to an algorithm that can be implemented.

The paper is structured as follows. In Section 2, we give necessary background from multi-parameter

persistent homology, as well as some of the concepts coming from the framework developed in [1]. We

also provide some examples of two-parameter persistence modules for which we explicitly show which

lines are necessary to compute the matching distance. In Section 3, we provide the statement and proof

of our main theoretical result Theorem 3.1 and include a discussion on the role of vertical and horizontal

lines. In Section 4, we provide guidelines for the computation of the switch points, which are necessary

to the exact computation of the matching distance. Finally, in Section 6, we discuss future directions of

study.

2 Background

While we will subsequently specialize to the case of n= 2 parameters, we begin with some general

deﬁnitions to set the scene.

2.1 Notation and deﬁnitions

Deﬁnition 2.1 (Parameter Spaces).Let n∈N. If n= 1, endow Rwith the usual order ≤. If n > 1,

take the following partial order on Rn: for u= (ui), v = (vi)∈Rn, set u⪯vif and only if ui≤vifor

i= 1, . . . , n. The poset (Rn,⪯) is called an nD parameter space.

For n > 1, the nD parameter space can be sliced by lines with positive slope.

A line L⊂Rnis a 1Dparameter space when considered parameterized by s∈Ras L:u=⃗ms +b

where ⃗m ∈Rnand b∈Rn.Lhas positive slope if each coordinate miof ⃗m is positive: mi>0.

Throughout the paper, we consider the following standard normalisation for parameterizations of lines:

∥⃗m∥∞:= max{mi|i= 1, . . . , n}= 1,

i=1

bi= 0 (1)

This normalization is the one used in papers such as [12, 23]. Other choices are possible. Changing

the normalization would impact all the intermediate computations that follow but not the overall results.

Deﬁnition 2.2 (Persistence Modules).Let Fbe a ﬁxed ﬁeld. An (n-parameter) persistence module M

over the parameter space Rnis an assignment of an F-vector space Muto each u∈Rn, and linear maps,

called transition or internal maps,iM(u, v): Mu→Mvto each pair of points u⪯v∈Rn, satisfying the

following properties:

•iM(u, u) is the identity map for all u∈Rn.

•iM(v, w)◦iM(u, v) = iM(u, w) for all u⪯v⪯w∈Rn.

In this paper, persistence modules will always assumed to be ﬁnitely presented. To ﬁx the ideas, we

will assume that they are obtained applying homology in a ﬁxed degree, say q, over a ﬁxed coeﬃcient

ﬁeld F, to a tame and one-critical (cf. [1]) n-parameter ﬁltration Kof a ﬁnite simplicial complex: M=

Hq(K;F).

In the particular case when n= 1, a ﬁnitely presented persistence module Mcan be uniquely

decomposed as a ﬁnite sum of interval modules [30].

Deﬁnition 2.3 (Interval Module).An interval module is a 1-parameter persistence module of the form

I[b, d) with b<d≤∞∈R:= R∪ {∞} such that, for every s≤t∈R,

I[b, d)s=Fif b≤s < d

0 otherwise , I[b, d)(s≤t) = idFif b≤s≤t<d

0 otherwise

The interval [b, d) can be represented as a point (b, d) in R×R, above the diagonal. This encoding

permits deﬁning persistence diagrams.

Deﬁnition 2.4 (Persistence Diagram).If M∼

=Lj∈JI[bj, dj)mj, then the persistence diagram of M,

denoted dgm M, is the ﬁnite multi-set of points (bj, dj) of R×Rwith multiplicity mjfor j∈J.

We now review the deﬁnitions of bottleneck distance between persistence diagrams and matching

distance between two n-parameter persistence modules Mand N.

When n= 1, the bottleneck distance dBis an easily computable extended metric on persistence

diagrams [21], deﬁned as follows.

Deﬁnition 2.5 (Bottleneck Distance).Let M,Nbe two 1-parameter persistence modules, with persis-

tence diagrams dgm Mand dgm N.

Amatching between dgm Mand dgm Nis a multi-bijection σ:P→Qbetween the points of a

multi-subset Pin dgm Mand a multi-subset Qin dgm N.

The cost of a matching σ, denoted c(σ), is the greatest among the costs of each matched pair of

points, taken to be equal to the ℓ∞distance of the corresponding pair of points in R×R, with the

convention that ∞−∞= 0 and ∞ − a=∞for every a∈R, and the costs of each unmatched point,

taken to be equal to the ℓ∞distance of that point to the diagonal of R2:

c(σ) := max max

p∈P∥p−σ(p)∥∞,max

p /∈P`Q

p2−p1

2.

The bottleneck distance dBis deﬁned as the smallest possible cost of any matching σbetween dgm M

and dgm N:

dB(dgm M, dgm N) := min

σ:P→Q

P⊆dgm M,Q⊆dgm N

c(σ).

Observe that a matching σthat pairs a point (b, d)∈R2with a point (b′,∞)∈R×Rhas cost equal

to inﬁnity. Hence, the bottleneck distance between two persistence diagrams with a diﬀerent number of

points at inﬁnity cannot be ﬁnite. On the contrary, matching points of the same type always gives a

ﬁnite (more convenient) cost that can be expressed as follows.

Remark 2.6. By deﬁnition, the cost c(σ) of σ:P→Qtakes one of the two forms.

•If c(σ) is realized by matching p∈Pto q=σ(p), then c(σ) = |s−t|, where sand tare either both

abscissas or both ordinates of pand q, respectively;

•If c(σ) is realized by some pnot in P`Q, then c(σ) = |s−t|

2, where sand tare the abscissa and

the ordinate of p.

Brieﬂy, we can say that c(σ) is attained by some |s−t|

δ, with δ∈ {1,2}and s, t coordinates of points

in dgm M∪dgm N.

When the number of parameters is n≥2, we can use the bottleneck distance to deﬁne an extended

pseudo-metric between persistence modules Mand Nvia restrictions to lines with positive slope.

The restriction of Mto a line of positive slope L⊆Rnis the persistence module MLthat assigns

Muto each u∈L, and whose transition maps iML(u, v) : (ML)u→(ML)vfor u⪯v∈Lare the same

as in M. Once a parameterization u=⃗ms +bof Lis ﬁxed, the persistence module MLis isomorphic to

the 1-parameter persistence module, by abuse of notation still denoted by ML, that

•assigns to each s∈Rthe vector space (ML)s:= Mu, and

•whose transition map between (ML)sand (ML)tfor s≤tis equal to that of Mbetween Muand

Mvwith u=⃗ms +band v=⃗mt +b.

Deﬁnition 2.7 (Matching Distance).The matching distance between Mand Nis deﬁned by

dmatch(M, N) := sup

L:u=s ⃗m+b

ˆmL·dB(dgm ML,dgm NL)

where ˆmL:= min{mi|i= 1, . . . , n}>0 is the minimal coordinate of the direction vector ⃗m of L, and

Lvaries among all the lines of Rnwith positive slope, taken with standard normalization.

The role of the weight ˆmLin the deﬁnition of the matching distance is that of ensuring stability

of persistence modules against perturbations of the ﬁltrations that originate them (cf. [12, 23]). Such

weight would vanish for lines parallel to the coordinate axes. Therefore, such lines are not considered in

the deﬁnition of the matching distance.

In the context of the matching distance between multi-parameter persistence modules, it is more

convenient to include the factor ˆmLwhen computing the cost of a matching σLbetween the points of

a multi-subset in dgm MLand the points of a multi-subset in dgm NLfor a ﬁxed line Lwith positive

slope. Therefore, we introduce the following notation:

cost(σL) := ˆmLc(σL).

Examples of computation of the matching distance in some simple cases are given in Section 2.3.

By deﬁnition, the matching distance is computed by taking the supremum of the bottleneck distance

over all (inﬁnitely many) lines with positive slope through the parameter space. However, in [1], we

have shown that this set of lines may be partitioned into a ﬁnite number of equivalence classes. For the

two-parameter case (i.e., for n= 2), we will use these equivalence classes to generate a ﬁnite set of lines

from which one may compute the matching distance. Therefore, we introduce notation and deﬁnitions

necessary to understand these equivalence classes below.

Henceforth in this paper, we always assume n= 2, which is the case for which we have extremely

explicit results. We introduce the following deﬁnitions (with associated ﬁgures) following [1].

Deﬁnition 2.8 (Positive Cone).For every point uin R2, let S+(u) be the positive cone with vertex u:

S+(u) := {v∈R2:u⪯v}.

The boundary of the positive cone, ∂S+(u), decomposes into open faces. In particular, ∂S+(u) can

be partitioned by non-empty subsets Aof {1,2}in the following way. For ∅ ̸=A⊆ {1,2}, deﬁne

SA(u) := 









{u}if A={1,2},

{(x1, x2)∈R2|x1=u1, x2> u2}if A={1},

{(x1, x2)∈R2|x1> u1, x2=u2}if A={2}.

u=S{1,2}S{2}

S{1}S+(u)

Figure 1: The positive cone S+(u) of u∈R2and the decomposition of its boundary into S{1},S{2}and

S{1,2}, which correspond respectively to the vertical boundary, horizontal boundary, and u.

Deﬁnition 2.9. A line Lwith positive slope intersects ∂S+(u) in exactly one point, which we call the

push of uonto L, and denote by pushL(u):

pushL(u) := L∩∂S+(u).

pushL(u)

Figure 2: The push of ualong the line L. In this case, AL

u={2}, and so uis to the left of L.

Because of the partition of ∂S+(u) by its open faces, there is a unique non-empty subset of {1,2},

denoted AL

u, such that pushL(u) = L∩SAL

u(u). Concretely, in the plane, AL

u={1}means that uis

文档加载中……请稍候！
如果长时间未打开，您也可以点击刷新试试。

下载文档到电脑，查找使用更方便

10 玖币 0人已下载

立即下载

摘要：

ComputingtheMatchingDistanceof2-ParameterPersistenceModulesfromCriticalValuesAsilataBapat∗asilata.bapat@anu.edu.auRobynBrooks†robyn.brooks@utah.eduCeliaHacker‡celia.hacker@epfl.chClaudiaLandi§claudia.landi@unimore.itBarbaraI.Mahler¶bmahler@kth.seElizabethR.Stephenson‖elizasteprene@gmail.comSeptember...

展开>> 收起<<

Computing the Matching Distance of 2-Parameter Persistence Modules from Critical Values Asilata Bapat.pdf

共30页,预览5页

还剩页未读，继续阅读

声明：本站为文档C2C交易模式，即用户上传的文档直接被用户下载，本站只是中间服务平台，本站所有文档下载所得的收益归上传人(含作者)所有。玖贝云文库仅提供信息存储空间，仅对用户上传内容的表现方式做保护处理，对上载内容本身不做任何修改或编辑。若文档所含内容侵犯了您的版权或隐私，请立即通知玖贝云文库，我们立即给予删除！

Computing the Matching Distance of 2-Parameter Persistence Modules from Critical Values Asilata Bapat

相关推荐

开通VIP享超值会员特权

作者详情

相关内容

热门标签

举报选择: