
Topological Singularity Detection at Multiple Scales
3. Related Work
While manifold learning is concerned with the development
of algorithms that extract geometric information under the
assumption that the given data lie on a manifold, recent
work has started to question this assumption. Brown et al. (2023),
for instance, introduce the union of manifolds hypothesis,
which augments the manifold hypothesis to spaces that can
be modelled as (disjoint) unions of manifolds. Intrinsic
dimension is thus allowed to vary across connected components
of such a space. However, singularities are excluded
under this assumption, whereas our method detects the correct
intrinsic dimension for large classes of singular spaces.
We adopt a fundamentally ‘singularity-centric’ perspective
in this paper and argue that a multi-scale analysis of the local
geometry and topology of data is necessary. In this context,
methods from topological data analysis have started attracting
attention in machine learning (Hensel et al., 2021). This
is particularly due to persistent homology, which captures
geometrical and topological properties of the underlying
data set on different scales (Bubenik et al., 2020; Turkeš
et al., 2022). The idea of tracking objects on multiple scales
can be traced back at least to Koenderink & van Doorn
(1986) and Lindeberg (1994), with scale space theory playing an
eminent role in computer vision. However, the utility of
persistent homology in the context of geometric singularities
in data has only been explored more recently, since early work in
persistent homology focuses predominantly on the simplification
of functions on manifold domains (Edelsbrunner et al.,
2002). While some research already discusses the utility
of persistent homology for general unsupervised data analysis
workflows (Chazal et al., 2013; Rieck & Leitte, 2017;
2016; 2015), it focuses more on global structures, whereas
singularities are inherently local. A notable exception is the
work by Wang et al. (2011), which analyses local branching
and global circular features. We give a brief overview of
methods in the emerging field of topology-driven singularity
detection, outlining the differences to our approach below.
Topology-driven singularity detection. Several works assume
a local perspective on homology to detect information
about the intrinsic dimensionality of the data or the presence
of certain singularities. Rieck et al. (2020) use pre-defined
stratifications and persistent intersection homology, a technique
developed by Bendich & Harer (2011), whereas Fasy
& Wang (2016) and Bendich (2008) both develop persistent
variants of local homology. By contrast, Stolz et al. (2020)
approximate local homology as the absolute homology of
a small annulus of a given neighbourhood, resulting in an
algorithm for geometric anomaly detection (which requires
knowing the intrinsic dimension of the data set). Moreover,
Bendich et al. (2007) employ persistence vineyards, i.e. continuous
families of persistence diagrams, to assess the local
homology of a point in a stratified space, whereas Dey et al.
(2014) use local homology to estimate the (global) intrinsic
dimension of hidden, possibly noisy manifolds.
Key differences to existing approaches. While existing
methods overall underscore the relevance of a local perspective,
as well as the use of notions such as stratified spaces,
our approach differs from them in essential components.
In comparison to all aforementioned contributions, we capture
additional local geometric information: we consider
multiple scales of locality in a persistent framework for
local homology. Concerning the overall construction, Stolz
et al. (2020) is the closest to our method. However, the
authors assume that the intrinsic dimension is known and
the proposed algorithm uses one global scale, whereas our
approach (i) operates in a multi-scale setting, (ii) provides
local estimates of intrinsic dimensionality of the data space,
and (iii) incorporates model spaces that serve as a comparison.
We can thus measure the deviation from an idealised
manifold, requiring fewer assumptions on the structure of
the input data (see Appendix A.7 for a comparison).
4. Methods
Our framework TARDIS (Topological Algorithm for Robust
DIscovery of Singularities) consists of two parts: (i) a
method to calculate a local intrinsic dimension of the data,
and (ii) Euclidicity, a measure for assessing the multi-scale
deviation from a Euclidean space. TARDIS is based on the
assumption that the intrinsic dimension of data may not be
constant across the data set, and is thus best described by
local measurements, i.e. measurements in a small neighbourhood
of a given point. Since there is no canonical choice for
the magnitude of such a neighbourhood, TARDIS analyses
data on multiple scales. Our main idea involves constructing
a collection of local (punctured) neighbourhoods for
varying locality scales, and calculating their topological
features. This procedure allows us to approximate local topological
features (specifically, local homology) of a given
point, which we use to measure the intrinsic dimensionality
of a space. Moreover, by calculating the distance to
Euclidean model spaces, we can detect singularities
in a wide range of input data sets. For the subsequent
description of TARDIS, we only assume that data can be
represented as a finite metric space (i.e. as a point cloud).
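
As a minimal sketch of the multi-scale neighbourhood construction described above, the following Python snippet collects, for a single point x, the annuli {y ∈ X | r ≤ d(x, y) ≤ s} over a small grid of radii. It assumes the Euclidean metric and NumPy; the function names `intrinsic_annulus` and `multiscale_annuli`, the toy point cloud, and the radius grids are illustrative choices and not part of TARDIS itself. Computing topological features of each annulus (e.g. via persistent homology) would be a subsequent step that is not shown here.

```python
import numpy as np


def intrinsic_annulus(X, i, r, s):
    """Indices of points y in X with r <= d(x, y) <= s, where x = X[i].

    Assumption: d is the Euclidean distance; any other metric on the
    point cloud could be substituted here.
    """
    dists = np.linalg.norm(X - X[i], axis=1)
    return np.flatnonzero((dists >= r) & (dists <= s))


def multiscale_annuli(X, i, inner_radii, outer_radii):
    """Annuli around X[i] for every pair (r, s) with r < s."""
    return {
        (r, s): intrinsic_annulus(X, i, r, s)
        for r in inner_radii
        for s in outer_radii
        if r < s
    }


# Toy point cloud (hypothetical): five points on a line, centre at index 2.
X = np.array([[-2.0, 0.0], [-1.0, 0.0], [0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])
annuli = multiscale_annuli(X, 2, inner_radii=[0.5, 1.5], outer_radii=[1.0, 2.0])

for (r, s), idx in sorted(annuli.items()):
    print(f"r={r}, s={s}: point indices {idx.tolist()}")
```

Choosing r > 0 punctures the neighbourhood, since the centre point itself has distance 0 and is therefore excluded from every annulus.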
4.1. Persistent Intrinsic Dimension
For a finite metric space $(X, d)$ and $x \in X$, let
$B_r^s(x) := \{y \in X \mid r \leq d(x, y) \leq s\}$ denote the intrinsic annulus
of $x$ in $X$ with respect to radii $r$ and $s$. Moreover, let
$\mathcal{F}$ denote a procedure that takes as input a finite metric
space and outputs an ascending filtration of topological
spaces, such as a Vietoris–Rips filtration. By applying
$\mathcal{F}$ to the intrinsic annulus of $x$, we obtain a tri-filtration