A. Lidar Place Recognition
LPR utilising 3D point clouds has been explored extensively in recent years. LPR approaches identify similar places (revisited areas) by encoding high-dimensional point clouds into discriminative embeddings (often referred to as descriptors). Handcrafted LPR methods [19]–[23] generate local descriptors by segmenting point clouds into patches, or global descriptors that capture the relationships between all the points in a point cloud.
Recent state-of-the-art LPR approaches have been dominated by deep learning-based architectures due to their impressive performance [2]–[6], [24]–[28]. These approaches typically utilise a backbone architecture to extract local features from the point cloud, which are then aggregated into a global descriptor. The specific design of these components varies significantly between works; PointNet [24], graph neural networks [3], transformers [2], [6], [25], and sparse-voxel convolutional networks [4]–[6], [26] have all been proposed as local feature extractors, while aggregation methods include NetVLAD [29], Generalised Mean Pooling (GeM) [30], and second-order pooling [5], [31].
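As an illustration of the aggregation step, GeM pooling [30] reduces a set of local features to a single global descriptor via a learnable power mean. A minimal NumPy sketch of the idea (the function name and default exponent here are ours, not taken from [30]):

```python
import numpy as np

def gem_pool(features: np.ndarray, p: float = 3.0, eps: float = 1e-6) -> np.ndarray:
    """Generalised Mean (GeM) pooling over a set of local features.

    features: (num_points, dim) array of non-negative local features.
    p = 1 recovers average pooling; large p approaches max pooling.
    """
    clamped = np.clip(features, eps, None)        # avoid raising 0 to 1/p
    return np.mean(clamped ** p, axis=0) ** (1.0 / p)

# Illustrative usage: 1024 local features of dimension 256.
local_features = np.random.rand(1024, 256)
descriptor = gem_pool(local_features)
print(descriptor.shape)  # (256,)
```

In learned aggregation layers the exponent p is typically a trainable parameter rather than a fixed constant.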
B. Uncertainty Estimation for Retrieval Tasks
Though there are a number of works exploring uncertainty
estimation in lidar object detection [15], [17], [18], [32],
[33] and point cloud segmentation [16], no existing works
explore uncertainty estimation for LPR. While recent work
by Knights et al. [34] shares a similar motivation to our work
– reliable performance in novel environments – they explore
incremental learning and specifically the issue of catastrophic
forgetting.
Image retrieval is a field of computer vision that shares a similar problem setup to LPR (though notably operating on images rather than point clouds). To estimate uncertainty in image retrieval, recent works learn an uncertainty estimate by adding additional heads to their network architecture [13], [14], [35]. Shi et al. [13] examine uncertainty-aware facial recognition, where face embeddings are modelled as Gaussian distributions by learning both a mean vector and a variance vector.
Warburg et al. [35] follow a similar approach, introducing
a ‘Bayesian Triplet Loss’ to extend training to also include
negative probabilistic embeddings. Most recently, STUN [14]
was proposed for uncertainty-aware visual place recognition.
STUN presents a student-teacher paradigm to learn a mean
vector and variance vector, using the average variance to
represent uncertainty [14]. Given the high performance of
these approaches in the related image retrieval task, we adapt
several of these methods to the LPR setting to serve as
baselines for our benchmark.
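A common thread in these works [13], [14], [35] is a head that predicts a mean embedding alongside a per-dimension variance, with the average variance serving as a scalar uncertainty score. A minimal sketch of this idea (the layer shapes, names, and initialisation below are illustrative, not taken from any of the cited works):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: a 512-d backbone feature mapped to a 256-d embedding.
FEAT_DIM, EMB_DIM = 512, 256
W_mu = rng.standard_normal((EMB_DIM, FEAT_DIM)) * 0.01
W_logvar = rng.standard_normal((EMB_DIM, FEAT_DIM)) * 0.01

def probabilistic_embedding(feat: np.ndarray):
    """Map a backbone feature to a Gaussian embedding (mean, variance).

    The second head predicts log-variance for numerical stability; the
    scalar uncertainty is the mean of the predicted per-dimension variances.
    """
    mu = W_mu @ feat
    mu /= np.linalg.norm(mu)           # unit-normalised mean embedding
    var = np.exp(W_logvar @ feat)      # strictly positive variances
    uncertainty = float(var.mean())    # scalar uncertainty score
    return mu, var, uncertainty

mu, var, u = probabilistic_embedding(rng.standard_normal(FEAT_DIM))
```

In practice both heads are trained jointly with the backbone; here untrained random weights simply demonstrate the data flow.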
III. METHODOLOGY
We first define the LPR task, and then formalise
uncertainty-aware LPR. Following this, we introduce the
baseline methods used for our benchmark.
A. Lidar Place Recognition
During LPR evaluation, a database contains point clouds with attached location information. This database can be a previously curated map, or can be built online as an agent explores an environment. Given a query, i.e., a new point cloud from an unknown location, an LPR model must localise the query by finding the matching point cloud in the database. If the predicted database location lies within a fixed distance threshold of the true query location, the prediction is considered correctly recalled. In this configuration, LPR performance is evaluated as the average recall over all tested queries [1], [4], [6].
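The evaluation protocol above can be sketched as follows, assuming nearest-neighbour retrieval in descriptor space; the array names and the 25 m threshold are illustrative, not prescribed by the cited works:

```python
import numpy as np

def average_recall(query_desc, query_pos, db_desc, db_pos, dist_thresh=25.0):
    """Fraction of queries whose nearest database descriptor lies within
    dist_thresh metres of the query's true position (recall@1)."""
    hits = 0
    for dq, pq in zip(query_desc, query_pos):
        # Retrieve the database entry with the closest descriptor.
        match = np.argmin(np.linalg.norm(db_desc - dq, axis=1))
        # Correct if the matched location is near the true query location.
        if np.linalg.norm(db_pos[match] - pq) <= dist_thresh:
            hits += 1
    return hits / len(query_desc)
```

Real evaluations replace the linear scan with an indexed nearest-neighbour search, but the recall computation is the same.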
We are motivated by the observation that perfect recall does not currently exist in LPR, and may not be attainable in some applications: when operating in dynamic and evolving environments, or in the presence of sensor noise, the potential for error always exists. We therefore argue that LPR models should additionally be able to estimate the uncertainty in their predictions, i.e., know when they don't know. We formalise this below as uncertainty-aware LPR.
B. Uncertainty-aware Lidar Place Recognition
In uncertainty-aware LPR, each predicted match between a
query and database entry should be accompanied by a scalar
uncertainty estimate U. This uncertainty represents the lack
of confidence in a predicted location.
Following the existing LPR setup, the primary goal in uncertainty-aware LPR remains to maximise correct localisations (i.e., recall). Uncertainty-aware LPR extends this by additionally requiring models to identify incorrect predictions by assigning them high uncertainty. We formulate this as a binary classification problem, where $U$ is compared to a decision threshold $\lambda$ to classify whether an LPR prediction is correct or incorrect:

$$F_\lambda(U) = \begin{cases} \text{Correct}, & U \leq \lambda \\ \text{Incorrect}, & U > \lambda \end{cases} \quad (1)$$
Incorrect predictions can arise for two reasons: (1) the
query is from a location that is not present in the database,
or (2) the query is from a location in the database, but the
LPR model selects the incorrect database match. We refer to
these two error types as ‘no match error’ and ‘incorrect match
error’ respectively, and analyse them in detail in Sec. V-B.
C. Baseline Approaches
To benchmark uncertainty-aware LPR, we adapt a number of existing uncertainty estimation techniques from related fields to the LPR setting.
Standard LPR Network: As explored in Sec. II-A, state-of-the-art LPR techniques utilise a deep neural network to reduce a point cloud to a descriptor $d$. Given a database of $N$ previous point clouds and locations, a standard LPR network converts this to a database $D$ of $N$ $L$-dimensional descriptors, $D = \{d_i \in \mathbb{R}^L\}_{i=1}^{N}$. During evaluation, a query point cloud $P_q \in \mathbb{R}^{M \times 3}$, with $M$ points, is reduced to a query descriptor $d_q \in \mathbb{R}^L$. This query descriptor is compared