certainty is defined as the dispersion degree of the feature vectors caused by the
distribution of the network parameters. Third, the data uncertainty and model
uncertainty are jointly learned in a unified network, and they serve as two cri-
teria to assess whether the result is reliable: as shown in Fig. 1(b), if a query
image is high-quality (low data uncertainty) and the model is confident in its
prediction of the query image (low model uncertainty), the final result will be
assessed as reliable. Experiments under risk-controlled settings and multi-query
settings show the proposed reliability assessment is effective.
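
The two-criteria assessment described above can be illustrated with a minimal sketch. It is not the network proposed in this paper: it assumes that several feature vectors are already available for one query image (for example, from forward passes with different draws of the network parameters), measures their dispersion as the mean per-dimension variance, and compares both uncertainties against thresholds. The names feature_dispersion, is_reliable, data_thresh, and model_thresh, and the threshold values, are hypothetical.

```python
from statistics import fmean, pvariance


def feature_dispersion(feature_samples):
    """Mean per-dimension variance across several feature vectors of one image.

    feature_samples: list of equal-length lists, one per stochastic forward
    pass (assumption: each pass uses a different draw of the parameters).
    """
    dims = list(zip(*feature_samples))  # regroup values by feature dimension
    return fmean([pvariance(d) for d in dims])


def is_reliable(data_uncertainty, model_uncertainty, data_thresh, model_thresh):
    """A result counts as reliable only if both uncertainties are low."""
    return data_uncertainty <= data_thresh and model_uncertainty <= model_thresh


# Toy example: three passes over the same query image, 2-D features.
stable = [[1.0, 2.0], [1.0, 2.0], [1.0, 2.0]]
unstable = [[1.0, 2.0], [3.0, 0.0], [5.0, 4.0]]

print(is_reliable(0.1, feature_dispersion(stable), 0.5, 0.5))    # True
print(is_reliable(0.1, feature_dispersion(unstable), 0.5, 0.5))  # False
```

In this toy case both queries have the same low data uncertainty, so the decision is driven entirely by how much the sampled features disagree.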
The major contributions are summarized as follows: (1) We propose an uncertainty-aware
learning (UAL) method that can provide reliability-aware predictions
for the ReID task. (2) We introduce a sampling-free data uncertainty learning
method, which can improve the representation by explicitly inhibiting the neg-
ative impact of low-quality samples during training without any external clues.
(3) We propose a unified network to jointly learn data uncertainty and model
uncertainty. As far as we know, this is the first work to apply data uncertainty
and model uncertainty to the ReID task simultaneously. (4) Experiments under
risk-controlled settings and multi-query settings show the reliability assessment
is effective. Our method also shows superior performance in single-query settings.
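
As a rough illustration of what a risk-controlled evaluation can look like, the sketch below keeps only the fraction of queries with the lowest uncertainty and measures top-1 accuracy on that subset. This is one common selective-prediction setup and is an assumption here; the exact protocol used in the experiments is not specified in this section. The function name accuracy_at_coverage and the toy scores are hypothetical.

```python
def accuracy_at_coverage(uncertainties, correct, coverage):
    """Top-1 accuracy over the `coverage` fraction of least-uncertain queries.

    uncertainties: one score per query (higher means less reliable).
    correct: one bool per query, True if the top-1 match is right.
    coverage: fraction of queries to keep, in (0, 1].
    """
    if not 0.0 < coverage <= 1.0:
        raise ValueError("coverage must be in (0, 1]")
    # Query indices sorted from most to least reliable.
    order = sorted(range(len(uncertainties)), key=lambda i: uncertainties[i])
    keep = max(1, round(coverage * len(order)))  # keep at least one query
    kept = order[:keep]
    return sum(correct[i] for i in kept) / keep


# Toy data: the two most uncertain queries are the ones retrieved wrongly.
unc = [0.1, 0.9, 0.2, 0.8, 0.3]
hit = [True, False, True, False, True]

print(accuracy_at_coverage(unc, hit, 1.0))  # 0.6
print(accuracy_at_coverage(unc, hit, 0.6))  # 1.0
```

If the uncertainty estimate is informative, accuracy should rise as coverage shrinks, as it does in this toy example.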
2 Related Work
Person ReID. Person ReID aims to associate a target person across different
camera views. Existing methods can be broadly divided into two categories:
hand-crafted methods [33, 51] and deep learning methods [30, 18, 20, 4, 15, 52]. The
key challenge is the large appearance variation caused by imperfect detection,
different camera views, poses, and occlusions. To address these issues, several
works [12, 13, 35, 47, 8, 62, 14, 32, 21] learn local features to cope
with the appearance variation. Although these methods are effective to some
extent, they are reliability-agnostic: the model outputs a prediction for
any probe, but it does not indicate how reliable that prediction is.
Uncertainty in person ReID. There are mainly two types of uncertainty:
data uncertainty and model uncertainty [7, 23, 40, 24]. Many tasks have considered
uncertainty to improve the robustness and interpretability of models,
such as face recognition [41, 25, 1], semantic segmentation [19, 24], and multi-view
learning [10]. In the ReID task, prior works [52, 55, 43, 22] consider data uncertainty
to alleviate the problem of label noise or data outliers. D-Net [52] maps each
person image to a Gaussian distribution in the latent space, with the variance
indicating the data uncertainty. PUCNN [43] extends the data uncertainty in
D-Net to part-level features. UNRN [55] incorporates uncertainty into
a teacher-student framework to evaluate the reliability of the predicted pseudo
labels for unsupervised domain adaptive (UDA) person ReID. The uncertainty is
estimated as the inconsistency between the teacher and student models in their
predicted soft multi-labels. UMTS [22] designs an uncertainty-aware knowledge
distillation loss to transfer the knowledge of the multi-shot model into the
single-shot model.
Among these methods, the most relevant method to ours is D-Net [52]. Com-