
Tackling Instance-Dependent Label Noise with Dynamic
Distribution Calibration
Manyi Zhang
zhang-my21@mails.tsinghua.edu.cn
SIGS, Tsinghua University
Shenzhen, China
Yuxin Ren
ryx20@mails.tsinghua.edu.cn
SIGS, Tsinghua University
Shenzhen, China
Zihao Wang
wangziha21@mails.tsinghua.edu.cn
SIGS, Tsinghua University
Shenzhen, China
Chun Yuan∗
yuanc@sz.tsinghua.edu.cn
SIGS, Tsinghua University
Peng Cheng National Laboratory
Shenzhen, China
ABSTRACT
Instance-dependent label noise is realistic but rather challenging, where the label-corruption process depends on instances directly. It causes a severe distribution shift between the distributions of training and test data, which impairs the generalization of trained models. Prior works have put great effort into tackling this issue. Unfortunately, they either rely heavily on strong assumptions or remain heuristic without theoretical guarantees. In this paper, to address the distribution shift in learning with instance-dependent label noise, a dynamic distribution-calibration strategy is adopted. Specifically, we hypothesize that, before training data are corrupted by label noise, each class conforms to a multivariate Gaussian distribution at the feature level. Label noise produces outliers that shift this Gaussian distribution. During training, to calibrate the shifted distribution, we propose two methods based on the mean and covariance of the multivariate Gaussian distribution, respectively. The mean-based method works in a recursive dimension-reduction manner for robust mean estimation, and is theoretically guaranteed to train a high-quality model against label noise. The covariance-based method works in a distribution-disturbance manner, and is experimentally verified to improve model robustness. We demonstrate the utility and effectiveness of our methods on datasets with synthetic label noise and real-world unknown noise.
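As a toy illustration of the hypothesis above (our sketch, not the paper's algorithm; the dimensions, noise level, and the simple trimmed-mean estimator are illustrative stand-ins for the recursive dimension-reduction estimator), the following Python snippet shows how mislabeled instances shift the empirical mean of a Gaussian-distributed class, and how a robust estimate limits that shift:

import numpy as np

rng = np.random.default_rng(0)

# Clean features of one class, drawn from a multivariate Gaussian
# (the hypothesis stated in the abstract).
d = 16                                    # feature dimension (hypothetical)
true_mean = np.zeros(d)
clean = rng.multivariate_normal(true_mean, np.eye(d), size=900)

# Mislabeled instances act as outliers injected into the class.
outliers = rng.multivariate_normal(true_mean + 3.0, np.eye(d), size=100)
feats = np.vstack([clean, outliers])

# The naive empirical mean is pulled toward the outliers ...
naive_mean = feats.mean(axis=0)

# ... while a simple trimmed mean (one robust alternative; the paper's
# recursive dimension-reduction estimator is more sophisticated)
# discards the points farthest from the current estimate.
dist = np.linalg.norm(feats - naive_mean, axis=1)
keep = dist <= np.quantile(dist, 0.9)     # trim the 10% most distant points
robust_mean = feats[keep].mean(axis=0)

print("shift of naive mean :", np.linalg.norm(naive_mean - true_mean))
print("shift of robust mean:", np.linalg.norm(robust_mean - true_mean))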
CCS CONCEPTS
• Computing methodologies → Computer vision; Machine learning.
KEYWORDS
Instance-dependent label noise, distribution shift, distribution cali-
bration, robustness
∗Corresponding author.
This work is licensed under a Creative Commons Attribution 4.0 International License.
MM ’22, October 10–14, 2022, Lisboa, Portugal
©2022 Copyright held by the owner/author(s).
ACM ISBN 978-1-4503-9203-7/22/10.
https://doi.org/10.1145/3503161.3547984
ACM Reference Format:
Manyi Zhang, Yuxin Ren, Zihao Wang, and Chun Yuan. 2022. Tackling
Instance-Dependent Label Noise with Dynamic Distribution Calibration. In
Proceedings of the 30th ACM International Conference on Multimedia (MM
’22), Oct. 10–14, 2022, Lisboa, Portugal. ACM, New York, NY, USA, 10 pages.
https://doi.org/10.1145/3503161.3547984
1 INTRODUCTION
Learning with label noise is one of the hottest topics in weakly-supervised learning [7, 11, 36]. In real life, large-scale datasets are likely to contain label noise, mainly because manual high-quality labeling is expensive [42, 43, 59]. Large-scale datasets are often collected from crowdsourcing platforms [33] or crawled from the internet [67], which inevitably introduces label noise.
Instance-dependent label noise [5, 6, 63], where the label-flipping process depends on instances/features directly, is more realistic and applicable than instance-independent label noise. This is because, in real-world scenarios, an instance whose features contain less discriminative information or are of poorer quality is more likely to be mislabeled. Instance-dependent label noise is also more challenging owing to its inherent complexity [76]. Compared with instance-independent label noise, it leads to a more severe distribution-shift problem for trained models [2]; that is, this kind of noise makes the distributions of training and test data significantly different. Models trained with instance-dependent label noise therefore tend to generalize poorly on test data [62].
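For concreteness, here is a minimal sketch (our illustration, not a method from the paper; the feature-distance heuristic and all names are hypothetical) of how instance-dependent label noise can be simulated, with each sample's flip probability depending on its own features:

import numpy as np

rng = np.random.default_rng(0)

def corrupt_labels(features, labels, num_classes, noise_scale=0.3):
    """Toy instance-dependent corruption: each sample's flip probability
    is a function of its own features, so less typical instances are
    mislabeled more often. Purely illustrative."""
    noisy = labels.copy()
    for c in range(num_classes):
        idx = np.where(labels == c)[0]
        center = features[idx].mean(axis=0)
        # Distance to the class mean as a proxy for "ambiguity".
        dist = np.linalg.norm(features[idx] - center, axis=1)
        flip_prob = noise_scale * dist / dist.max()  # instance-dependent rate
        flip = rng.random(len(idx)) < flip_prob
        # Flip the selected samples to a random *other* class.
        noisy[idx[flip]] = (c + rng.integers(1, num_classes, flip.sum())) % num_classes
    return noisy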
Recent methods for handling instance-dependent label noise are generally divided into two main categories. The first is to estimate the instance-dependent transition matrix [20, 60, 63], which characterizes how clean labels flip into noisy labels. However, these methods are limited to the case with a small number of classes [64, 76]. Besides, they rely heavily on strong assumptions to achieve an accurate estimation, e.g., assumptions on anchor points, bounded noise rates, and extra trusted data. It is hard or even infeasible to check these assumptions, which hinders the validity of these methods [20].
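For reference, the classical instance-independent form of this idea is forward loss correction (Patrini et al., 2017); the sketch below is a minimal PyTorch rendition under the assumption that a single class-conditional transition matrix T is known. Instance-dependent methods must instead handle a per-instance T(x), which is precisely what makes the estimation hard:

import torch
import torch.nn.functional as F

def forward_corrected_loss(logits, noisy_labels, T):
    """Forward loss correction: given a transition matrix T with
    T[i, j] = P(noisy = j | clean = i), train the model against the
    implied noisy-label posterior rather than the raw predictions."""
    clean_post = F.softmax(logits, dim=1)   # model's clean-label posterior
    noisy_post = clean_post @ T             # implied noisy-label posterior
    return F.nll_loss(torch.log(noisy_post + 1e-8), noisy_labels)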
The second tends to heuristically identify clean data based on the memorization effect of deep models [1] for subsequent operations, e.g., label correction [52, 75].
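The common realization of the memorization effect is small-loss selection: early in training, deep networks fit clean labels before noisy ones, so low-loss samples are treated as clean. A minimal sketch, assuming the noise rate is known or estimated (function and variable names are illustrative):

import torch

def select_small_loss(losses, noise_rate):
    """Keep the (1 - noise_rate) fraction of samples with the smallest
    per-sample loss and treat them as putatively clean."""
    num_keep = int((1.0 - noise_rate) * losses.numel())
    _, idx = torch.topk(losses, num_keep, largest=False)  # smallest losses
    return idx  # indices of putatively clean samples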
Unfortunately, due to the complexity of instance-dependent label noise, label correction will be much weaker in the