Tackling Instance-Dependent Label Noise with Dynamic Distribution Calibration

Manyi Zhang
zhang-my21@mails.tsinghua.edu.cn
SIGS, Tsinghua University
Shenzhen, China
Yuxin Ren
ryx20@mails.tsinghua.edu.cn
SIGS, Tsinghua University
Shenzhen, China
Zihao Wang
wangziha21@mails.tsinghua.edu.cn
SIGS, Tsinghua University
Shenzhen, China
Chun Yuan
yuanc@sz.tsinghua.edu.cn
SIGS, Tsinghua University
Peng Cheng National Laboratory
Shenzhen, China
ABSTRACT
Instance-dependent label noise is realistic but rather challenging, where the label-corruption process depends on instances directly. It causes a severe distribution shift between the distributions of training and test data, which impairs the generalization of trained models. Prior works have put great effort into tackling this issue. Unfortunately, they either rely heavily on strong assumptions or remain heuristic without theoretical guarantees. In this paper, to address the distribution shift in learning with instance-dependent label noise, a dynamic distribution-calibration strategy is adopted. Specifically, we hypothesize that, before training data are corrupted by label noise, each class conforms to a multivariate Gaussian distribution at the feature level. Label noise produces outliers that shift the Gaussian distribution. During training, to calibrate the shifted distribution, we propose two methods based on the mean and covariance of the multivariate Gaussian distribution respectively. The mean-based method works in a recursive dimension-reduction manner for robust mean estimation, which is theoretically guaranteed to train a high-quality model against label noise. The covariance-based method works in a distribution-disturbance manner, which is experimentally verified to improve model robustness. We demonstrate the utility and effectiveness of our methods on datasets with synthetic label noise and real-world unknown noise.
CCS CONCEPTS
• Computing methodologies → Computer vision; Machine learning.
KEYWORDS
Instance-dependent label noise, distribution shift, distribution cali-
bration, robustness
Corresponding author.
This work is licensed under a Creative Commons Attribution
International 4.0 License.
MM ’22, October 10–14, 2022, Lisboa, Portugal
©2022 Copyright held by the owner/author(s).
ACM ISBN 978-1-4503-9203-7/22/10.
https://doi.org/10.1145/3503161.3547984
ACM Reference Format:
Manyi Zhang, Yuxin Ren, Zihao Wang, and Chun Yuan. 2022. Tackling
Instance-Dependent Label Noise with Dynamic Distribution Calibration. In
Proceedings of the 30th ACM International Conference on Multimedia (MM
’22), Oct. 10–14, 2022, Lisboa, Portugal. ACM, New York, NY, USA, 10 pages.
https://doi.org/10.1145/3503161.3547984
1 INTRODUCTION
Learning with label noise is one of the hottest topics in weakly-supervised learning [7, 11, 36]. In real life, large-scale datasets are likely to contain label noise. The main reason is that manual high-quality labeling is expensive [42, 43, 59]. Large-scale datasets are often collected from crowdsourcing platforms [33] or crawled from the internet [67], which inevitably introduces label noise.
Instance-dependent label noise [5, 6, 63], where the label-flipping process depends on instances/features directly, is more realistic and applicable than instance-independent label noise. This is because, in real-world scenes, an instance whose features contain less discriminative information or are of poorer quality may be more likely to be mislabeled. Instance-dependent label noise is also more challenging owing to its inherent complexity [76]. Compared with instance-independent label noise, it leads to a more severe distribution-shift problem for trained models [2]. That is to say, this kind of noise makes the distributions of training and test data significantly different. Models trained with instance-dependent label noise therefore tend to generalize poorly on test data [62].
Recent methods for handling instance-dependent label noise generally fall into two main categories. The first is to estimate the instance-dependent transition matrix [20, 60, 63], which characterizes how clean labels flip into noisy labels. However, these methods are limited to cases with a small number of classes [64, 76]. Besides, they rely heavily on strong assumptions to achieve an accurate estimation, e.g., assumptions on anchor points, bounded noise rates, and extra trusted data. It is hard or even infeasible to verify these assumptions, which hinders the validity of these methods [20]. The second category heuristically identifies clean data based on the memorization effect of deep models [1] for subsequent operations, e.g., label correction [52, 75]. Unfortunately, due to the complexity of instance-dependent label noise, label correction will be much weaker in the
arXiv:2210.05126v1 [cs.LG] 11 Oct 2022
Figure 1: Illustrations of the distribution-shift problem and our solution. The colored background areas represent the data distributions of different classes. (a) Clean data and ground-truth data distributions. (b) Due to label corruption, some data are mislabeled, resulting in noisy labels. The distributions learned from data with noisy labels mismatch the ground-truth distributions. (c) Label correction mainly corrects data far from decision boundaries. However, mislabeled data near decision boundaries are still not corrected, leaving the learned distributions biased. (d) With our distribution calibration, the learned distributions are closer to the ground-truth distributions, naturally yielding better generalization.
noisy region near the decision boundary. Therefore, the labels corrected by the current predictions would likely be erroneous [2]. In addition, the training data in the clean regions identified in this way are relatively monotonous [45, 62]. The corresponding distribution will be restricted to a small region of the global distribution, which introduces covariate shift^1 [48]. We detail the above issues in Figure 1. Therefore, the methods in both categories cannot handle well the distribution shift brought by instance-dependent label noise.
In this paper, to address the above issues, we propose a dynamic distribution-calibration strategy. We first assume that, before training data are corrupted by label noise, each class conforms to a multivariate Gaussian distribution at the feature level. Such an assumption is reasonable and has been verified in many works, such as [14, 21]. Then, two methods based on the mean and covariance of the multivariate Gaussian distributions are proposed. Specifically, the mean-based method works in a recursive dimension-reduction manner, which is theoretically guaranteed to train a high-quality model against instance-dependent label noise. It first assigns smaller weights to outliers and then divides the whole feature space into a clean space and a corrupted space, where the contamination has larger effects on the corrupted space. We recurse the computation on the corrupted space for robust mean estimation. The covariance-based
^1 Covariate shift is a subclass of distribution shift, in which the distribution of instances shifts between training and test environments. Although the instance distribution may change, the labels remain the same.
Figure 2: The noise level curve of PMD noise. The red curve
represents the upper bound. The noise level is bounded by 𝜌
in the restricted region while unbounded in the unrestricted
region. The blue curve shows the actual noise level.
method works in a distribution-disturbance manner, which is experimentally verified. It introduces a disturbance into the empirical covariance of the given data. In this way, we can increase the diversity of the training data to mitigate the distribution shift and improve generalization. After obtaining the multivariate Gaussian distributions for all classes, we sample examples from them for training, which calibrates the shifted distributions.
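The calibration pipeline described above can be sketched roughly as follows. This is an illustrative sketch under our own simplifying assumptions, not the paper's exact algorithm: `robust_mean` stands in for the recursive dimension-reduction estimator with a generic iterative trimming along the top principal direction, and `alpha` is a hypothetical knob for the covariance disturbance.

```python
import numpy as np

def robust_mean(feats, n_iter=5, trim=0.1):
    """Approximate a noise-robust class mean by iteratively discarding
    outliers along the top principal direction of the features.
    (Illustrative stand-in for the paper's recursive dimension-reduction
    estimator, not the exact algorithm.)"""
    feats = np.asarray(feats, dtype=float)
    keep = np.ones(len(feats), dtype=bool)
    for _ in range(n_iter):
        mu = feats[keep].mean(axis=0)
        centered = feats[keep] - mu
        # Top right-singular vector: the direction along which
        # contamination has the largest effect on the estimate.
        _, _, vt = np.linalg.svd(centered, full_matrices=False)
        proj = np.abs((feats - mu) @ vt[0])
        # Drop the most extreme `trim` fraction along that direction.
        cutoff = np.quantile(proj[keep], 1.0 - trim)
        keep = keep & (proj <= cutoff)
    return feats[keep].mean(axis=0)

def calibrated_samples(feats, n_samples, alpha=0.1, rng=None):
    """Estimate a robust Gaussian for one class, disturb its empirical
    covariance to diversify, and sample synthetic features for training."""
    rng = np.random.default_rng(rng)
    mu = robust_mean(feats)
    cov = np.cov(np.asarray(feats, dtype=float), rowvar=False)
    # Covariance disturbance: inflate the diagonal so that sampled
    # features are more diverse (alpha is a hypothetical strength).
    cov = cov + alpha * np.trace(cov) / cov.shape[0] * np.eye(cov.shape[0])
    return rng.multivariate_normal(mu, cov, size=n_samples)
```

On a cluster of clean features polluted by a small group of far-away outliers, the trimmed mean lands much closer to the clean center than the naive mean, and the sampled features then come from the calibrated, rather than the shifted, distribution.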
We conduct extensive experiments across various settings on CIFAR-10, CIFAR-100, WebVision, and Clothing1M. The results consistently exhibit substantial performance improvements over state-of-the-art methods, which supports our claims well.
Organization. The rest of this paper is organized as follows. In Section 2, we introduce some background knowledge. In Section 3, we present our methods step by step, with theoretical justifications. In Section 4, empirical evaluations are provided. In Section 5, we summarize this paper.
2 PRELIMINARIES
Notations. Let 1[A] be the indicator of the event A. Let [z] = {0, ..., z − 1}. Besides, |B| denotes the total number of elements in the set B.
Problem setup. We consider a k-class classification problem (k ≥ 2). Let X and Y = [k] be the instance and class-label spaces respectively. We assume the dataset {(x_i, y_i)}_{i=1}^{n} is sampled from the underlying joint distribution D over X × Y, where n is the sample size. Before observation, some clean labels are flipped due to instance-dependent label noise. As a result, we are provided with a noisy training dataset {(x_i, ỹ_i)}_{i=1}^{n} obtained from a noisy joint distribution D̃ over X × Y. For each instance x_i, its label ỹ_i may be incorrect. Our goal is to learn a robust classifier, exploiting only the noisy dataset, that can assign clean labels to test data precisely.
Label noise model. In this paper, we consider polynomial-margin diminishing noise (PMD noise) [73]. PMD noise is a kind of instance-dependent label noise, which is realistic but rather challenging. Although the noise setting and analyses naturally generalize to the multi-class case (k > 2), for simplicity, we first focus on the binary case (k = 2) to improve legibility. Specifically, let η(x) = P(y = 1 | x) be the clean class posterior. Let ρ_{0,1}(x) = P(ỹ = 1 | y = 0, x) and ρ_{1,0}(x) = P(ỹ = 0 | y = 1, x)