An Effective Approach for Multi-label
Classification with Missing Labels
1st Xin Zhang
University of South Carolina
Columbia, United States
xz8@email.sc.edu
2nd Rabab Abdelfattah
University of South Carolina
Columbia, United States
rabab@email.sc.edu
3rd Yuqi Song
University of South Carolina
Columbia, United States
yuqis@email.sc.edu
4th Xiaofeng Wang
University of South Carolina
Columbia, United States
wangxi@cec.sc.edu
Abstract—Compared with multi-class classification, multi-label
classification, in which an instance can be associated with more
than one class, better suits real-life scenarios. Obtaining fully labeled high-
quality datasets for multi-label classification problems, how-
ever, is extremely expensive, and sometimes even infeasible,
with respect to annotation efforts, especially when the label
spaces are too large. This motivates the research on partial-
label classification, where only a limited number of labels are
annotated and the others are missing. To address this problem,
we first propose a pseudo-label based approach to reduce the
cost of annotation without bringing additional complexity to the
existing classification networks. Then we quantitatively study
the impact of missing labels on the classifier's performance.
Furthermore, by designing a novel loss function, we are able to
relax the requirement that each instance must contain at least
one positive label, which is commonly used in most existing
approaches. Through comprehensive experiments on three large-
scale multi-label image datasets, i.e. MS-COCO, NUS-WIDE,
and Pascal VOC12, we show that our method can handle the
imbalance between positive labels and negative labels, while
still outperforming existing missing-label learning approaches
in most cases, and in some cases even approaches trained with
fully labeled datasets.
Index Terms—Deep learning, multi-label classification, miss-
ing label, pseudo label, label imbalance
I. INTRODUCTION
In deep learning, multi-class classification is a common
problem where the goal is to classify a set of instances, each
associated with a unique class label from a set of disjoint
class labels. A generalized version of multi-class problem
is multi-label classification [1], which allows the instances
to be associated with more than one class. It is a more
practical problem in real life because of the intrinsic multi-
label property of the physical world [2]: automatic driving
always needs to identify which objects are contained in the
current scene, such as cars, traffic lights, and pedestrians; a CT
scan can reveal a variety of possible lesions; and a movie can
simultaneously belong to different categories, for instance.
Ideally, multi-label classification is a form of supervised
learning [3], which requires lots of accurate labels. In practice,
however, annotating all labels for each training instance raises
a great challenge in multi-label classification, which is time-
consuming and even impractical especially in the presence
of a large number of categories [4], [5]. Therefore, how to
balance the performance of a multi-label classifier against the cost
TABLE I
DIFFERENT MISSING-LABEL SETTINGS. ✓, ×, AND ∅ INDICATE THAT THE
CURRENT INSTANCE BELONGS TO THE CLASS, DOES NOT BELONG TO THE
CLASS, AND LACKS THE RELATED LABEL, RESPECTIVELY.

Settings  Class 1  Class 2  Class 3  Class 4  Class 5
FOL       ✓        ×        ✓        ×        ✓
POL       ✓        ∅        ∅        ×        ✓
PPL       ✓        ✓        ∅        ∅        ∅
SPL       ✓        ∅        ∅        ∅        ∅
of collecting labels has received significant interest in recent
years.
The main strategies can be roughly divided into two
categories: (1) generating annotations automatically and (2)
training with missing labels. The former uses the web as
the supervisor to generate annotations [6]–[8], since there
is a large amount of imagery data with labeled information
available on the web, such as social media hashtags and
connections between web-pages and user feedback. However,
these methods may introduce additional noises to the label
space, which can degrade a classifier’s performance. For the
latter, missing labels means that only a subset of all the
labels can be observed and the rest remains unknown. It
can be further divided into several representative settings:
fully observed labels (FOL), partially observed labels (POL),
which is the most common setting. Two variations of POL
include: partially observed positive labels (PPL) and single
positive label (SPL). Table I shows the difference between
these settings. It should be pointed out that POL setting
is more common than PPL in real life. For example, in
many execution records of industrial devices [9]–[11], the
probability of each component’s failure is extremely low.
Therefore, it is almost impossible to guarantee that each
instance corresponds to one positive label, let alone in the
setting of missing labels.
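The settings in Table I can be made concrete with a small numeric encoding. The convention below (1 for a positive label, 0 for a negative label, NaN for a missing one) is an illustrative assumption, not notation from this paper:

```python
import numpy as np

# One instance with 5 classes under each setting from Table I.
# 1 = positive, 0 = negative, np.nan = missing (unobserved).
FOL = np.array([1, 0, 1, 0, 1], dtype=float)         # fully observed
POL = np.array([1, np.nan, np.nan, 0, 1])            # some labels missing
PPL = np.array([1, 1, np.nan, np.nan, np.nan])       # only positives observed
SPL = np.array([1, np.nan, np.nan, np.nan, np.nan])  # a single positive label

observed = ~np.isnan(POL)  # mask of observed entries under POL
print(observed.sum())      # 3 labels observed, 2 missing
```

Under POL an instance may carry any mix of positive and negative observed labels, including only negatives, whereas PPL and SPL never contain an observed negative.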
This paper focuses on multi-label classification with miss-
ing labels. Although a lot of work has been done along this
direction [5], [12], some critical issues remain to be addressed:

arXiv:2210.13651v1 [cs.CV] 24 Oct 2022

• Many state-of-the-art (SOTA) methods [2], [4] for multi-
label classification with missing labels rely on additional
structures, such as GNNs and label estimators, which
further increase the network complexity. A natural question
is whether this problem can be effectively solved without
significantly increasing the network complexity.
• It is still unclear how the missing ratio of the labels
affects classification performance, which is important for
balancing the classifier's performance against the annotation
cost.
• Due to the imbalance between positive and negative labels,
most methods dealing with missing labels require at least
one positive label per instance, i.e., PPL rather than POL,
even though POL is more common in real life.
With these observations, this paper investigates new ap-
proaches for multi-label classification with missing labels.
The main contributions are summarized as follows:
• We propose a pseudo-label-based approach to predict
all possible categories under missing labels, which effectively
balances classifier performance against annotation cost.
The network structure in our approach is the same as that
of a classifier trained with full labels, so our approach
does not increase the network complexity. The major
difference lies in the novel design of the loss functions
and training schemes.
• We provide a systematic, quantitative analysis of the
impact of the labels' missing ratio on the classifier's
performance. In particular, we relax the strict requirement,
common in related work [4], [5], that the label space of
each instance must contain at least one positive label.
Our method therefore applies to general POL settings, not
only PPL.
• Comprehensive experiments verify that our approach is
effective for missing-label classification. Specifically, it
outperforms most existing missing-label learning approaches
and, in some cases, even approaches trained with fully
labeled datasets. More importantly, our approach can handle
POL settings, which most existing methods cannot.
The rest of the paper is organized as follows. Section II
discusses the related work. The problem is formulated in Sec-
tion III and our proposed method is presented in Section IV.
Section V shows the experimental results. Finally, conclusions
are drawn in Section VI.
II. RELATED WORK
A. Multi-label Learning with Missing Labels
Recently, numerous methods have been proposed for multi-
label classification with missing labels. Herein, we briefly
review the relevant studies.
Binary Relevance (BR). A straightforward approach for
multi-label learning with missing labels is BR [1], [13], which
decomposes the task into a number of binary classification
problems, each for one label. Such an approach encounters
many difficulties, mainly due to ignoring correlations be-
tween labels. To address this issue, many correlation-enabling
extensions to binary relevance have been proposed [12],
[14]–[17]. However, most of these methods require solving
an optimization problem while keeping the entire training set
in memory, so it is extremely hard, if not impossible, to
apply a mini-batch strategy to fine-tune the model [2], which
limits the use of pre-trained neural networks (NNs) [18].
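As a concrete illustration of the BR decomposition described above, the sketch below (the helper name `fit_binary_relevance` and the toy data are our own, using scikit-learn) trains one independent binary classifier per label and simply drops instances whose label for that class is missing:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def fit_binary_relevance(X, Y):
    """Fit one binary classifier per label column.

    Y holds 1 (positive), 0 (negative), or np.nan (missing); rows
    with a missing label for a given class are dropped for that
    class only. Each classifier is fit in isolation, so label
    correlations are ignored -- the core weakness noted above.
    """
    classifiers = []
    for j in range(Y.shape[1]):
        mask = ~np.isnan(Y[:, j])  # keep only observed labels
        clf = LogisticRegression().fit(X[mask], Y[mask, j])
        classifiers.append(clf)
    return classifiers

# Toy data: 6 instances, 2 features, 3 labels with missing entries.
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 2))
Y = np.array([[1, 0, 1],
              [0, 1, 0],
              [1, 1, np.nan],
              [0, 0, 1],
              [1, np.nan, 0],
              [0, 1, 1]], dtype=float)
models = fit_binary_relevance(X, Y)
print(len(models))  # one classifier per label, i.e. 3
```

Each per-label fit here loads only its own slice, but the correlation-enabling extensions cited above instead couple all labels in one optimization over the full training set, which is what blocks mini-batch fine-tuning.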
Positive and Unlabeled Learning (PU-learning). PU-learning
is an alternative solution [19], which studies the problem with
a small number of positive examples and a large number of
unlabeled examples for training. Most methods can be divided
into the following three categories: two-step techniques [20]–
[22], biased learning [23], [24], and class prior incorpora-
tion [25], [26]. All these methods require that the training
data consist of positive and unlabeled examples [27]. In
other words, they treat negative labels as unlabeled, which
discards the observed negatives and fails to make full use of
the existing labels.
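The biased-learning variant mentioned above can be sketched as follows: every unlabeled example is treated as a negative but down-weighted in the loss, so the few known positives dominate. The cluster locations and the 0.2 weight are illustrative choices, not values from the cited works:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Biased PU learning sketch: unlabeled examples are labeled as
# negatives but carry a smaller sample weight in the loss.
rng = np.random.default_rng(1)
X_pos = rng.normal(loc=2.0, size=(20, 2))    # labeled positives
X_unl = rng.normal(loc=0.0, size=(200, 2))   # unlabeled, mostly negative

X = np.vstack([X_pos, X_unl])
y = np.concatenate([np.ones(20), np.zeros(200)])
w = np.concatenate([np.full(20, 1.0), np.full(200, 0.2)])  # bias the loss

clf = LogisticRegression().fit(X, y, sample_weight=w)
print(clf.predict([[2.0, 2.0]])[0])  # point at the positive cluster center
```

Note how this sketch exhibits exactly the weakness the paragraph above points out: any genuinely observed negatives would have to be merged into the "unlabeled" pile, losing their label information.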
Pseudo Label. Pseudo-labeling was first proposed in [28]. The
goal of pseudo-labeling is to generate pseudo-labels for unla-
beled samples [29]. There are different methods to generate
pseudo labels: the work in [28], [30] uses the predictions of
a trained NN to assign pseudo labels. Neighborhood graphs
are used in [31]. The approach in [32] updates pseudo labels
through an optimization framework. It is worth mentioning
that MixMatch-family semi-supervised learning methods
[33]–[36] achieve SOTA results on multi-class problems by
utilizing pseudo labels and consistency regularization [37].
However, these methods do not support the creation of negative
pseudo-labels (i.e., labels that specify the absence of specific
classes), which hurts classifier performance by neglecting
negative labels [30]. Instead, the work in [30] obtains reference
values for pseudo labels directly from the network predictions
and then generates hard pseudo labels by setting confidence
thresholds for positive and negative labels, respectively.
Different from [30], we simplify this process by studying the
proportion of positive and negative labels used to generate
pseudo labels.
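The dual-threshold scheme attributed to [30] can be sketched roughly as follows; the function name and the 0.9/0.1 thresholds are illustrative assumptions, not values from that work:

```python
import numpy as np

def hard_pseudo_labels(scores, pos_thresh=0.9, neg_thresh=0.1):
    """Turn sigmoid scores into hard pseudo-labels.

    Scores at or above pos_thresh become positive (1), at or below
    neg_thresh become negative (0); anything in between stays
    missing (nan), so uncertain predictions add no supervision.
    """
    labels = np.full_like(scores, np.nan, dtype=float)
    labels[scores >= pos_thresh] = 1.0
    labels[scores <= neg_thresh] = 0.0
    return labels

scores = np.array([0.95, 0.40, 0.05, 0.88])
print(hard_pseudo_labels(scores))  # confident entries labeled, rest stay nan
```

Unlike the MixMatch-style methods above, such a scheme produces negative pseudo-labels too, since low-confidence-of-presence scores are committed to 0 rather than discarded.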
B. Imbalance
A key characteristic of multi-label classification is the
inherent positive-negative imbalance created when the overall
number of labels is large [38]. Missing labels exacerbate this
imbalance and hamper the recognition of positives [5]. Therefore,
the work in [4], [5] mandates that each instance in the training
set have at least one positive label, meaning they focus on the
PPL setting instead of "real" POL. Obviously, this assumption
may not always hold in real-life scenarios. To relax it, a
trivial solution is to treat instances with only negative labels
as unlabeled; in this case, however, the value of the negative
labels is wasted.
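One simple way to retain the value of negative-only instances is to average a binary cross-entropy loss over observed entries only, so missing labels contribute nothing while observed negatives still do. This is a generic sketch under that idea, not the loss function proposed in this paper:

```python
import numpy as np

def masked_bce(scores, targets, eps=1e-7):
    """Binary cross-entropy averaged over observed labels only.

    targets holds 1 / 0 / np.nan per class; nan entries contribute
    nothing, so an instance carrying only negative labels (the POL
    case) still provides a training signal.
    """
    mask = ~np.isnan(targets)
    s = np.clip(scores[mask], eps, 1 - eps)
    t = targets[mask]
    return float(-np.mean(t * np.log(s) + (1 - t) * np.log(1 - s)))

# An instance with only negative observed labels, allowed under POL:
scores = np.array([0.2, 0.7, 0.5])
targets = np.array([0.0, np.nan, 0.0])
print(round(masked_bce(scores, targets), 4))  # 0.4581
```

Treating such an instance as unlabeled would discard both observed zeros; masking instead keeps them in the loss while ignoring only the genuinely missing entry.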
In this work, we allow training instances that carry only
negative labels (that is, the POL setting). From this