on additional structures, such as GNNs and label estimators, which further increase network complexity.
A natural question is whether this problem can be
effectively solved without significantly increasing the
network complexity.
•It is still unclear how the missing ratio of labels
affects classification performance, which is of great
importance for balancing classifier performance against
annotation cost.
•Due to the imbalance between positive and negative labels,
most methods dealing with missing labels require at least
one positive label per instance (i.e., PPL) rather than
POL, which is more common in real life.
With these observations, this paper investigates new ap-
proaches for multi-label classification with missing labels.
The main contributions are summarized as follows:
•We propose a pseudo-label-based approach to predict
all possible categories under missing labels, which
effectively balances classifier performance against annotation
cost. The network structure in our approach is identical to
that of a classifier trained with full labels, so our
approach does not increase network complexity. The major
difference lies in the novel design of loss functions and
training schemes.
•We provide a systematic and quantitative analysis of the
impact of the labels’ missing ratio on classifier performance.
In particular, we relax the strict requirement, common in
related work [4], [5], that the label space of each instance
must contain at least one positive label. Our method is
therefore applicable to general POL settings, not only PPL.
•Comprehensive experiments verify that our approach
can be effectively applied to missing-label classification.
Specifically, our approach outperforms most existing
missing-label learning approaches and, in some cases,
even approaches trained on fully labeled datasets. More
importantly, our approach supports the POL setting, which
most existing methods cannot handle.
The rest of the paper is organized as follows. Section II
discusses the related work. The problem is formulated in Sec-
tion III and our proposed method is presented in Section IV.
Section V shows the experimental results. Finally, conclusions
are drawn in Section VI.
II. RELATED WORK
A. Multi-label Learning with Missing Labels
Recently, numerous methods have been proposed for multi-
label classification with missing labels. Herein, we briefly
review the relevant studies.
Binary Relevance (BR). A straightforward approach for
multi-label learning with missing labels is BR [1], [13], which
decomposes the task into a number of binary classification
problems, each for one label. Such an approach encounters
many difficulties, mainly due to ignoring correlations be-
tween labels. To address this issue, many correlation-enabling
extensions to binary relevance have been proposed [12],
[14]–[17]. However, most of these methods require solving
an optimization problem while keeping the entire training set
in memory. It is therefore extremely hard, if not impossible,
to apply a mini-batch strategy to fine-tune the model [2],
which limits the use of pre-trained neural networks (NNs) [18].
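To make the per-label decomposition concrete, a minimal BR baseline can be sketched as below. This is an illustrative toy implementation (plain logistic regression per label on synthetic data), not the formulation of any cited method:

```python
import numpy as np

def train_binary(X, y, lr=0.1, epochs=200):
    """Fit one logistic-regression classifier for a single label."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # sigmoid predictions
        g = p - y                               # gradient of BCE w.r.t. logits
        w -= lr * X.T @ g / len(y)
        b -= lr * g.mean()
    return w, b

def binary_relevance_fit(X, Y):
    """Decompose the multi-label task into one binary problem per label."""
    return [train_binary(X, Y[:, k]) for k in range(Y.shape[1])]

def binary_relevance_predict(models, X):
    """Stack independent per-label decisions into a multi-label prediction."""
    P = np.column_stack(
        [1.0 / (1.0 + np.exp(-(X @ w + b))) for w, b in models])
    return (P >= 0.5).astype(int)

# Toy data: 2 features, 3 (correlated) labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
Y = np.column_stack([(X[:, 0] > 0), (X[:, 1] > 0),
                     (X[:, 0] + X[:, 1] > 0)]).astype(int)
models = binary_relevance_fit(X, Y)
acc = (binary_relevance_predict(models, X) == Y).mean()
```

Note that the third label above is fully determined by the other two, yet BR fits it in isolation, which is exactly the label-correlation information the correlation-enabling extensions try to recover.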
Positive and Unlabeled Learning (PU-learning). PU-learning
is an alternative solution [19], which studies learning from
a small number of positive examples and a large number of
unlabeled examples. Most methods fall into three categories:
two-step techniques [20]–[22], biased learning [23], [24], and
class prior incorporation [25], [26]. All these methods require
the training data to consist of positive and unlabeled
examples [27]. In other words, they treat negative labels as
unlabeled, which discards the existing negatives and does not
make full use of the available labels.
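As a toy illustration of the biased-learning category, unlabeled examples can be treated as down-weighted negatives in a binary cross-entropy loss. The function name and the weighting constant below are illustrative assumptions, not taken from [23], [24]:

```python
import numpy as np

def biased_pu_bce(p, s, neg_weight=0.3):
    """Biased PU loss: observed positives (s=1) get full weight;
    unlabeled examples (s=0) are treated as weak negatives.

    p: predicted probabilities; s: 1 if labeled positive, else 0.
    """
    eps = 1e-7
    pos = -np.log(p + eps)       # loss for labeled positives
    neg = -np.log(1 - p + eps)   # loss when treated as a negative
    return np.where(s == 1, pos, neg_weight * neg).mean()

p = np.array([0.9, 0.2, 0.6])  # model confidences
s = np.array([1, 0, 0])        # only the first example is labeled positive
loss = biased_pu_bce(p, s)
```

The down-weighting reflects that an unlabeled example is only probably negative; note the scheme, as the text observes, has no way to exploit an explicitly observed negative label.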
Pseudo Label. Pseudo-labeling was first proposed in [28]. The
goal of pseudo-labeling is to generate pseudo-labels for unla-
beled samples [29]. There are different methods to generate
pseudo labels: the work in [28], [30] uses the predictions of
a trained NN to assign pseudo labels. Neighborhood graphs
are used in [31]. The approach in [32] updates pseudo labels
through an optimization framework. It is worth mentioning
that MixMatch-family semi-supervised learning methods
[33]–[36] achieve state-of-the-art results on multi-class
problems by utilizing pseudo labels and consistency
regularization [37]. However, these methods do not support the
creation of negative pseudo labels (i.e., labels that specify
the absence of particular classes), which degrades classifier
performance by neglecting negative labels [30]. Instead, the
work in [30] obtains reference values of pseudo labels directly
from the network predictions and then generates hard pseudo
labels by setting separate confidence thresholds for positive
and negative labels. Different from [30], we simplify this
process by studying the proportion of positive and negative
labels to generate pseudo labels.
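The dual-threshold scheme described for [30] can be sketched as follows. The specific threshold values and the encoding (with -1 marking entries left unlabeled) are illustrative assumptions:

```python
import numpy as np

def hard_pseudo_labels(probs, pos_thresh=0.9, neg_thresh=0.1):
    """Turn network predictions into hard pseudo labels:
    1 above pos_thresh, 0 below neg_thresh, -1 (still missing) otherwise."""
    labels = np.full(probs.shape, -1, dtype=int)
    labels[probs >= pos_thresh] = 1   # confident positive pseudo labels
    labels[probs <= neg_thresh] = 0   # confident negative pseudo labels
    return labels

# Predicted probabilities for 2 instances over 3 classes.
probs = np.array([[0.95, 0.05, 0.5],
                  [0.70, 0.92, 0.02]])
pl = hard_pseudo_labels(probs)
# row 0 -> [1, 0, -1]; row 1 -> [-1, 1, 0]
```

Only the confident entries receive pseudo labels; the -1 entries would be masked out of the loss, which is why the choice of the two thresholds directly trades off pseudo-label coverage against noise.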
B. Imbalance
A key characteristic of multi-label classification is the
inherent positive-negative imbalance that arises when the overall
number of labels is large [38]. Missing labels exacerbate this
imbalance and further hinder the recognition of positives [5].
Therefore, the work in [4], [5] mandates that each instance in
the training set have at least one positive label, meaning that
it focuses on the PPL setting instead of “real” POL. Obviously,
this assumption may not always hold in real-life scenarios. To
relax it, a trivial solution is to treat instances with only
negative labels as unlabeled; in this case, however, the value
of the negative labels is wasted.
In this work, we allow instances in the training set to
have only negative labels (that is, the POL setting). From this