Modeling Inter-Class and Intra-Class Constraints in Novel Class Discovery
Wenbin Li1, Zhichen Fan1, Jing Huo1*, Yang Gao1
1State Key Laboratory for Novel Software Technology, Nanjing University, China
*Corresponding author
Abstract
Novel class discovery (NCD) aims at learning a model
that transfers the common knowledge from a class-disjoint
labelled dataset to another unlabelled dataset and discov-
ers new classes (clusters) within it. Many methods, as well
as elaborate training pipelines and appropriate objectives,
have been proposed and considerably boosted performance
on NCD tasks. Despite all this, we find that the existing
methods do not sufficiently take advantage of the essence
of the NCD setting. To this end, in this paper, we pro-
pose to model both inter-class and intra-class constraints
in NCD based on the symmetric Kullback-Leibler diver-
gence (sKLD). Specifically, we propose an inter-class sKLD
constraint to effectively exploit the disjoint relationship be-
tween labelled and unlabelled classes, enforcing the sep-
arability for different classes in the embedding space. In
addition, we present an intra-class sKLD constraint to ex-
plicitly constrain the intra-relationship between a sample
and its augmentations and ensure the stability of the train-
ing process at the same time. We conduct extensive exper-
iments on the popular CIFAR10, CIFAR100 and ImageNet
benchmarks and demonstrate that our method establishes a new state of the art, achieving significant performance improvements over previous state-of-the-art methods, e.g., 3.5%/3.7% clustering accuracy improvements on the CIFAR100-50 dataset split under the task-aware/task-agnostic evaluation protocols. Code is available at
https://github.com/FanZhichen/NCD-IIC.
1. Introduction
Deep learning has made great progress and achieved re-
markable results in many computer vision fields, especially
in image classification [13,16,22,25,27]. Unfortunately,
these successes of deep learning heavily rely on a large
amount of fully labelled data for training. On the other
hand, in many realistic scenarios, it is difficult to collect or
to annotate such a large-scale dataset. To address this prob-
lem, a new paradigm of novel class discovery (NCD) has been proposed and has attracted increasing attention in recent
years [8,10,11,33,35,37].
The goal of NCD is to train a classification model on a
labelled dataset and simultaneously transfer the latent com-
mon knowledge to discover new classes (or clusters) in an-
other unlabelled dataset. Different from semi-supervised
learning [1,2,26,36] that assumes the labelled and unla-
belled datasets share the same label space, in the setting
of NCD, the classes of the unlabelled dataset are disjoint from those of the labelled dataset, which is more challeng-
ing. In addition, NCD is also different from the generic
clustering [3,4,18,31] in that an additional labelled dataset
is available in NCD. In general, for the standard cluster-
ing methods, the clustering results are not unique. That is
to say, there may be multiple different and approximately
correct results for a certain unlabelled dataset. In contrast,
thanks to the available labelled dataset, NCD can eliminate
the semantic ambiguity with the label guidance and finally make the clustering consistent with the real visual semantics [11]. Clearly, NCD is more realistic and more practical than unsupervised clustering.
In general, the existing NCD methods can be roughly
divided into two categories, i.e., two-stage based methods
and single-stage based methods [20]. Most of the early
methods are two-stage by using labelled and unlabelled
data in different stages, such as KCL [14], MCL [15] and
DTC [11]. Typically, the two-stage based methods first
learn an embedding network on the labelled set through su-
pervised learning and then use it on the unlabelled set to
discover new clusters with little modification. In contrast,
the latest methods are almost all single-stage, such as RS [10],
NCL [37], UNO [8], DualRank [35] and ComEx [33],
which use both labelled and unlabelled data in a single
stage at the same time. The single-stage based methods can
learn the feature representation and discover novel classes
simultaneously, iteratively updating the learned feature em-
bedding network and clustering results during the train-
ing process. Compared with the two-stage based meth-
ods, the single-stage methods can make more effective use
of the similarity between labelled and unlabelled classes
to achieve a better knowledge transfer between these two
datasets. Therefore, in this paper, we will mainly focus on
the single-stage direction of NCD.
However, we find that the current single-stage based
NCD methods do not sufficiently take advantage of the
essence of the NCD setting, that is to say, overlooking the
disjoint characteristic between the labelled and unlabelled
classes. In this sense, on one hand, the labelled and unla-
belled samples cannot be effectively separated, weakening
the discriminability of the learned features. On the other
hand, because the labelled data is learned under supervision
while the unlabelled data has no supervision, i.e., an im-
balanced learning process (learning with different supervi-
sion strengths), the learned feature representations will be biased toward the labelled data. In addition, we notice
that although some methods [10,35] have used data aug-
mentation to generate additional samples and gained sig-
nificant performance improvements, they generally employ
the mean squared error (MSE) as the consistency regular-
ization, which cannot enforce the consistency well or generalize effectively.
To address the above two issues, we propose to model
both Inter-class and Intra-class Constraints (IIC for short)
built on the symmetric Kullback-Leibler divergence (sKLD)
for discovering the novel classes. To be specific, an inter-
class sKLD constraint is proposed to explicitly learn to sep-
arate different classes between labelled and unlabelled data,
enhancing the discriminability of learned feature represen-
tations. Moreover, an intra-class sKLD constraint is pre-
sented to fully learn the intra-relationship between sam-
ples and their augmentations. According to our experiments, such an intra-class sKLD constraint can also stabilize the training process. We have con-
ducted extensive experiments on three benchmarks, includ-
ing CIFAR10 [21], CIFAR100 [21] and ImageNet [7], and show that the proposed two constraints enable our method to consistently outperform the existing novel class discovery methods by a large margin.
To summarize, our contributions are as follows:
• We propose a new inter-class Kullback-Leibler divergence constraint to sufficiently model the relationship between the labelled and unlabelled datasets to learn more discriminative feature representations, which is somewhat overlooked in the literature.
• We propose a new intra-class Kullback-Leibler divergence constraint to effectively exploit the relationship between a sample and its different transformations to learn invariant feature representations.
• We evaluate the proposed constraints on three benchmark datasets for novel class discovery and obtain significant performance improvements over the state-of-the-art methods, which successfully demonstrates the effectiveness of the proposed method.
2. Related Work
Novel class discovery (NCD) is a new task that has attracted wide attention in recent years, which aims at discovering new
classes in an unlabelled dataset given a class-disjoint la-
belled dataset as supervision. A variety of advanced NCD
methods have been proposed and have tangibly improved
the clustering performance on multiple benchmark datasets.
The early methods of NCD include KCL [14], MCL [15]
and DTC [11]. In general, these methods first learn an em-
bedding network of feature representations on the labelled
data, and then use it directly for the unlabelled data. Specif-
ically, KCL and MCL propose a framework for both cross-
domain and cross-task transfer learning that leverages the
pairwise similarity to represent categorical information, and
learn the clustering network based on the pairwise simi-
larity prediction through different objective functions, re-
spectively. DTC extends the deep embedding clustering
method [31] into a transfer learning setting and proposes
a two-stage method. Importantly, Han et al. [11] formalize
the task of novel class discovery for the first time.
Since then, the current NCD methods [8, 10, 19, 20, 33, 35, 37, 38] are almost all single-stage and can take greater ad-
vantage of both labelled and unlabelled data. RS [10] in-
troduces a three-step learning pipeline, which first trains
the representation network with all labelled and unlabelled
samples using self-supervised learning, and then uses rank-
ing statistics to obtain pairwise similarity between unla-
belled samples, and finally uses the pairwise similarity to
discover novel classes. DualRank [35] expands RS to a
two-branch framework from both global and local levels.
Similarly, DualRank uses dual ranking statistics and mutual
knowledge distillation to generate pseudo labels and ensure
the consistency between two branches. In order to gener-
ate pairwise pseudo labels, Joint [19] employs a Winner-
Take-All (WTA) hashing algorithm [32] on the shared fea-
ture space for NCD.
NCL [37] and OpenMix [38] are largely motivated by
contrastive learning [5,12] and Mixup [34], respectively.
NCL introduces the contrastive loss to learn more discrim-
inative representations. On the other hand, OpenMix uses
Mixup to mix labelled and unlabelled samples, building a
learnable relationship between the two parts of data. In-
stead of using multiple objectives, UNO [8] introduces a
unified objective function to transfer knowledge from the labelled set to the unlabelled set. More recently, Joseph et al. [20]
categorize the existing NCD methods into two classes (i.e.,
two- and single-stage based methods), according to whether
the labelled and unlabelled samples are available at the
same time or not. They also propose a spacing loss to en-
force separability between labelled and unlabelled points in
the embedding space. ComEx [33] focuses on the gener-
alized NCD (GNCD), aka generalized category discovery
(GCD) [29], and proposes two groups of compositional ex-
perts to solve this problem.
To the best of our knowledge, the existing approaches do not make full use of the disjoint characteristic between labelled and unlabelled classes. In addition, we find that some methods utilize the mean squared error (MSE) to constrain the learned representations of augmented data, but it can-
not achieve the desired effect. Instead, we propose inter-
class and intra-class constraints based on the symmetric
Kullback-Leibler divergence (sKLD) for NCD.
3. Method
3.1. Problem Formulation
In the NCD setting, given a labelled dataset $D^l = \{(x_1^l, y_1^l), \ldots, (x_N^l, y_N^l)\}$, the goal is to automatically discover $C^u$ clusters (or classes) in an unlabelled dataset $D^u = \{x_1^u, \ldots, x_M^u\}$, where each $x_i^l$ in $D^l$ or $x_i^u$ in $D^u$ is an image and $y_i^l \in \mathcal{Y} = \{1, \ldots, C^l\}$ is the corresponding class label of $x_i^l$. In particular, we assume that the set of $C^l$ labelled classes is disjoint from the set of $C^u$ unlabelled classes. In this sense, the core of NCD is how to effectively learn transferable semantic knowledge from the disjoint labelled dataset $D^l$ to help perform clustering on the unlabelled dataset $D^u$. Following the literature [8, 10, 37], we also assume the number of unlabelled classes $C^u$ is known a priori in this paper.
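As a concrete illustration of this setting, the sketch below builds a class-disjoint labelled/unlabelled split in the style of the CIFAR100-50 benchmark. The particular partition (first 50 classes labelled, remaining 50 unlabelled) and all variable names are illustrative assumptions, not the authors' exact preprocessing.

```python
# Minimal sketch of an NCD data split (assumed CIFAR100-50 style partition).
import numpy as np
from torchvision.datasets import CIFAR100

NUM_LABELLED_CLASSES = 50    # C^l
NUM_UNLABELLED_CLASSES = 50  # C^u, assumed known a priori

dataset = CIFAR100(root="./data", train=True, download=True)
targets = np.array(dataset.targets)

# D^l: images whose labels fall in the labelled class set, kept with labels.
labelled_idx = np.where(targets < NUM_LABELLED_CLASSES)[0]
# D^u: images from the remaining (disjoint) classes; labels are discarded at
# training time and only used to evaluate clustering accuracy.
unlabelled_idx = np.where(targets >= NUM_LABELLED_CLASSES)[0]

# The labelled and unlabelled class sets must be disjoint by construction.
assert set(targets[labelled_idx]).isdisjoint(set(targets[unlabelled_idx]))
```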
To tackle this challenging problem, we propose two sym-
metric Kullback-Leibler divergence (sKLD) based con-
straints from both inter-class and intra-class perspectives to
learn more discriminative feature representations for NCD
models (see Fig. 1). In the following sections, we first intro-
duce the inter-class sKLD constraint and intra-class sKLD
constraint in Sec. 3.2, and then summarize the overall ob-
jective for training in Sec. 3.3.
3.2. Symmetric Kullback-Leibler Divergence for
Novel Class Discovery
According to the above analyses, to effectively utilize
the two parts of data in NCD, i.e., a labelled set and an
unlabelled set, we develop inter-class and intra-class
symmetric Kullback-Leibler divergence (sKLD) constraints
to better accomplish the NCD task, especially in the single-
stage paradigm.
Following UNO [8], as shown in Fig. 1, the architecture of our model consists of two parts: an encoder $E$ and two classification heads $h$ and $g$. The encoder $E$ is implemented as a standard convolutional neural network (CNN), which converts an input image into a feature vector. The head $h$, belonging to the labelled data, is implemented as a linear classifier with $C^l$ output neurons, and the head $g$, belonging to the unlabelled data, is composed of a multi-layer perceptron (MLP) and a linear classifier with $C^u$ output neurons. In the training phase, each sample $x_i$ will
Figure 1. Architecture of the proposed method. We present the "raw samples" (labelled sample $x_i^l$, unlabelled sample $x_j^u$, corresponding logits and probability distributions) in blue and the "augmented counterparts" (labelled counterpart $\hat{x}_i^l$, unlabelled counterpart $\hat{x}_j^u$, corresponding logits and probability distributions) in green. The samples and their augmentations are input into the shared encoder $E$, and then fed into two classification heads $h$ and $g$, obtaining predictions to calculate both the inter-class sKLD $\mathcal{L}_{inter}$ and the intra-class sKLD $\mathcal{L}_{intra}$. For brevity, we omit the calculation process of the standard cross-entropy loss $\mathcal{L}_{CE}$.
be first encoded as a feature vector by $E$, and then will be passed through both classification heads to obtain the corresponding logits $l_{ih} \in \mathbb{R}^{C^l}$ and $l_{ig} \in \mathbb{R}^{C^u}$, respectively. After that, the two logits are concatenated together as $l_i = [l_{ih}, l_{ig}] \in \mathbb{R}^{C^l + C^u}$ and fed into a softmax layer $\sigma$ with a temperature $\tau$, obtaining the probability distribution $p_i = \sigma(l_i)$.
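The following minimal PyTorch sketch illustrates this forward pass (shared encoder, two heads, logit concatenation, temperature-scaled softmax). The module names, hidden sizes and default temperature are placeholder assumptions for illustration, not the authors' exact implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class NCDModel(nn.Module):
    """Sketch of the shared encoder E with labelled head h and unlabelled head g."""
    def __init__(self, encoder: nn.Module, feat_dim: int,
                 num_labelled: int, num_unlabelled: int, tau: float = 0.1):
        super().__init__()
        self.encoder = encoder                            # shared encoder E
        self.head_l = nn.Linear(feat_dim, num_labelled)   # head h: C^l output neurons
        self.head_u = nn.Sequential(                      # head g: MLP + linear, C^u outputs
            nn.Linear(feat_dim, feat_dim), nn.ReLU(),
            nn.Linear(feat_dim, num_unlabelled),
        )
        self.tau = tau                                    # softmax temperature (assumed value)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        z = self.encoder(x)                               # feature vector
        logits_l = self.head_l(z)                         # l_ih in R^{C^l}
        logits_u = self.head_u(z)                         # l_ig in R^{C^u}
        logits = torch.cat([logits_l, logits_u], dim=-1)  # l_i in R^{C^l + C^u}
        return F.softmax(logits / self.tau, dim=-1)       # probability distribution p_i
```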
Inter-class Symmetric KLD Constraint. When solv-
ing the NCD problem, in the pipeline of the single-stage
based methods, both labelled and unlabelled images will
be accessed in each mini-batch during training. Although
the distributions of these two parts of images are similar, their representations should in fact be separated from each other as much as possible, i.e., ensuring the separability be-
tween the labelled and unlabelled classes. However, this
point is somewhat overlooked in the existing single-stage
based methods. Therefore, to address this issue, we pro-
pose an inter-class sKLD constraint to explicitly enlarge
the distance between each labelled sample and each unla-
belled sample in the current mini-batch using a symmetric
Kullback-Leibler divergence distance. The formulation be-
tween a pair of samples is as follows:
$$\mathcal{L}_{\mathrm{sKLD}} = \frac{1}{2}\left[ D_{\mathrm{KL}}(p_i^l \,\|\, p_j^u) + D_{\mathrm{KL}}(p_j^u \,\|\, p_i^l) \right], \quad (1)$$
where $p_i^l$ and $p_j^u$ are the probability distributions generated for the labelled image $x_i^l|_{i=1}^{N}$ and the unlabelled image $x_j^u|_{j=1}^{M}$ in a mini-batch, respectively, and $D_{\mathrm{KL}}$ is the Kullback-Leibler divergence.
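A minimal PyTorch sketch of computing this constraint over a mini-batch is given below. The averaging over all labelled/unlabelled pairs and the negative sign (so that minimising the loss enlarges the divergence between the two parts of data) are assumptions about how Eq. (1) is incorporated into the overall training objective of Sec. 3.3, not the authors' exact formulation.

```python
import torch

def symmetric_kl(p: torch.Tensor, q: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """0.5 * (KL(p || q) + KL(q || p)) over the last dimension, as in Eq. (1)."""
    p, q = p.clamp_min(eps), q.clamp_min(eps)             # numerical stability
    kl_pq = (p * (p.log() - q.log())).sum(dim=-1)
    kl_qp = (q * (q.log() - p.log())).sum(dim=-1)
    return 0.5 * (kl_pq + kl_qp)

def inter_class_skld(p_l: torch.Tensor, p_u: torch.Tensor) -> torch.Tensor:
    """p_l: (N, C^l + C^u) labelled probabilities; p_u: (M, C^l + C^u) unlabelled ones.
    Returns the negative mean pairwise sKLD, so minimising this term pushes the
    predictions of labelled and unlabelled samples apart (assumed aggregation)."""
    pairwise = symmetric_kl(p_l.unsqueeze(1), p_u.unsqueeze(0))  # (N, M) pairwise sKLD
    return -pairwise.mean()
```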