ME-D2N: Multi-Expert Domain Decompositional Network for Cross-Domain Few-Shot Learning

Yuqian Fu∗
fuyq20@fudan.edu.cn
Shanghai Key Lab of Intelligent
Information Processing, School of
Computer Science, Fudan University
Yu Xie∗
Yanwei Fu
{yxie18, yanweifu}@fudan.edu.cn
School of Data Science, Fudan
University
Jingjing Chen
Yu-Gang Jiang#
{chenjingjing, ygj}@fudan.edu.cn
Shanghai Key Lab of Intelligent
Information Processing, School of
Computer Science, Fudan University
[Figure 1: diagram contrasting the source dataset (sufficient examples), the auxiliary target dataset (few examples), and the disjoint novel target dataset (few examples). It labels the two challenges (#1: data imbalance; #2: learning from different domains) and the two solutions (#1: multi-expert learning, with a source teacher and an auxiliary teacher distilled into a student model; #2: the domain decomposition module yielding a decomposed student), alongside a 5-way 1-shot episodic testing task with support images and a query image.]
Figure 1: Illustration of our motivation and solutions. We observe two core challenges: 1) a serious data imbalance between the two training datasets; 2) the network is required to learn from different domains simultaneously. Correspondingly, we propose the following key solutions: 1) multi-expert learning, which first trains two individual teacher models and then transfers their knowledge to a student model via knowledge distillation; 2) a novel domain decomposition module that learns to decompose the network structure of the student model into two domain-related sub-parts.
ABSTRACT
Recently, Cross-Domain Few-Shot Learning (CD-FSL), which aims at addressing the Few-Shot Learning (FSL) problem across different domains, has attracted rising attention. The core challenge of CD-FSL lies in the domain gap between the source and novel target datasets. Though many attempts have been made at CD-FSL without any target data during model training, the huge domain gap still makes it hard for existing CD-FSL methods to achieve very satisfactory results. Alternatively, learning CD-FSL models with a few labeled target domain examples, which is more realistic and promising, is advocated in previous work [13]. Thus, in this paper, we stick to this setting and technically contribute a novel Multi-Expert Domain Decompositional Network (ME-D2N). Concretely, to solve the data imbalance problem between the source data with sufficient examples and the auxiliary target data with limited examples, we build our model under the umbrella of multi-expert
∗ indicates equal contributions, # indicates corresponding author.
Permission to make digital or hard copies of all or part of this work for personal or
classroom use is granted without fee provided that copies are not made or distributed
for profit or commercial advantage and that copies bear this notice and the full citation
on the first page. Copyrights for components of this work owned by others than ACM
must be honored. Abstracting with credit is permitted. To copy otherwise, or republish,
to post on servers or to redistribute to lists, requires prior specific permission and/or a
fee. Request permissions from permissions@acm.org.
MM ’22, October 10–14, 2022, Lisboa, Portugal
©2022 Association for Computing Machinery.
ACM ISBN 978-1-4503-9203-7/22/10. . . $15.00
https://doi.org/10.1145/3503161.3547995
learning. Two teacher models, which can be considered experts in their corresponding domains, are first trained on the source and the auxiliary target sets, respectively. Then, the knowledge distillation technique is introduced to transfer the knowledge from the two teachers to a unified student model. Taking a step further, to help our student model learn knowledge from different domain teachers simultaneously, we present a novel domain decomposition module that learns to decompose the student model into two domain-related sub-parts. This is achieved by a novel domain-specific gate that learns to assign each filter to only one specific domain in a learnable way. Extensive experiments demonstrate the effectiveness of our method. Codes and models are available at https://github.com/lovelyqian/ME-D2N_for_CDFSL.
CCS CONCEPTS
• Computing methodologies → Computer vision; Learning latent representations.
KEYWORDS
cross-domain few-shot learning, classification for unbalanced data, multi-expert learning, network decomposition.
ACM Reference Format:
Yuqian Fu∗, Yu Xie∗, Yanwei Fu, Jingjing Chen, and Yu-Gang Jiang#. 2022. ME-D2N: Multi-Expert Domain Decompositional Network for Cross-Domain Few-Shot Learning. In Proceedings of the 30th ACM International Conference on Multimedia (MM '22), October 10–14, 2022, Lisboa, Portugal. ACM, New York, NY, USA, 9 pages. https://doi.org/10.1145/3503161.3547995
arXiv:2210.05280v1 [cs.CV] 11 Oct 2022
1 INTRODUCTION
FSL mainly aims at transferring knowledge from a source dataset to a novel target dataset with only one or a few labeled examples. Generally, FSL assumes that the images of the source and target datasets belong to the same domain. However, such an ideal assumption may not be easy to meet in real-world multimedia applications. For example, as revealed in [9], a model trained on ImageNet [11], which is mainly composed of massive and diverse natural images, still fails to recognize novel fine-grained birds. To this end, CD-FSL, which is dedicated to addressing the domain gap problem of FSL, has attracted rising attention.
Recently, various settings of CD-FSL have been extensively studied in many previous methods [13, 14, 33, 40, 44]. Most of them [14, 40, 44] use only the source domain images for training and focus on improving the generalization ability of the FSL models. Though some achievements have been made, it is still hard to achieve very impressive performance due to the huge domain gap between the source and target datasets. Thus, some works [13, 33] relax this most basic yet strict setting and allow target data to be used during the training phase. More specifically, STARTUP [33] proposes to make use of relatively massive unlabeled target data, whilst Meta-FDMixup [13] advocates utilizing a few limited labeled target examples. Unfortunately, the massive unlabeled examples required by the former may still be hard to obtain in many real-world applications, such as the recognition of endangered wild animals or specific buildings. By contrast, learning CD-FSL with a few limited labeled target domain examples, e.g., 5 images per class, is more realistic. Thus, in this paper, we stick to the setting proposed in Meta-FDMixup [13] to promote the learning process of models.
Formally, given a source domain dataset with enough examples and an auxiliary target domain dataset with only a few labeled examples, our goal is to learn a good FSL model on these two training sets and achieve good results on the novel target data. Notably, as in Meta-FDMixup, our setting does not violate the basic FSL setting, as the class sets of the auxiliary target training data and the novel target testing data are strictly disjoint from each other. This ensures that none of the novel target categories appear during the training stage. Critically, as shown in Figure 1, we highlight two key challenges: 1) The numbers of labeled examples in the source dataset and the auxiliary target dataset are extremely unbalanced. Models learned on such unbalanced training data will be biased towards the source dataset while performing much worse on the target dataset. 2) Since the source dataset and the auxiliary target dataset belong to two distinct domains, it may be too difficult for a single model to learn knowledge from datasets of different domains simultaneously. These challenges have unfortunately been less touched upon in previous work [13].
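To make the evaluation protocol concrete, the 5-way 1-shot episodic task shown in Figure 1 can be sketched as follows. This is an illustrative sketch only: the dataset dictionaries, class names, and image identifiers here are hypothetical stand-ins, not the paper's actual data pipeline.

```python
import random

def sample_episode(dataset, n_way=5, k_shot=1, n_query=15, seed=0):
    """Sample an N-way K-shot episode: a support set with k_shot labeled
    images per class and a query set of n_query images per class."""
    rng = random.Random(seed)
    classes = rng.sample(sorted(dataset), n_way)
    support, query = [], []
    for label, cls in enumerate(classes):
        images = rng.sample(dataset[cls], k_shot + n_query)
        support += [(img, label) for img in images[:k_shot]]
        query += [(img, label) for img in images[k_shot:]]
    return support, query

# Hypothetical datasets: string ids stand in for real images. The auxiliary
# target set has only 5 labeled images per class, matching the few-shot setting.
aux_target = {f"aux_class_{c}": [f"aux_{c}_{i}" for i in range(5)] for c in range(10)}
novel_target = {f"novel_class_{c}": [f"novel_{c}_{i}" for i in range(20)] for c in range(20)}

# The CD-FSL setting requires strictly disjoint class sets.
assert not set(aux_target) & set(novel_target)

support, query = sample_episode(novel_target)
print(len(support), len(query))  # → 5 75
```

The disjointness check mirrors the constraint above: no novel target category may appear during training, so the auxiliary and novel class sets never overlap.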
To address these challenges, this paper presents a novel Multi-Expert Domain Decompositional Network (ME-D2N) for CD-FSL.
Our key solutions are also illustrated in Figure 1. Specifically, training on unbalanced datasets leads to the model bias problem [2, 49]. That is, the learned model tends to perform well on the classes with more examples but suffers a performance degradation on the categories with fewer examples. To tackle this data imbalance issue, we propose to build our model upon the multi-expert learning paradigm. Concretely, rather than learning a model on the merged source and auxiliary target datasets directly, we train two teacher models on the source and the auxiliary dataset, respectively. Models trained in this way can be considered experts in their specialized domains, unaffected by training data from the other domain. Then, we transfer the knowledge from these two teachers to our student model. This is done via the knowledge distillation technique, which constrains the student model to produce predictions consistent with the teachers'. By distilling the individual knowledge from both the source and the target teacher models, our student model picks up the ability to recognize both source and auxiliary target images while avoiding learning from the unbalanced datasets directly. We take one step further, considering that forcing a unified model to learn from teachers of different domains may be nontrivial: since each filter in the network needs to be responsible for extracting the features of all domains simultaneously, this vanilla learning scheme may limit the performance of the network. A natural question is whether it is possible to decompose the student model into two parts, one for learning from the source teacher and the other for the auxiliary target teacher. Based on the above insights, a novel domain decomposition module, also termed D2N, is proposed. Specifically, our D2N aims at building a one-to-one correspondence between the network filters and the domains. That is, each filter is assigned to be activated by only one specific domain. Technically, we achieve this by proposing a novel domain-specific gate that dynamically learns the activation state of filters for a specific domain. We insert the D2N into the feature extractor of the student model and make it learnable together with the model parameters.
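A minimal sketch of the two ideas combined may help: a hard per-filter gate that assigns each filter to exactly one domain, and a temperature-softened distillation loss against each teacher. The shapes, the gate parameterization (a learnable score thresholded at zero), and the stand-in teacher logits are assumptions for illustration; the paper's actual gate and loss formulations may differ.

```python
import numpy as np

rng = np.random.default_rng(0)
n_filters = 8

# Learnable per-filter scores; the target gate is the complement of the
# source gate, so every filter serves exactly one domain.
source_scores = rng.normal(size=n_filters)

def domain_gate(scores, domain):
    """Hard binary gate: filter i is active for 'source' iff scores[i] > 0,
    and for 'target' otherwise (one-to-one filter-domain assignment)."""
    source_mask = (scores > 0).astype(np.float64)
    return source_mask if domain == "source" else 1.0 - source_mask

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def kd_loss(student_logits, teacher_logits, T=4.0):
    """Distillation loss: cross-entropy between the temperature-softened
    teacher and student distributions, scaled by T^2 as is conventional."""
    p_teacher = softmax(teacher_logits / T)
    log_p_student = np.log(softmax(student_logits / T))
    return -(p_teacher * log_p_student).sum(axis=-1).mean() * T * T

# Hypothetical student features and a 5-way classifier head.
features = rng.normal(size=(4, n_filters))   # a batch of 4 images
head = rng.normal(size=(n_filters, 5))

for domain in ("source", "target"):
    gate = domain_gate(source_scores, domain)
    logits = (features * gate) @ head        # only this domain's filters fire
    teacher_logits = rng.normal(size=(4, 5)) # stand-in for that teacher's output
    print(domain, "kd loss:", round(kd_loss(logits, teacher_logits), 3))

# Sanity check: every filter belongs to exactly one domain.
g_s = domain_gate(source_scores, "source")
g_t = domain_gate(source_scores, "target")
assert np.all(g_s + g_t == 1.0)
```

The complementary-mask design is what makes the decomposition one-to-one: activating the source gate and the target gate together recovers the full filter bank, while each distillation step updates only the filters assigned to that teacher's domain.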
We conduct extensive experiments on four different target datasets. The results indicate that our multi-expert learning strategy helps address the data imbalance problem. Besides, our D2N further improves the performance of the student model, showing the advantage of decomposing the student model into two domain-specific parts.
Contributions.
We summarize our contributions as follows: 1) For the first time, we introduce the multi-expert learning paradigm into the task of CD-FSL with few labeled target data, preventing the model from learning on unbalanced datasets directly. By learning from two teachers, we avoid biasing our model towards the source dataset, which has significantly more samples. 2) A novel domain decomposition module (D2N) is proposed to decompose the model's filters into source and target domain-specific parts. The concept of domain decomposition has rarely been explored in previous work, especially for the task of CD-FSL. 3) Extensive experiments show the effectiveness of our modules, and our proposed full model ME-D2N sets a new state of the art.
2 RELATED WORK
Cross-Domain Few-Shot Learning. A recent study [9] finds that most existing FSL methods [12, 15, 17, 28, 37, 39, 41–43, 46, 52–54], which assume the source and target datasets belong to the same distribution, fail to generalize to novel datasets with a domain gap. Thus, CD-FSL, which aims at addressing FSL across different domains, has attracted increasing attention [4, 13, 14, 18, 24,