TGDM: Target Guided Dynamic Mixup for Cross-Domain Few-Shot Learning

Linhai Zhuo (lhzhuo19@fudan.edu.cn), Shanghai Key Lab of Intell. Info. Processing, School of CS, Fudan University
Yuqian Fu (fuyq20@fudan.edu.cn), Shanghai Key Lab of Intell. Info. Processing, School of CS, Fudan University
Jingjing Chen# (chenjingjing@fudan.edu.cn), Shanghai Key Lab of Intell. Info. Processing, School of CS, Fudan University
Yixin Cao (caoyixin2011@gmail.com), Singapore Management University
Yu-Gang Jiang (ygj@fudan.edu.cn), Shanghai Key Lab of Intell. Info. Processing, School of CS, Fudan University
ABSTRACT
Given sucient training data on the source domain, cross-domain
few-shot learning (CD-FSL) aims at recognizing new classes with a
small number of labeled examples on the target domain. The key
to addressing CD-FSL is to narrow the domain gap and transfer-
ring knowledge of a network trained on the source domain to the
target domain. To help knowledge transfer, this paper introduces
an intermediate domain generated by mixing images in the source
and the target domain. Specically, to generate the optimal inter-
mediate domain for dierent target data, we propose a novel target
guided dynamic mixup (TGDM) framework that leverages the target
data to guide the generation of mixed images via dynamic mixup.
The proposed TGDM framework contains a Mixup-3T network for
learning classiers and a dynamic ratio generation network (DRGN)
for learning the optimal mix ratio. To better transfer the knowl-
edge, the proposed Mixup-3T network contains three branches
with shared parameters for classifying classes in the source domain,
target domain, and intermediate domain. To generate the optimal
intermediate domain, the DRGN learns to generate an optimal mix
ratio according to the performance on auxiliary target data. Then,
the whole TGDM framework is trained via bi-level meta-learning
so that TGDM can rectify itself to achieve optimal performance on
target data. Extensive experimental results on several benchmark
datasets verify the eectiveness of our method.
CCS CONCEPTS
• Computing methodologies → Image representations.
# indicates corresponding author.
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
MM '22, October 10–14, 2022, Lisboa, Portugal
© 2022 Association for Computing Machinery.
ACM ISBN 978-1-4503-9203-7/22/10...$15.00
https://doi.org/10.1145/3503161.3548052
Figure 1: A pilot study on mixing source and target data with
various ratios. We use the proposed network Mixup-3T as
the base model. The Mini-Imagenet [28] and the Places [49]
are used as the source and target datasets, respectively.
KEYWORDS
cross-domain few-shot learning, dynamic mixup, target guided
learning, bi-level meta-learning
ACM Reference Format:
Linhai Zhuo, Yuqian Fu, Jingjing Chen#, Yixin Cao, and Yu-Gang Jiang. 2022. TGDM: Target Guided Dynamic Mixup for Cross-Domain Few-Shot Learning. In Proceedings of the 30th ACM International Conference on Multimedia (MM '22), October 10–14, 2022, Lisboa, Portugal. ACM, New York, NY, USA, 9 pages. https://doi.org/10.1145/3503161.3548052
1 INTRODUCTION
Few-shot learning (FSL) aims to recognize unseen classes with only a few labeled data. This not only helps to alleviate the heavy burden of manual annotation, but also encourages models to be consistent with human cognitive processes: we can easily learn new animals or car brands from a couple of examples. FSL can fundamentally benefit many tasks, ranging from object detection [4] to video classification [3]. Typically, these tasks [13, 32, 34, 40] assume two types of classes: base classes with sufficient data for training, and novel classes with very few data for testing, where base classes and novel classes have no overlap but follow the same data distribution for transfer learning. Clearly, such a distribution assumption is too strong in reality: most base and novel classes are actually from different domains in practice. How to mitigate
the domain gap for cross-domain few-shot learning (CD-FSL) has
attracted increasing attention recently [36, 37].
Building on top of FSL methods, the key challenge of CD-FSL is to improve the model's generalization ability. There are mainly two groups of methods. The first group has no access to data in the target domain and relies on extracting more discriminative features via adversarial training or disentanglement learning. For instance, Wang and Deng [42] introduce adversarial task augmentation to improve the robustness of the inductive bias across domains. Fu et al. [10] decompose the low-frequency and high-frequency components of images to span the style distributions of the source domain.
However, the performance of the above methods is still unsatisfactory. Thus, to achieve superior performance, some methods introduce target domain data, with the basic idea of mitigating the domain gap through data augmentation. Beyond the source domain, Das et al. [6] and Liang et al. [22] further fine-tune their models on unlabeled data in the target domain via self- or semi-supervised methods, while [8] demonstrates the effectiveness of using very few labeled target data. Considering the acceptable cost of limited labeled data in practice, we advocate this direction.
In this paper, we propose to investigate the mixup technique [48] to efficiently use a small amount of labeled target domain data during training for CD-FSL. Mixup is an easy-to-apply data augmentation method that conducts linear interpolation between source-domain and target-domain data. The mixed data thus lies between the source and target domains, and we refer to it as an intermediate domain throughout the paper. Clearly, training on data from this intermediate domain not only reconciles the different data distributions of the two domains, but also improves the model's generalization ability.
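For reference, the standard mixup operation [48], applied here across domains, linearly interpolates a labeled source sample $(x_s, y_s)$ and a labeled target sample $(x_t, y_t)$ with a mix ratio $\lambda \in [0, 1]$:

$\tilde{x} = \lambda x_s + (1 - \lambda)\, x_t, \qquad \tilde{y} = \lambda y_s + (1 - \lambda)\, y_t,$

so $\lambda$ directly controls where the intermediate domain lies between the target ($\lambda = 0$) and the source ($\lambda = 1$).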
Although many works have demonstrated the effectiveness of mixup in various tasks, in CD-FSL the severe data imbalance issue brings a great challenge: how to set the mix ratio of source domain data to target domain data. Focusing more on the limited labeled target data risks over-fitting, while focusing less on it may not help domain adaptation.
To show the great impact of the mix ratio on specific CD-FSL tasks, we conducted a pilot study using the proposed base network Mixup-3T as the model, with Mini-Imagenet [28] and Places [49] as the source and target datasets. As shown in Fig. 1, accuracy fluctuates greatly as the mix ratio varies. That is, an optimal mixup strategy helps to achieve good performance. Since the optimal mix ratio can differ across datasets and tasks, choosing it manually is tedious. Therefore, further investigation of the mix ratio is necessary, and a well-designed optimization strategy can be beneficial here.
To address these challenges, we propose a novel Target Guided Dynamic Mixup (TGDM) framework that controls the intermediate domain during training for CD-FSL. By dynamically choosing a suitable mix ratio, TGDM boosts the performance on novel classes without harming performance on base classes. There are two core components: the classification network Mixup-3T and the Dynamic Ratio Generation Network (DRGN). The basic idea is to generate the mix ratio in a guided manner and utilize the intermediate domain effectively. First, based on the current mixed data, we optimize Mixup-3T via a tri-task learning mechanism involving source, target, and intermediate domain classification tasks. The source and target classification tasks target better performance on the specific tasks, while the intermediate classification task improves generalization ability. Second, the DRGN learns to produce a target guided mix ratio that steers the generation of intermediate domain data. Specifically, we perform a pseudo backward propagation of Mixup-3T and validate the resulting model on auxiliary target data, whose loss is used to update the DRGN.
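To make this two-step optimization concrete, below is a minimal PyTorch sketch of one TGDM training step under simplifying assumptions: the names (`model` for Mixup-3T, `drgn`, `mixup_ce`), the plain cross-entropy objectives, and the single-step inner update are illustrative choices, not the authors' exact implementation, which builds on metric-based few-shot classification.

```python
import torch
import torch.nn.functional as F
from torch.func import functional_call

def mixup_ce(logits, y_s, y_t, lam):
    # Cross-entropy against both label sets, weighted by the mix ratio.
    return lam * F.cross_entropy(logits, y_s) + (1 - lam) * F.cross_entropy(logits, y_t)

def tgdm_step(model, drgn, opt_model, opt_drgn,
              x_s, y_s, x_t, y_t, x_aux, y_aux, inner_lr=0.01):
    # 1) DRGN predicts a mix ratio for the current source/target batches.
    lam = drgn(x_s, x_t)                      # scalar tensor in (0, 1)
    x_mix = lam * x_s + (1 - lam) * x_t       # intermediate-domain images

    # 2) Pseudo backward propagation: a virtual SGD step of the classifier
    #    on the mixed data, keeping the graph so gradients reach the DRGN.
    params = dict(model.named_parameters())
    loss_mix = mixup_ce(model(x_mix), y_s, y_t, lam)
    grads = torch.autograd.grad(loss_mix, list(params.values()), create_graph=True)
    fast = {k: w - inner_lr * g for (k, w), g in zip(params.items(), grads)}

    # 3) Validate the virtual model on auxiliary target data; this loss
    #    updates only the DRGN (the outer level of the bi-level scheme).
    loss_aux = F.cross_entropy(functional_call(model, fast, (x_aux,)), y_aux)
    opt_drgn.zero_grad()
    loss_aux.backward()
    opt_drgn.step()

    # 4) Real update of Mixup-3T with the tri-task objective: source,
    #    target, and intermediate-domain classification.
    lam = drgn(x_s, x_t).detach()             # ratio fixed during the real step
    x_mix = lam * x_s + (1 - lam) * x_t
    loss = (F.cross_entropy(model(x_s), y_s)
            + F.cross_entropy(model(x_t), y_t)
            + mixup_ce(model(x_mix), y_s, y_t, lam))
    opt_model.zero_grad()
    loss.backward()
    opt_model.step()
```

The `create_graph=True` inner step is what lets the auxiliary-target loss differentiate through the virtual update, so the DRGN is rewarded for mix ratios that would have improved performance on target data.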
Overall, our contributions are summarized as follows: 1) We propose a novel target guided dynamic mixup (TGDM) framework that leverages the target data to dynamically control the mix ratio for better generalization ability in cross-domain few-shot learning. 2) We propose a Mixup-3T network that utilizes the dynamically mixed data as an intermediate domain for better knowledge transfer between the source domain and the target domain. 3) We conduct extensive experiments on several benchmarks, and the experimental results demonstrate the effectiveness of our framework.
2 RELATED WORK
2.1 Few-Shot Learning
Few-shot learning aims at learning new concepts with very few samples. Many efforts have been made in this field, and these methods mainly fall into three categories: model initialization [7, 30], metric learning [5, 13, 34, 40], and data augmentation [5, 9, 11, 21]. More recently, Zhou et al. [50] apply a Similarity Ratio to weight the importance of base classes and thus select the optimal ones. Ji et al. [18] propose a Modal-Alternating Propagation Network to rectify visual features with semantic class attributes. Yan et al. [45] adopt a bi-level meta-learning optimization framework to select samples. These methods obtain training and testing images from the same domain. In this paper, we stick to metric learning and bi-level meta-learning but formulate them under the cross-domain scenario with few labeled data.
2.2 Cross-Domain Few-Shot Learning
Cross-domain few-shot learning (CD-FSL) aims to perform few-shot classification under the setting where the training and testing data come from different domains. This task was formally defined and proposed by [37], and more benchmarks were subsequently proposed by [15]. According to whether the target dataset is used during the training phase [8, 23, 24, 27, 33, 46] or not [10, 22, 37], CD-FSL methods can be divided into two groups. For training without target data, Wang and Deng [42] apply adversarial training to augment data and improve the robustness of the inductive bias. Fu et al. [10] believe that style contains domain-specific information, so they transfer styles between two training episodes and apply self-supervised learning to make the network ignore style transformations. Generally, because of the lack of target data, the performance of this kind of method is lower than that of methods using target data in model training. As a result, some researchers fine-tune their models on the target support set. For example, Liang et al. [22] propose NSAE to enhance features with noise, and Das et al. [6] apply a contrastive loss; both works eventually fine-tune their models with support data in the target domain. [24, 27, 46] further introduce unlabeled data and turn to additional self-supervised learning tasks on unlabeled target data. Lin et al. [23] integrate several SOTA modules and