TGDM Target Guided Dynamic Mixup for Cross-Domain Few-Shot Learning

2025-05-06 0 0 1.27MB 9 页 10玖币

侵权投诉

TGDM: Target Guided Dynamic Mixup for Cross-Domain

Few-Shot Learning

Linhai Zhuo

lhzhuo19@fudan.edu.cn

Shanghai Key Lab of Intell. Info.

Processing, School of CS, Fudan

University

Yuqian Fu

fuyq20@fudan.edu.cn

Shanghai Key Lab of Intell. Info.

Processing, School of CS, Fudan

University

Jingjing Chen#

chenjingjing@fudan.edu.cn

Shanghai Key Lab of Intell. Info.

Processing, School of CS, Fudan

University

Yixin Cao

caoyixin2011@gmail.com

Singapore Management University

Yu-Gang Jiang

ygj@fudan.edu.cn

Shanghai Key Lab of Intell. Info.

Processing, School of CS, Fudan

University

ABSTRACT

Given sucient training data on the source domain, cross-domain

few-shot learning (CD-FSL) aims at recognizing new classes with a

small number of labeled examples on the target domain. The key

to addressing CD-FSL is to narrow the domain gap and transfer-

ring knowledge of a network trained on the source domain to the

target domain. To help knowledge transfer, this paper introduces

an intermediate domain generated by mixing images in the source

and the target domain. Specically, to generate the optimal inter-

mediate domain for dierent target data, we propose a novel target

guided dynamic mixup (TGDM) framework that leverages the target

data to guide the generation of mixed images via dynamic mixup.

The proposed TGDM framework contains a Mixup-3T network for

learning classiers and a dynamic ratio generation network (DRGN)

for learning the optimal mix ratio. To better transfer the knowl-

edge, the proposed Mixup-3T network contains three branches

with shared parameters for classifying classes in the source domain,

target domain, and intermediate domain. To generate the optimal

intermediate domain, the DRGN learns to generate an optimal mix

ratio according to the performance on auxiliary target data. Then,

the whole TGDM framework is trained via bi-level meta-learning

so that TGDM can rectify itself to achieve optimal performance on

target data. Extensive experimental results on several benchmark

datasets verify the eectiveness of our method.

CCS CONCEPTS

•Computing methodologies →Image representations.

#indicates corresponding author.

Permission to make digital or hard copies of all or part of this work for personal or

classroom use is granted without fee provided that copies are not made or distributed

for prot or commercial advantage and that copies bear this notice and the full citation

on the rst page. Copyrights for components of this work owned by others than ACM

must be honored. Abstracting with credit is permitted. To copy otherwise, or republish,

to post on servers or to redistribute to lists, requires prior specic permission and/or a

fee. Request permissions from permissions@acm.org.

MM ’22, October 10–14, 2022, Lisboa, Portugal

ACM ISBN 978-1-4503-9203-7/22/10. . . $15.00

https://doi.org/10.1145/3503161.3548052

Figure 1: A pilot study on mixing source and target data with

various ratios. We use the proposed network Mixup-3T as

the base model. The Mini-Imagenet [28] and the Places [49]

are used as the source and target datasets, respectively.

KEYWORDS

cross-domain few-shot learning, dynamic mixup, target guided

learning, bi-level meta-learning

ACM Reference Format:

Linhai Zhuo, Yuqian Fu, Jingjing Chen

, Yixin Cao, and Yu-Gang Jiang

2022. TGDM: Target Guided Dynamic Mixup for Cross-Domain Few-Shot

Learning. In Proceedings of the 30th ACM International Conference on Multi-

media (MM ’22), October 10–14, 2022, Lisboa, Portugal. ACM, New York, NY,

USA, 9 pages. https://doi.org/10.1145/3503161.3548052

1 INTRODUCTION

Few-shot learning (FSL) aims to recognize unseen classes with

only a few labeled data. This not only helps to alleviate the heavy

burden of manual annotations, but also encourages the models to be

consistent with human cognitive processes — we can easily learn

new animals or car brands from a couple of examples. FSL will

fundamentally benet many tasks, ranging from object detection

[

] to video classication [

]. Typically, these tasks [

]

assume two types of classes: base classes with sucient data for

training, and novel classes with very few data for testing, where

base classes and novel classes have no overlap but follow the same

data distribution for transfer learning. Clearly, such a distribution

assumption is too strong in realistic. Most base and novel classes

are actually from dierent domains in practice. How to mitigate

arXiv:2210.05392v2 [cs.CV] 30 Nov 2022

MM ’22, October 10–14, 2022, Lisboa, Portugal Linhai Zhuo et al.

the domain gap for cross-domain few-shot learning (CD-FSL) has

attracted increasing attention recently [36, 37].

Building on top of FSL methods, the key challenge of CD-FSL is

to improve the model’s generalization ability. There are mainly two

groups of methods. The rst group has no access to the data in target

domain and relies on abstracting more discriminative features via

adversarial training or disentangle learning. For instance, Wang

and Deng

[42]

introduce adversarial task augmentation to improve

the robustness of the inductive bias across domains. Fu et al

. [10]

decompose the low-frequency and high-frequency components of

images to span the style distributions of the source domain.

However, the performances of the above methods are still unsatis-

factory. Thus, to achieve superior performance, some methods tend

to introduce target domain data. The basic idea is to mitigate the

domain gap through data augmentation. Except for source domain,

Das et al

. [6]

and Liang et al

. [22]

further ne-tune their models

on unlabeled data in target domain via self- or semi-supervised

methods, while paper [

] demonstrates the eectiveness of using

very few labeled target data. Considering the acceptable cost of

limited labeled data in practice, we advocate this direction.

In this paper, we propose to investigate the mixup technique

[

] to eciently use a small amount of labeled target domain data

during training for CD-FSL. Mixup is an easy-to-apply data aug-

mentation method. It conducts linear interpolation between source

and target domains of data. Thus the mixed data is intermediate

between source and target domains. We name it as an intermediate

domain throughout the paper. Clearly, training on the data from

this intermediate domain not only reconciles the dierent data

distribution from various domains, but also improves the model

generalization ability.

Although many works have demonstrated the eectiveness of

mixup in various tasks, focusing on CD-FSL, the severe data im-

balance issue brings a great challenge — how to set the ratio of

source domain data to target domain data. Focusing more on the

limited labeled data in target domain will fall into an over-tting

problem, and less on auxiliary target data may not be helpful in

domain adaption.

To show the great impacts of mix ratio on specic CD-FSL tasks,

we have conducted a pilot study by choosing the proposed base net-

work Mixup-3T as the model, and Mini-Imagenet [

] and Places

[

] as the datasets. As shown in Fig. 1, we can see a great uctuation

of accuracy along with the varying mix ratios. That is, an optimal

mixup strategy helps to achieve good performance. Since the op-

timal mix ratio could be dierent on dierent datasets and tasks,

it’s tedious to choose an optimal mix ratio manually. Therefore,

further investigation on mix ratio is necessary, and a well-designed

optimization strategy can be benecial here.

To address the challenges, we propose a novel

arget

uided

ynamic

ixup (TGDM) framework that controls the interme-

diate domain during training for CD-FSL. By dynamically choos-

ing a suitable mix ratio, TGDM boosts the performance on novel

classes, without the harm on base classes. There are two core com-

ponents: the classication network Mixup-3T and

ynamic

aito

eneration

etwork (DRGN). The basic idea lies in guided gener-

ating mix ratio and utilize intermediate domain eectively. First,

based on current mixed data, we optimize Mixup-3T via a tri-task

learning mechanism, involving source, target, and intermediate do-

main classication tasks. The source and target classication tasks

target better performance on specic tasks, and the intermediate

classication is to improve the generalization ability. Second, DRGN

learns to produce a target guided mix ratio to guide the generation

of intermediate domain data. Specically, We perform a pseudo

backward propagation of Mixup-3T to validate the performance on

auxiliary target data, whose loss will be utilized to update DRGN.

Overall, our contributions are summarized as follows: 1) We

propose a novel target guided dynamic mixup (TGDM) framework

that leverages the target data to dynamically control the mix ratio

for better generalization ability in cross-domain few-shot learning.

2) We propose a Mixup-3T network that utilizes the dynamic mixed

data as the intermediate domain for better transferring knowledge

between the source domain and the target domain. 3) We conduct

extensive experiments on several benchmarks, and the experimental

results demonstrate the eectiveness of our framework.

2 RELATED WORK

2.1 Few-Shot Learning

Few-shot learning aims at learning new concepts with very few

samples. Many eorts have been made in this eld. These methods

are mainly divided into three categories: model initialization [

metric-learning [

] and data augmentation [

More recently, Zhou et al

. [50]

apply Similarity Ratio to weight the

importance of base classes and thus select the optimal ones. Ji et al

[18]

propose Modal-Alternating Propagation Network to rectify

visual features with semantic class attributes. Yan et al

. [45]

adopt

bi-level meta-learning optimization framework to select samples.

These methods obtain training and testing images from the same

domain. In this paper, we stick to metric learning and bi-level meta-

learning but formulate them under the cross-domain scenario with

few labeled data.

2.2 Cross-Domain Few-Shot Learning

Cross-domain few-shot learning (CD-FSL) aims to perform few-

shot classication under the setting where the training and testing

data are from dierent domains. This task is formally dened and

proposed by [

]. Then, more benchmarks are proposed by [

According to whether using target dataset during the training phase

[

] or not [

], CD-FSL methods can be

divided into two groups. For training without target data, Wang and

Deng

[42]

apply adversarial training to augment data and improve

the robustness of the inductive bias. Fu et al

. [10]

believe that the

style contains domain specic information. So they transfer styles

between two training episodes and apply self-supervised learning

to make network ignore the transformation of style. Generally,

because of lacking target data, the performances of this kind of

methods are lower than that use target data in model training. As a

result, some researchers netune their models on target support

set. For example, Liang et al

. [22]

propose NSAE to enhance feature

with noise and Das et al

. [6]

apply contrastive loss. Both of these

two works eventually netune their models with support data

in target domains. [

] further introduce unlabeled data

and turn to additional self-supervised learning tasks on unlabeled

target data. Lin et al

. [23]

integrate several SOTA modules and

文档加载中……请稍候！
如果长时间未打开，您也可以点击刷新试试。

下载文档到电脑，查找使用更方便

10 玖币 0人已下载

立即下载

摘要：

TGDM:TargetGuidedDynamicMixupforCross-DomainFew-ShotLearningLinhaiZhuolhzhuo19@fudan.edu.cnShanghaiKeyLabofIntell.Info.Processing,SchoolofCS,FudanUniversityYuqianFufuyq20@fudan.edu.cnShanghaiKeyLabofIntell.Info.Processing,SchoolofCS,FudanUniversityJingjingChen#chenjingjing@fudan.edu.cnShanghaiKeyLab...

展开>> 收起<<

TGDM Target Guided Dynamic Mixup for Cross-Domain Few-Shot Learning.pdf

共9页,预览2页

还剩页未读，继续阅读

声明：本站为文档C2C交易模式，即用户上传的文档直接被用户下载，本站只是中间服务平台，本站所有文档下载所得的收益归上传人(含作者)所有。玖贝云文库仅提供信息存储空间，仅对用户上传内容的表现方式做保护处理，对上载内容本身不做任何修改或编辑。若文档所含内容侵犯了您的版权或隐私，请立即通知玖贝云文库，我们立即给予删除！

TGDM Target Guided Dynamic Mixup for Cross-Domain Few-Shot Learning

相关推荐

开通VIP享超值会员特权

作者详情

相关内容

热门标签

举报选择: