CD-FSOD A BENCHMARK FOR CROSS-DOMAIN FEW-SHOT OBJECT DETECTION Wuti Xiong Center for Machine Vision and Signal Analysis University of Oulu Finland

CD-FSOD: A BENCHMARK FOR CROSS-DOMAIN FEW-SHOT OBJECT DETECTION
Wuti Xiong
Center for Machine Vision and Signal Analysis, University of Oulu, Finland
wuti.xiong@oulu.fi
ABSTRACT
In this paper, we present a study of the cross-domain few-shot
object detection (CD-FSOD) benchmark, which consists of image data
from diverse domains. On the proposed benchmark, we evaluate
state-of-the-art FSOD approaches, including meta-learning and
fine-tuning approaches. The results show that these methods tend
to fail and even underperform the naive fine-tuning model.
We analyze the reasons for their failure and introduce a strong
baseline that trains a teacher and a student model in a mutually
beneficial manner to alleviate the overfitting problem. Our approach
outperforms existing approaches by a significant margin (2.0% on
average) on the proposed benchmark. Our code is available at
https://github.com/FSOD/CD-FSOD.
Index Terms: Few-shot Object Detection, Cross-domain.
1. INTRODUCTION
Few-shot object detection (FSOD) aims to detect novel classes of
objects from only a few annotated instances. In the previous FSOD
setting [1,2], a detector is pre-trained on a source dataset
consisting of base classes and then transferred to a target dataset
consisting of novel classes with few instances, where base classes
and novel classes are disjoint but share similar data domains.
However, this underlying assumption does not hold in some real-world
scenarios, because it is difficult or impossible to collect a
sufficient amount of data in these domains. This leads to a new FSOD
problem, in which the detector must resort to pre-training on base
classes from a different domain. In these cases, even humans have
trouble recognizing new categories that vary too greatly between
examples or differ from prior experience [3,4]. Thus, finding new
approaches to tackle the problem remains a challenging but
desirable goal.
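The few-shot setting described above can be sketched in a few lines. This is only an illustration under stated assumptions: the class names, the annotation format (one `(image_id, class_name)` pair per labeled object), the helper `sample_k_shot`, and the value `k=5` are all hypothetical and are not taken from the benchmark itself.

```python
import random
from collections import defaultdict


def sample_k_shot(annotations, novel_classes, k, seed=0):
    """Pick at most k annotated instances per novel class.

    annotations: list of (image_id, class_name) pairs, one per labeled object
                 (a hypothetical, simplified annotation format).
    Returns a dict mapping each novel class to its sampled pairs.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for image_id, class_name in annotations:
        if class_name in novel_classes:
            by_class[class_name].append((image_id, class_name))

    support = {}
    for class_name in sorted(novel_classes):
        instances = by_class[class_name]
        rng.shuffle(instances)
        support[class_name] = instances[:k]  # fewer than k if the class is rare
    return support


# Toy label spaces (made-up names): base and novel classes must be disjoint.
base_classes = {"person", "car"}
novel_classes = {"beetle", "spider"}
assert base_classes.isdisjoint(novel_classes)

# Toy target-domain annotations: 20 "beetle" objects and 3 "spider" objects.
annotations = [(i, "beetle") for i in range(20)]
annotations += [(i, "spider") for i in range(20, 23)]

support = sample_k_shot(annotations, novel_classes, k=5)
```

In the cross-domain variant, the only change to this picture is that the base-class data used for pre-training comes from a different domain than the sampled support set.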
Although conventional FSOD benchmarks [1,2] are well established,
no prior work studies FSOD across different domains. To fill this
gap, we introduce the Cross-Domain Few-Shot Object Detection
(CD-FSOD) benchmark (see Fig. 1), which covers three target datasets:
ArTaxOr [6], UODD [7] and DIOR [8]. On the proposed benchmark, we
conduct extensive experiments to evaluate existing FSOD approaches,
including meta-learning approaches [2,9,10] and fine-tuning
approaches [11,12,13]. The results show that existing FSOD approaches
cannot achieve satisfactory performance and even underperform the
naive fine-tuning model because they freeze parameters. Even without
freezing parameters, fine-tuning methods struggle to outperform the
naive transfer model, while meta-learning methods still fail. This
finding shows that existing FSOD methods do not work for CD-FSOD,
and there is an urgent need to develop new methods.
[Figure 1: source domain MS COCO, followed by the target domains
(disjoint label spaces) ArTaxOr, UODD and DIOR, ordered by
decreasing similarity to MS COCO.]
Fig. 1: The CD-FSOD benchmark. MS COCO [5] is used
for source training, and domains of varying dissimilarity from
natural images are used for target evaluation.
In addition, we introduce a novel distillation-based baseline that
enables a “flywheel effect”, in which the student and the teacher
mutually reinforce each other so that both keep improving as
training goes on. Specifically, an EMA (Exponential Moving Average)
lets the teacher model ensemble the student models from different
time steps. The student’s weights are optimized by the distillation
loss between the pseudo-labels generated by the teacher and the
student’s predictions on the same image. Our approach outperforms
existing FSOD approaches by a large margin on the proposed
benchmark. In summary, our main contributions are as follows:
(1) we establish the CD-FSOD benchmark, where there is a very large
domain difference between the base and target datasets; (2) on the
proposed benchmark, we evaluate existing FSOD approaches and analyze
the reasons for their failure; (3) we introduce a strong baseline
that achieves state-of-the-art performance on the proposed benchmark.
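To make the teacher-student loop concrete, the following is a minimal sketch of one training step. It is not the paper's implementation. It assumes a toy one-parameter "model", a squared-error distillation loss, and hypothetical names (`predict`, `ema_update`, `distill_step`) and hyperparameters (`lr`, `decay`). A real detector would use a detection loss over pseudo-labeled boxes and a decay much closer to 1.

```python
def predict(weights, x):
    """Toy one-parameter model standing in for a detector."""
    return weights["w"] * x


def ema_update(teacher, student, decay):
    """In place: teacher <- decay * teacher + (1 - decay) * student."""
    for name, value in student.items():
        teacher[name] = decay * teacher[name] + (1.0 - decay) * value
    return teacher


def distill_step(student, teacher, x, lr, decay):
    """One step: the teacher pseudo-labels x, the student fits it, then EMA."""
    target = predict(teacher, x)         # pseudo-label, treated as a constant
    pred = predict(student, x)           # student prediction on the same input
    loss = (pred - target) ** 2          # assumed squared-error distillation loss
    grad = 2.0 * (pred - target) * x     # d(loss)/d(w) for the toy model
    student["w"] -= lr * grad            # gradient step on the student only
    ema_update(teacher, student, decay)  # teacher averages student snapshots
    return loss


# Toy values chosen so every intermediate result is exact in floating point.
student = {"w": 0.0}
teacher = {"w": 1.0}
losses = [distill_step(student, teacher, x=1.0, lr=0.25, decay=0.5)
          for _ in range(2)]
```

The teacher is never updated by gradients. It changes only through the moving average, which is what lets it act as an ensemble of the student's weights over time.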
arXiv:2210.05311v3 [cs.CV] 3 May 2023