CONTI, ROTA, WANG, RICCI: CLUSTER-LEVEL PSEUDO-LABELLING 1
Cluster-level pseudo-labelling for
source-free cross-domain facial expression
recognition
Alessandro Conti¹ (alessandro.conti-1@unitn.it)
Paolo Rota¹,² (paolo.rota@unitn.it)
Yiming Wang³ (ywang@fbk.eu)
Elisa Ricci¹,³ (e.ricci@unitn.it)
¹ DISI - Department of Information Engineering and Computer Science, University of Trento
² CIMeC - Center for Mind and Brain Sciences, University of Trento
³ Fondazione Bruno Kessler (FBK)
Abstract
Automatically understanding emotions from visual data is a fundamental task for
human behaviour understanding. While models devised for Facial Expression Recognition
(FER) have demonstrated excellent performance on many datasets, they often suffer
from severe performance degradation when trained and tested on different datasets due
to domain shift. In addition, as face images are considered highly sensitive data,
access to large-scale datasets for model training is often denied. In this work, we
tackle the above-mentioned problems by proposing the first Source-Free Unsupervised
Domain Adaptation (SFUDA) method for FER. Our method exploits self-supervised
pretraining to learn good feature representations from the target data and proposes a novel
and robust cluster-level pseudo-labelling strategy that accounts for in-cluster statistics.
We validate the effectiveness of our method in four adaptation setups, showing that it
consistently outperforms existing SFUDA methods when applied to FER, and is on par
with methods addressing FER in the UDA setting.
Code is available at https://github.com/altndrr/clup.
1 Introduction
Facial Expression Recognition (FER) [11, 31, 32, 33] refers to the task of automatically
inferring the emotional state of a person from a facial image, which supports multiple application
fields, such as assistive robotics and security monitoring. However, each individual shows their
emotional state differently according to their personal traits or complicated cultural/ethical
factors [3]. Such heterogeneity in the data space remains one of the main challenges for a
generalisable model for FER. In the last twenty years, the efforts to improve such technologies
have been mostly split between collecting larger and more diverse datasets [29, 37] and
advancing learning algorithms for improving generalisation capability in the wild [11, 41].
© 2022. The copyright of this document resides with its authors.
It may be distributed unchanged freely in print or electronic forms.
arXiv:2210.05246v1 [cs.CV] 11 Oct 2022
Figure 1: Comparison between previous works (the left part) and our CluP on cross-domain
FER (the right part). Differently from past works, we aim to learn a target model $f_T(\cdot)$
with only the source model $f_S(\cdot)$ and unlabelled target data $\{X_T\}$, without the source
data $\{X_S, Y_S\}$, a very likely scenario due to privacy concerns. Our solution, CluP, is the
first method on source-free domain adaptation for FER, exploiting self-supervised learning (SSL)
to warm up the target feature extractor $g_T(\cdot)$ and a novel cluster-level pseudo-labelling
technique.
Many recent techniques for FER exploit the attention mechanism [1, 9, 28, 30, 34], while
some other works learn uncertainty via feature mixup [41], or improve feature representations
by replacing the pooling layers to reduce padding erosion [32].
Recent works often frame the problem from an Unsupervised Domain Adaptation (UDA)
perspective where labels of the target samples are not available [14, 20, 21]. For example,
in [22], Li et al. introduce a novel loss function to preserve feature locality despite the domain
shift. Such loss also organises facial expressions according to their intensity in the embedding
space. A more recent method [6] exploits facial landmarks and holistic features to adapt to
the target domain with adversarial learning applied on graphs.
While all these methods improve the adaptability of FER models across data distributions,
the source data is required during adaptation. However, when dealing with facial images, the
source data might not be available due to the increasingly stringent regulations concerning the
privacy of citizens. Therefore, we are motivated to address the more challenging problem of
Source-Free Unsupervised Domain Adaptation (SFUDA) for FER, given only the availability
of the source pretrained model (see Fig. 1). To the best of our knowledge, we are the first
to propose a domain adaptation solution for FER that works without the source facial data,
embracing a privacy-preserving learning paradigm as the source data can remain private.
Our proposed method, CluP (Cluster-level Pseudo-labelling), exploits self-supervised
learning (SSL) on the target data and proposes a novel cluster-level pseudo-labelling technique.
Pseudo-labelling for UDA often extends the source model to the target domain using the
source confidence to select the best target training inputs [26, 40]. However, the computation
of confidence requires supervised training, which is only possible in UDA with access
to the source data. In the case of SFUDA, as the domain gap increases, one can expect
a degrading representation capability of the source model on the target domain. Recent
advances in SSL show that a good data representation can be learnt without annotated
labels [5, 7, 12]. In this work, we propose to exploit SSL techniques for a good starting
feature representation for the target model, and further propose to improve the reliability
of pseudo-labels with our newly introduced cluster purity, i.e. the local statistics of target
samples that are clustered within the feature space expressed by the source model. We validate
CluP on a set of cross-domain FER benchmarks and show its advantageous performance in
terms of classification accuracy.
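To make the idea of labelling at the cluster level concrete, the following is a minimal sketch and not the authors' implementation. It assumes that cluster purity is the fraction of samples in a cluster whose source-model prediction agrees with the cluster's majority prediction; this excerpt does not give the exact definition. The function name, the `min_purity` threshold and the toy data are hypothetical.

```python
from collections import Counter, defaultdict


def cluster_level_pseudo_labels(cluster_ids, source_preds, min_purity=0.75):
    """Give every sample in a sufficiently pure cluster the cluster's majority label.

    cluster_ids:  cluster index of each target sample (clusters computed beforehand).
    source_preds: class predicted by the source model for each target sample.
    Returns a list of pseudo-labels, with None for samples in rejected clusters.

    Assumption: purity = share of the majority prediction inside the cluster.
    """
    members = defaultdict(list)
    for idx, cluster in enumerate(cluster_ids):
        members[cluster].append(idx)

    pseudo_labels = [None] * len(cluster_ids)
    for idxs in members.values():
        votes = Counter(source_preds[i] for i in idxs)
        majority_label, majority_count = votes.most_common(1)[0]
        purity = majority_count / len(idxs)
        if purity >= min_purity:
            # The whole cluster inherits one label, including dissenting samples.
            for i in idxs:
                pseudo_labels[i] = majority_label
    return pseudo_labels


# Toy example: cluster 0 is 80% class 2 (kept), cluster 1 is a 50/50 mix (dropped).
clusters = [0, 0, 0, 0, 0, 1, 1, 1, 1]
predictions = [2, 2, 2, 2, 1, 0, 1, 0, 1]
print(cluster_level_pseudo_labels(clusters, predictions))
# prints [2, 2, 2, 2, 2, None, None, None, None]
```

Unlike per-sample confidence thresholding, the decision here depends only on agreement inside a cluster, so a sample the source model mislabels can still receive the majority label of its neighbours.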
We summarise our contributions as follows:
• We present CluP, the first method addressing Source-Free Unsupervised Domain
Adaptation for FER, exploiting SSL to provide the foundation for our target model.
• CluP introduces a novel cluster-level pseudo-labelling scheme that improves the reliability
of pseudo-labels based on in-cluster attributes, deviating from traditional
confidence-based pseudo-labelling methods.
• We demonstrate that CluP surpasses competing methods for SFUDA and is comparable
with UDA techniques on several FER adaptation benchmarks.
2 Related work
In the following, we present recent works on UDA methods for FER, and some
general-purpose SFUDA solutions.
Unsupervised Domain Adaptation for FER. As a consequence of the domain bias, quite
prominent among FER datasets, some works focus on domain adaptation [6, 14, 20, 21, 22, 38, 43].
In [14], Ji et al. apply late fusion on the outputs of two channels that learn intra-category
and inter-category similarities of facial expressions. The authors of [22] introduce
a locality preserving loss that draws samples of the same class closer. They also notice
that neighbouring samples in the embedding space present similar emotional intensities.
DETN [20] applies two variations of the Maximum Mean Discrepancy to assess the amount
of divergence between the domains and to re-weight the class-wise source distribution to
match the target. The authors extend the work in [21], where they additionally consider the
differences in the conditional distributions. Differently from the above works, AGRA [6]
focuses on the well-established approach of adversarial domain adaptation, leveraging facial
landmarks alongside facial images. For the landmarks, they introduce two specialised graph
neural networks while jointly considering the domain feature distributions, the local features
(i.e., landmarks), and the holistic features, achieving the best results on many benchmarks.
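For readers unfamiliar with the Maximum Mean Discrepancy mentioned above, the sketch below computes the standard biased estimate of squared MMD with an RBF kernel between two sets of feature vectors. It illustrates the generic quantity only; it is not the specific pair of MMD variations used by DETN, which this excerpt does not detail. The function names, the `gamma` bandwidth and the synthetic data are assumptions made for illustration.

```python
import math
import random


def rbf(a, b, gamma):
    """RBF kernel exp(-gamma * ||a - b||^2) between two feature vectors."""
    return math.exp(-gamma * sum((u - v) ** 2 for u, v in zip(a, b)))


def mean_kernel(xs, ys, gamma):
    """Average kernel value over all pairs (x, y) with x in xs and y in ys."""
    return sum(rbf(x, y, gamma) for x in xs for y in ys) / (len(xs) * len(ys))


def mmd2(xs, ys, gamma=1.0):
    """Biased estimate of squared MMD: E[k(x,x')] + E[k(y,y')] - 2 E[k(x,y)]."""
    return (mean_kernel(xs, xs, gamma)
            + mean_kernel(ys, ys, gamma)
            - 2.0 * mean_kernel(xs, ys, gamma))


# Synthetic 2-D "features": two draws from the same domain and one shifted domain.
rng = random.Random(0)
source = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(100)]
same_domain = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(100)]
shifted_domain = [[rng.gauss(3, 1), rng.gauss(3, 1)] for _ in range(100)]

print("same domain   :", mmd2(source, same_domain))
print("shifted domain:", mmd2(source, shifted_domain))
```

The estimate stays close to zero when both sets come from the same distribution and grows as the distributions drift apart, which is what makes it usable as a measure of domain divergence.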
Compared to previous works, we consider a stricter setting where the source data is
unavailable. We argue that, due to privacy issues, human behaviour understanding methods
do not always have access to the source data. For this reason, we introduce a novel method
for FER that adapts to a target domain in a source-free manner.
Source-Free Unsupervised Domain Adaptation. Recently, novel methods for source-free
domain adaptation have been proposed [13, 17, 19, 23, 25, 39]. The setting represents a more
complex but realistic scenario of UDA, where source data is unavailable. Some works resort
to entropy-minimisation losses to adapt to the target domain without labels. For example,
SHOT [23] employs an entropy loss alongside a classification loss on pseudo-labelled samples
to adapt the network to the target domain. The work has been extended in [24] by introducing
an auxiliary head that solves relative rotation, leading to improved performance. Differently
from the above, the authors of [13] frame the problem from an image translation perspective
and translate the target images to the source style using only the source model. The authors
of [36] perform self-training with a loss function that considers the intrinsic structure of the
target domain via nearest neighbours. In the proposed work, we do not impose any constraint on the
loss function, our refinement step works on source clusters, and we propose a novel score
function to select the best samples to train on the target domain. Other works address open-set