CONTI, ROTA, WANG, RICCI: CLUSTER-LEVEL PSEUDO-LABELLING 1

Cluster-level pseudo-labelling for source-free cross-domain facial expression recognition

Alessandro Conti (1), alessandro.conti-1@unitn.it
Paolo Rota (1,2), paolo.rota@unitn.it
Yiming Wang (3), ywang@fbk.eu
Elisa Ricci (1,3), e.ricci@unitn.it

(1) DISI - Department of Information Engineering and Computer Science, University of Trento
(2) CIMeC - Center for Mind and Brain Sciences, University of Trento
(3) Fondazione Bruno Kessler (FBK)

Abstract

Automatically understanding emotions from visual data is a fundamental task for human behaviour understanding. While models devised for Facial Expression Recognition (FER) have demonstrated excellent performance on many datasets, they often suffer from severe performance degradation when trained and tested on different datasets due to domain shift. In addition, as face images are considered highly sensitive data, access to large-scale datasets for model training is often denied. In this work, we tackle the above-mentioned problems by proposing the first Source-Free Unsupervised Domain Adaptation (SFUDA) method for FER. Our method exploits self-supervised pretraining to learn good feature representations from the target data and proposes a novel and robust cluster-level pseudo-labelling strategy that accounts for in-cluster statistics. We validate the effectiveness of our method in four adaptation setups, showing that it consistently outperforms existing SFUDA methods when applied to FER, and is on par with methods addressing FER in the UDA setting.

Code is available at https://github.com/altndrr/clup.

1 Introduction

Facial Expression Recognition (FER) [ ] refers to the task of automatically inferring the emotional state of a person from a facial image, which supports multiple application fields, such as assistive robotics and security monitoring. However, each individual shows their emotional state differently according to their personal traits or complicated cultural/ethical factors [ ]. Such heterogeneity in the data space remains one of the main challenges for a generalisable model for FER. In the last twenty years, the efforts to improve such technologies have been mostly split between collecting larger and more diverse datasets [ ] and advancing learning algorithms for improving generalisation capability in the wild [ ].

It may be distributed unchanged freely in print or electronic forms.

arXiv:2210.05246v1 [cs.CV] 11 Oct 2022

Figure 1: Comparison between previous works (left) and our CluP (right) on cross-domain FER. Unlike past works, we aim to learn a target model $f_T(\cdot)$ with only the source model $f_S(\cdot)$ and unlabelled target data $\{X_T\}$, without the source data $\{X_S, Y_S\}$, a very likely scenario due to privacy concerns. Our solution, CluP, is the first method on source-free domain adaptation for FER, exploiting self-supervised learning (SSL) to warm up the target feature extractor $g_T(\cdot)$ and a novel cluster-level pseudo-labelling technique.

Many recent techniques for FER exploit the attention mechanism [ ], while some other works learn uncertainty via feature mixup [ ], or improve feature representations by replacing the pooling layers to reduce padding erosion [32].

Recent works often frame the problem from an Unsupervised Domain Adaptation (UDA) perspective, where labels of the target samples are not available [ ]. For example, in [ ], Li et al. introduce a novel loss function to preserve feature locality despite the domain shift. This loss also organises facial expressions according to their intensity in the embedding space. A more recent method [ ] exploits facial landmarks and holistic features to adapt to the target domain with adversarial learning applied on graphs.

While all these methods improve the adaptability of FER models across data distributions, they require the source data during adaptation. However, when dealing with facial images, the source data might not be available due to the increasingly stringent regulations concerning the privacy of citizens. Therefore, we are motivated to address the more challenging problem of Source-Free Unsupervised Domain Adaptation (SFUDA) for FER, given only the availability of the source pretrained model (see Fig. 1). To the best of our knowledge, we are the first to propose a domain adaptation solution for FER that works without the source facial data, embracing a privacy-preserving learning paradigm as the source data can remain private.

Our proposed method, CluP (Cluster-level Pseudo-labelling), exploits self-supervised learning (SSL) on the target data and proposes a novel cluster-level pseudo-labelling technique. Pseudo-labelling for UDA often extends the source model to the target domain using the source confidence to select the best target training inputs [ ]. However, the computation of confidence requires supervised training, which is only possible in UDA with access to the source data. In the case of SFUDA, as the domain gap increases, one can expect the representation capability of the source model to degrade on the target domain. Recent advances in SSL show that a good data representation can be learnt without annotated labels [ ]. In this work, we propose to exploit SSL techniques to obtain a good starting feature representation for the target model, and further propose to improve the reliability of pseudo-labels with our newly introduced cluster purity, i.e. the local statistics of target samples that are clustered within the feature space expressed by the source model. We validate CluP on a set of cross-domain FER benchmarks and demonstrate its advantage in terms of classification accuracy.
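To make the cluster-level idea concrete, the following minimal sketch assigns one pseudo-label per cluster rather than per sample. It is only an illustration under stated assumptions: this section does not give the exact definition of cluster purity, so the sketch takes purity to be the fraction of samples in a cluster that share the majority source prediction. The function name `cluster_level_pseudo_labels`, the threshold `min_purity`, and the convention of marking rejected samples with -1 are all hypothetical. The cluster assignments are assumed to come from any clustering of the target features (for example k-means).

```python
import numpy as np


def cluster_level_pseudo_labels(cluster_ids, source_preds, num_classes, min_purity=0.8):
    """Assign each sufficiently pure cluster its majority source prediction.

    Assumption (not taken from the paper): purity is the share of samples in
    a cluster whose source-model prediction equals the cluster's majority class.
    Samples in clusters below `min_purity` get the label -1 (left unlabelled).
    """
    pseudo_labels = np.full(len(cluster_ids), -1, dtype=int)
    for cluster in np.unique(cluster_ids):
        mask = cluster_ids == cluster
        # Histogram of source-model predictions inside this cluster.
        counts = np.bincount(source_preds[mask], minlength=num_classes)
        majority = counts.argmax()
        purity = counts[majority] / mask.sum()
        if purity >= min_purity:
            # Every sample in the cluster receives the majority label,
            # including those the source model predicted differently.
            pseudo_labels[mask] = majority
    return pseudo_labels


# Toy example: two clusters of four target samples each.
cluster_ids = np.array([0, 0, 0, 0, 1, 1, 1, 1])
source_preds = np.array([2, 2, 2, 1, 0, 1, 0, 1])
labels = cluster_level_pseudo_labels(cluster_ids, source_preds, num_classes=3, min_purity=0.7)
print(labels.tolist())  # [2, 2, 2, 2, -1, -1, -1, -1]
```

In this toy case, the cluster where three of four samples are predicted as class 2 passes the 0.7 threshold, so the dissenting sample also receives label 2, while the evenly split cluster is discarded. A per-sample confidence threshold would instead need calibrated source confidences, which the paragraph above argues become unreliable as the domain gap grows.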

We summarise our contributions as follows:

• We present CluP, the first method addressing Source-Free Unsupervised Domain Adaptation for FER, exploiting SSL as the foundation of our target model.

• CluP introduces a novel cluster-level pseudo-labelling scheme that improves the reliability of pseudo-labels based on in-cluster attributes, deviating from traditional confidence-based pseudo-labelling methods.

• We demonstrate that CluP surpasses competing methods for SFUDA and is comparable with UDA techniques on several FER adaptation benchmarks.

2 Related work

In the following, we present recent works on UDA methods for FER, and some general-purpose SFUDA solutions.

Unsupervised Domain Adaptation for FER. As a consequence of the domain bias, quite prominent among FER datasets, some works focus on domain adaptation [ ]. In [ ], Ji et al. apply late fusion on the outputs of two channels that learn intra-category and inter-category similarities of facial expressions. The authors of [ ] introduce a locality preserving loss that draws samples of the same class closer. They also notice that neighbouring samples in the embedding space present similar emotional intensities. DETN [ ] applies two variations of the Maximum Mean Discrepancy to assess the amount of divergence between the domains and to re-weight the class-wise source distribution to match the target. The authors extend the work in [ ], where they additionally consider the differences in the conditional distributions. Differently from the above works, AGRA [ ] focuses on the well-established approach of adversarial domain adaptation, leveraging facial landmarks alongside facial images. For the landmarks, they introduce two specialised graph neural networks while jointly considering the domain feature distributions, the local features (i.e., landmarks), and the holistic features, achieving the best results on many benchmarks.
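For readers unfamiliar with the Maximum Mean Discrepancy mentioned above, the sketch below computes the standard biased estimate of the squared MMD between two sets of feature vectors with a Gaussian (RBF) kernel. It is the textbook quantity, not the two variations used by DETN, which are not described in this section. The function names and the `gamma` bandwidth parameter are illustrative assumptions.

```python
import numpy as np


def rbf_kernel(a, b, gamma):
    """Gaussian kernel matrix k(a_i, b_j) = exp(-gamma * ||a_i - b_j||^2)."""
    sq_dists = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * sq_dists)


def mmd2(source_feats, target_feats, gamma=1.0):
    """Biased estimate of the squared MMD between two feature sets.

    MMD^2 = E[k(s, s')] + E[k(t, t')] - 2 E[k(s, t)], with each expectation
    replaced by the mean over all pairs (including identical pairs).
    """
    k_ss = rbf_kernel(source_feats, source_feats, gamma).mean()
    k_tt = rbf_kernel(target_feats, target_feats, gamma).mean()
    k_st = rbf_kernel(source_feats, target_feats, gamma).mean()
    return k_ss + k_tt - 2.0 * k_st


# Toy example: the same features versus a shifted copy.
rng = np.random.default_rng(0)
feats = rng.normal(size=(50, 2))
same_domain_gap = mmd2(feats, feats)
shifted_domain_gap = mmd2(feats, feats + 5.0)
```

A value near zero indicates that the two feature sets are hard to distinguish under the chosen kernel, whereas larger values indicate a larger distribution gap, which is the signal a UDA method can try to reduce when source data is available.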

Compared to previous works, we consider a stricter setting where the source data is unavailable. We argue that, due to privacy issues, human behaviour understanding methods do not always have access to the source data. For this reason, we introduce a novel method for FER that adapts to a target domain in a source-free manner.

Source-Free Unsupervised Domain Adaptation. Recently, novel methods for source-free domain adaptation have been proposed [ ]. The setting represents a more complex but realistic scenario of UDA, where source data is unavailable. Some works resort to entropy-minimisation losses to adapt to the target domain without labels. For example, SHOT [ ] employs an entropy loss alongside a classification loss on pseudo-labelled samples to adapt the network to the target domain. The work has been extended in [ ] by introducing an auxiliary head that solves relative rotation, leading to improved performance. Differently from the above, the authors of [ ] frame the problem from an image translation perspective and translate the target images to the source style using only the source model. The authors of [ ] perform self-training with a loss function that considers the intrinsic structure of the target domain via nearest neighbours. In the proposed work, we do not impose any constraint on the loss function, our refinement step works on source clusters, and we propose a novel score function to select the best samples to train on the target domain. Other works address open-set
