methods to achieve a further performance boost. We also study
the effectiveness of several SOTA DG methods when they
are applied in the FL setting for image recognition. (d) We
give an intuitive analysis (Section 4.4) and an experimental
analysis (Section A) of the privacy-preserving performance of our
style vectors, demonstrating that one can hardly reconstruct
the original images merely from the style vectors, even with the
generator of a SOTA GAN [32], in the FL setting.
2. Related Work
Domain generalization. Domain generalization is a
popular research field that aims to learn a model from
multiple source domains such that the model can generalize
to the unseen target domain. Many works have been proposed
to tackle domain shift from various directions under the
centralized data setting. These methods can be divided into
three categories [42]: manipulating data to enrich data
diversity [18,44,37,47], learning domain-invariant representations
or disentangling domain-shared and domain-specific features to
enhance the generalization ability of the model [1,36,4,46], and
exploiting general learning strategies to promote generalization
capability [26,17,7,8].
However, many of these methods require centralized
data from different domains, violating the local data preservation
principle of federated learning. Specifically, access to more
than one domain is needed to augment data or generate new
data in [37,18]; domain-invariant representation learning or
feature decomposition is performed by comparing across
domains [1,36,46]; and some learning-strategy-based
methods require an extra domain for meta-update [26,7,8].
Nevertheless, some methods do not explicitly require centralized
domains or can be adapted to federated learning
with minor changes. For example, MixStyle [47] can optionally
conduct style randomization within a single domain
to augment data (see the sketch below); [44] uses Fourier-based
augmentation that does not require sharing data; JiGen [4] proposes
a self-supervised task to enhance representation capability;
and RSC [17] designs a learning strategy based on gradient
operations without an explicit multi-domain requirement.
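To illustrate why such single-domain style randomization is compatible with FL, below is a minimal PyTorch-style sketch of MixStyle-like mixing of channel-wise feature statistics within one client's mini-batch. The function name, tensor layout (B, C, H, W), and hyperparameters are our own illustrative assumptions rather than the exact implementation of [47].

```python
import torch

def mixstyle(x, alpha=0.1, eps=1e-6):
    """Randomize feature styles by mixing channel-wise statistics within a batch.

    x: feature map of shape (B, C, H, W) from a single client / domain.
    Illustrative sketch of the MixStyle idea, not the authors' code.
    """
    b = x.size(0)
    mu = x.mean(dim=[2, 3], keepdim=True)                 # per-sample channel means
    sig = (x.var(dim=[2, 3], keepdim=True) + eps).sqrt()  # per-sample channel stds
    x_norm = (x - mu) / sig                               # remove the instance's own style
    lam = torch.distributions.Beta(alpha, alpha).sample((b, 1, 1, 1)).to(x.device)
    perm = torch.randperm(b, device=x.device)             # shuffle styles inside the batch
    mu_mix = lam * mu + (1 - lam) * mu[perm]
    sig_mix = lam * sig + (1 - lam) * sig[perm]
    return sig_mix * x_norm + mu_mix                      # re-apply the mixed style
```

Since every tensor here comes from one client's own mini-batch, no cross-client data exchange is involved.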
Federated / decentralized domain generalization. Despite
the many works on centralized domain generalization and on
tackling non-IID issues in FL, few works address
the DG problem in FL. FedDG [33] exchanges amplitude
information across images from different clients and
utilizes episodic learning to further improve performance
(see the sketch below). However, it only focuses on a segmentation
task with superficial domain shift in the data, and its performance on
image recognition with larger domain shift remains unexplored.
COPA [43] proposes aggregating only the weights
of the domain-invariant feature extractor and maintaining an
ensemble of domain-specific classifier heads to tackle
decentralized DG. However, since COPA has to share the classifier
heads of all clients locally and globally, it may
lead to privacy issues, heavier communication, and higher
test-time inference cost.
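For intuition about the amplitude exchange in FedDG, below is a minimal sketch that recombines a local image's phase spectrum with another client's amplitude spectrum. For simplicity it swaps the full spectrum, whereas FedDG interpolates amplitudes only within a low-frequency band; the function name and tensor shapes are illustrative assumptions.

```python
import torch

def swap_amplitude(x_local, x_foreign):
    """Combine the local image's phase with a foreign amplitude spectrum.

    x_local, x_foreign: image tensors of shape (C, H, W).
    Simplified full-spectrum swap; FedDG interpolates only low frequencies.
    """
    fft_local = torch.fft.fft2(x_local)
    fft_foreign = torch.fft.fft2(x_foreign)
    amp_foreign = fft_foreign.abs()     # appearance / style information
    phase_local = fft_local.angle()     # semantic structure is largely in the phase
    mixed = amp_foreign * torch.exp(1j * phase_local)
    return torch.fft.ifft2(mixed).real
```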
Neural style transfer. Neural style transfer (NST) aims
to transfer the style of one image to another content image
while preserving the latter's semantic structure. The development
of NST has roughly gone through three stages: per-style-
per-model (PSPM), multiple-style-per-model (MSPM), and
arbitrary-style-per-model (ASPM) methods [20]. PSPM
methods [11,21,38] can only transfer a single style per
trained model. MSPM methods [9,5,45,31] are able to
transfer multiple styles with a single trained model. However,
PSPM and MSPM are expensive to deploy when many
styles need to be transferred, as in our setting.
ASPM methods [6,16,12,30] can transfer arbitrary styles to any
content image and are often faster than PSPM and MSPM,
which makes them more suitable for our scenario.
The first ASPM method was proposed by Chen and
Schmidt [6], but it cannot run in real time. AdaIN [16]
is the first real-time arbitrary style transfer method; it
utilizes the channel-wise mean and variance as style information.
It performs de-stylization by normalizing the content
VGG features with their own statistics and then stylizes them
through an affine transformation with the mean and variance of
the style image features. Another real-time ASPM method
[12] is a follow-up work of CIN [10]. It turns the
MSPM method CIN into an ASPM method by predicting
the affine transformation parameters for each style image
with an additional style prediction network. However, the
degree of style-content disentanglement in the predicted style
vector remains unknown, which may raise privacy issues in the
FL setting. Later, Li et al. [30] proposed a universal, style-
learning-free ASPM method, which utilizes a ZCA whitening
transform for de-stylization and a coloring transform for style
transfer. However, this method is much slower than previous
methods in practice. Therefore, we choose AdaIN, the neatest
and most efficient real-time ASPM method, as the style
transfer model in our framework.
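To make the adopted operation concrete, below is a minimal sketch of the AdaIN transformation on feature maps; variable names and shapes are our own illustrative choices, and the full pipeline in [16] additionally involves a VGG encoder and a learned decoder that are omitted here.

```python
import torch

def adain(content_feat, style_feat, eps=1e-5):
    """Adaptive instance normalization on feature maps of shape (B, C, H, W)."""
    c_mu = content_feat.mean(dim=[2, 3], keepdim=True)
    c_sig = (content_feat.var(dim=[2, 3], keepdim=True) + eps).sqrt()
    s_mu = style_feat.mean(dim=[2, 3], keepdim=True)
    s_sig = (style_feat.var(dim=[2, 3], keepdim=True) + eps).sqrt()
    normalized = (content_feat - c_mu) / c_sig   # de-stylization: remove the content's own style
    return s_sig * normalized + s_mu             # re-stylization with the style statistics
```

Because a style is described here only by channel-wise means and variances, such statistics are a natural candidate for the compact style vectors discussed above.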
3. Method
The core idea of our method is to make the data distributions
of the distributed clients as similar as possible by introducing
the styles of other clients into each of them via cross-client
style transfer, without any dataset leakage. Figure 3 shows
the data distribution before and after our CCST method. In
this way, the trained local models learn to fit all the source
client styles, and we avoid aggregating local models that are
biased toward different styles. As a result, each client
can be regarded as a deep-all [4] setting, and the local models
will share the same goal of fitting the styles of all the source
clients. We propose two types of styles that can be chosen
for transfer: one is the overall domain style, and the other is the
single image style. In the following sections, we will introduce