
Figure 1: Comparisons between the causal graphs of CCM (a), (b) and the previous methods (c) (Liu et al. 2021; Sun et al. 2021), (d) (Mahajan, Tople, and Sharma 2021), and (e) (Wald et al. 2021). In Figure (a), based on Figures (c), (d), and (e), we add prior knowledge Z as a bridge linking unseen image X and label Y, and make domain D point to object O and category factor C to explain the limitations that source domains impose on models in DG. In Figure (b), by controlling domain D, the remaining part is a standard causal graph in which the causal effects from X to Y can be calculated via the front-door criterion.
category C. Compared to Figure 1 (c), Figure 1 (d), and Figure 1 (e), we add prior knowledge Z as a bridge to link unseen image X and label Y. The relationships between domain D and the other factors can be explained more clearly by Figure 1 (a): there, the domain D acts as a confounder that disturbs models in learning the causal effects from image X to label Y. We therefore control domain D to cut off D→O and D→E, and the remaining part is a standard causal graph in which the causal effects from X to Y can be calculated via the front-door criterion, as shown in Figure 1 (b).
To learn the causal effects of X on Y shown in Figure 1 (b), we introduce the front-door criterion. It splits the causal effect P(Y | do(X)) into the estimation of three parts: P(X), P(Z | X), and P(Y | Z, X). Furthermore, to permit stable distribution estimation during causal learning, we design a contrastive training paradigm that calibrates the learning process with the similarity between the current and the previous knowledge to strengthen the true causal effects.
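For reference, this split is exactly the standard front-door adjustment (Pearl's front-door criterion); with Z as the mediator between X and Y, the three terms combine as

\[
P(Y \mid do(X)) = \sum_{z} P(z \mid X) \sum_{x'} P(Y \mid z, x')\, P(x'),
\]

where P(z | X) describes how an image is mapped to prior knowledge, P(Y | z, x') describes how knowledge and images jointly determine the label, and the outer average over P(x') marginalizes out the residual confounding on Y.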
Our main contributions are summarized as follows. (i) We develop a Contrastive Causal Model that transfers unseen images into taught knowledge and quantifies the causal effects between images and labels based on that taught knowledge. (ii) We propose an inclusive causal graph that can explain the interference of the domain in the DG task. Based on this graph, our model cuts off the excess causal paths and quantifies the causal effects between images and labels via the front-door criterion. (iii) Extensive experiments on public benchmark datasets demonstrate the effectiveness and superiority of our method.
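To make the contrastive calibration above concrete, the following is a minimal sketch, assuming an InfoNCE-style objective that scores current-batch features against a memory bank of features from previous steps (standing in for the "previous knowledge"); the function name, the memory-bank design, and the exact loss form are illustrative assumptions rather than this paper's formulation.

import torch
import torch.nn.functional as F

def contrastive_calibration_loss(z_current, z_previous,
                                 y_current, y_previous,
                                 temperature=0.1):
    # Normalize so that dot products are cosine similarities.
    z_cur = F.normalize(z_current, dim=1)    # (B, d) current-batch features
    z_prev = F.normalize(z_previous, dim=1)  # (M, d) previous-knowledge memory
    logits = z_cur @ z_prev.t() / temperature  # (B, M) similarity scores
    # Previous features with the same label act as positives (assumption).
    pos = y_current[:, None].eq(y_previous[None, :]).float()
    log_prob = logits - torch.logsumexp(logits, dim=1, keepdim=True)
    # Average log-probability of the positives; the clamp guards against
    # current samples whose class is absent from the memory.
    loss = -(pos * log_prob).sum(dim=1) / pos.sum(dim=1).clamp(min=1.0)
    return loss.mean()

# Example with random stand-ins for encoder outputs (hypothetical shapes):
# z_now, z_mem = torch.randn(32, 128), torch.randn(256, 128)
# y_now, y_mem = torch.randint(0, 7, (32,)), torch.randint(0, 7, (256,))
# loss = contrastive_calibration_loss(z_now, z_mem, y_now, y_mem)

A loss of this form rewards representations that stay close to previously learned same-class knowledge, which is one way to read the idea of calibrating the learning process with the similarity of the current and previous knowledge.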
Related Work
Domain Generalization
Domain generalization (DG) aims to learn, from multiple source domains, a model that can perform well on unseen target domains. Data augmentation-based methods (Volpi et al. 2018; Shankar et al. 2018; Carlucci et al. 2019; Wang et al. 2020; Zhou et al. 2020b,a, 2021) try to improve the generalization robustness of the model by learning from data with novel distributions. Among them, some works (Volpi et al. 2018; Shankar et al. 2018) generate new data based on model gradients and leverage them to train the model, boosting its robustness, while others (Wang et al. 2020; Carlucci et al. 2019) introduce a jigsaw-puzzle strategy that improves out-of-distribution generalization via self-supervised learning. Adversarial training (Zhou et al.
2020b,a) is also employed to generate data with various
styles yet consistent semantic information. Meta-learning (Balaji, Sankaranarayanan, and Chellappa 2018; Li et al. 2018a; Dou et al. 2019; Li et al. 2019a,b) is also a popular topic in DG. The idea is similar to the problem setting of DG: learning from the known and preparing for inference from the unknown. However, it might not be easy to design effective meta-learning strategies for training a generalizable model. Another conventional direction is to perform invariant representation learning (Zhao et al. 2020; Matsuura and Harada 2020; Li et al. 2018d,c). These methods try to learn the feature representations that are discriminative for the classification task but invariant to the domain changes. For example, (Zhao et al. 2020) proposes conditional entropy regularization to extract effective conditional invariant
feature representations. While favorable results have been achieved by these approaches, they tend to model the statistical dependence between the input features and the labels, and hence can be biased by spurious correlations (Liu et al. 2021).
Domain Generalization with Causality
In this paper, we assume the data is generated from the root factors of the object O and domain D, as shown in Figure 1 (a). The class features C control both the input feature X and the label Y; meanwhile, the environment feature E only affects X. We aim to learn an informative representation from X to predict Y. (Liu et al. 2021) proposes a causal semantic generative model (see Figure 1 (c)). It separates the latent semantic factor S and variation factor V from data, where only the former causes the change in label Y. Similarly, (Sun et al. 2021) introduces latent causal invariant models based on the same causal model structure. Their semantic factor S and variation factor V are similar to the class feature C and the environment feature E in our causal graph respectively, while we further show their causal relationship with domain and object. (Mahajan, Tople, and Sharma 2021) proposes a causal graph with the domain D and object O which is similar to ours, as shown in Figure 1 (d). It assumes that the input feature X is determined by causal feature X_C and domain-dependent feature X_A, and the label Y is determined by X_C. Actually, the representation Z (Figure 1 (a)) that we aim to learn is to capture the information of causal feature X_C (Fig-