Domain Generalization via Contrastive Causal Learning
Qiaowei Miao,1 Junkun Yuan,1 Kun Kuang1
1Zhejiang University
qiaoweimiao@zju.edu.cn, yuanjk@zju.edu.cn, kunkuang@zju.edu.cn
Abstract
Domain Generalization (DG) aims to learn a model that can generalize well to unseen target domains from a set of source domains. With the idea of the invariant causal mechanism, a lot of effort has been put into learning robust causal effects which are determined by the object yet insensitive to domain changes. Despite their invariance, causal effects are difficult to quantify and optimize. Inspired by the ability of humans to adapt to new environments using prior knowledge, we develop a novel Contrastive Causal Model (CCM) that transfers unseen images to taught knowledge, i.e., the features of seen images, and quantifies the causal effects based on this taught knowledge. Considering that the transfer is affected by domain shift in DG, we propose a more inclusive causal graph to describe the DG task. Based on this causal graph, CCM controls the domain factor to cut off excess causal paths and uses the remaining part to calculate the causal effects of images on labels via the front-door criterion. Specifically, CCM is composed of three components: (i) domain-conditioned supervised learning, which teaches CCM the correlation between images and labels; (ii) causal effect learning, which helps CCM measure the true causal effects of images on labels; and (iii) contrastive similarity learning, which clusters the features of images that belong to the same class and provides a quantification of similarity. Finally, we test the performance of CCM on multiple datasets including PACS, OfficeHome, and TerraIncognita. Extensive experiments demonstrate that CCM surpasses previous DG methods by clear margins.
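Component (iii) is described only at a high level here; the exact formulation is given later in the paper. As a rough illustration only, a supervised-contrastive-style similarity term that pulls same-class image features together could look like the following minimal PyTorch sketch, where the function name, the temperature value, and the batch shapes are hypothetical:

import torch
import torch.nn.functional as F

def class_similarity_loss(features, labels, temperature=0.1):
    # features: (N, d) image features, labels: (N,) class ids
    z = F.normalize(features, dim=1)                       # compare features on the unit sphere
    sim = z @ z.t() / temperature                          # pairwise cosine similarities
    self_mask = torch.eye(len(z), dtype=torch.bool, device=z.device)
    pos_mask = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~self_mask
    log_prob = F.log_softmax(sim.masked_fill(self_mask, float("-inf")), dim=1)
    pos_per_row = pos_mask.sum(1).clamp(min=1)             # avoid dividing by zero
    loss = -(log_prob * pos_mask.float()).sum(1) / pos_per_row
    return loss.mean()                                     # low when same-class features cluster

# Example usage with random features
feats = torch.randn(8, 128, requires_grad=True)
labels = torch.randint(0, 3, (8,))
print(class_similarity_loss(feats, labels))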
Introduction
Humans have the ability to solve specific problems with the help of previous knowledge. This generalization capability helps humans take advantage of stable causal effects to adapt to environment shifts. While deep learning has achieved great success in a wide range of real-world applications, it suffers from catastrophic performance degradation due to the lack of out-of-distribution (OOD) generalization ability (Krueger et al. 2021; Sun et al. 2020; Zhang et al. 2021), especially when deployed in new environments with changing distributions. Although Domain Adaptation (DA) algorithms (Fu et al. 2021; Li et al. 2021b,c,d) allow models to adapt to various target domains, each target domain requires its own adaptation process. To deal with domain shift, Domain Generalization (DG) (Zhang et al. 2021; Chen et al. 2021; Liu et al. 2021; Sun et al. 2021; Mahajan, Tople, and Sharma 2021; Wald et al. 2021) is introduced, which aims to learn stable knowledge from multiple source domains and train a model that generalizes directly to unseen target domains.
An increasing number of DG works have been proposed with a variety of strategies such as data augmentation (Carlucci et al. 2019; Wang et al. 2020; Zhou et al. 2020b,a, 2021), meta-learning (Balaji, Sankaranarayanan, and Chellappa 2018; Li et al. 2018a; Dou et al. 2019; Li et al. 2019a,b), and invariant representation learning (Zhao et al. 2020; Matsuura and Harada 2020; Li et al. 2018d,c). While these methods achieve promising performance, they tend to model the statistical dependence between input features and labels, and hence can be biased by spurious correlations in the data (Liu et al. 2021). With the idea of the invariant causal mechanism, increasing attention has been paid to causality-inspired generalization learning (Wald et al. 2021; Liu et al. 2021; Sun et al. 2021; Mahajan, Tople, and Sharma 2021). For example, Liu et al. (2021) introduce a causal semantic generative model that learns semantic factors and variation factors separately via variational Bayesian inference. Sun et al. (2021) further introduce a domain variable and an unobserved confounder to describe a latent causal invariant model. Mahajan, Tople, and Sharma (2021) consider high-level causal features and domain-dependent features, where the labels are determined only by the former. Each of the causal graphs given by these works has a different focus, but there are commonalities among them.
In this paper, we propose a causal graph to formalize the DG problem from a novel perspective, as shown in Figure 1 (a) and Figure 1 (b). Consider creating a data sample in a certain environment, e.g., an image of a polar bear in the Arctic together with the corresponding label. From the domain D of the image we can guess that the object O is a hardy animal (i.e., D → O). A hardy animal that lives in the Arctic may be a polar bear (i.e., D → C ← O). The domain factor not only affects the object O and the category C, but also provides background features E to the image X (i.e., D → E → X). Combining the information of the background E and the category C, the image X is captured and transformed into prior knowledge (e.g., the seen images of brown bears) to predict its label Y (i.e., X → Z → Y). The label Y is then determined based on the match between the knowledge Z and the category C.
[Figure 1: five causal graphs, panels (a)-(e), over the variables D, O, C, E, X, Z, Y (and the factors S, V, X_A, X_C, X_s, X_ns used by prior works); see the caption below.]
Figure 1: Comparisons between the causal graphs of CCM, (a) and (b), and the previous methods: (c) (Liu et al. 2021; Sun et al. 2021), (d) (Mahajan, Tople, and Sharma 2021), and (e) (Wald et al. 2021). In Figure (a), based on Figures (c), (d), and (e), we add prior knowledge Z as a bridge to link the unseen image X and the label Y, and make the domain D point to the object O and the category factor C to explain the limitations of source domains on models in DG. In Figure (b), by controlling the domain D, the remaining part is a standard causal graph in which the causal effects from X to Y can be calculated via the front-door criterion.
Compared to Figure 1 (c), Figure 1 (d), and Figure 1 (e), we add prior knowledge Z as a bridge to link the unseen image X and the label Y. The relationships between the domain D and the other factors can then be explained more clearly, as in Figure 1 (b). In Figure 1 (a), the domain D acts as a confounder that hinders models from learning the causal effects from the image X to the label Y. We therefore control the domain D to cut off D → O and D → E, and the remaining part is a standard causal graph in which the causal effects from X to Y can be calculated via the front-door criterion, as shown in Figure 1 (b).
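To make the role of the confounder concrete, the following toy simulation (a hypothetical illustration, not data from the paper; all distributions are invented) instantiates the graph of Figure 1 (a) with binary variables and shows that the background feature E correlates with the label Y only through the domain D: the marginal correlation is clearly positive, but it nearly vanishes once D is controlled.

import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Toy instantiation of Figure 1 (a): D -> O, D -> C <- O, D -> E (-> X), C -> Y
D = rng.integers(0, 2, n)                          # domain, e.g. Arctic vs. forest
O = (rng.random(n) < 0.3 + 0.4 * D).astype(int)    # object tendency depends on the domain
C = np.where(rng.random(n) < 0.9, O, D)            # category: mostly the object, sometimes the domain
E = (rng.random(n) < 0.1 + 0.8 * D).astype(int)    # background (e.g. snow) depends on the domain
Y = C                                              # label matches the category
# (The image X would combine C- and E-features, and Z would be learned from X; omitted here.)

print(np.corrcoef(E, Y)[0, 1])                     # marginal correlation: clearly positive
for d in (0, 1):                                   # within each domain: close to zero
    idx = D == d
    print(d, np.corrcoef(E[idx], Y[idx])[0, 1])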
To learn the causal effects of X on Y shown in Figure 1 (b), we introduce the front-door criterion. It splits the causal effect P(Y | do(X)) into the estimation of three parts: P(X), P(Z | X), and P(Y | Z, X). Furthermore, to permit stable distribution estimation under causal learning, we design a contrastive training paradigm that calibrates the learning process with the similarity between the current and previous knowledge, so as to strengthen the true causal effects.
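For reference, the standard front-door adjustment (Pearl's front-door criterion) combines these three quantities as

\[
P\bigl(Y \mid do(X = x)\bigr) \;=\; \sum_{z} P(z \mid x) \sum_{x'} P(x')\, P\bigl(Y \mid X = x', Z = z\bigr),
\]

where the outer sum ranges over the prior-knowledge variable Z and the inner sum re-weights the images x' seen during training; in CCM these factors are estimated from data rather than computed exactly.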
Our main contributions are summarized as follows. (i) We develop a Contrastive Causal Model that transfers unseen images into taught knowledge, i.e., the features of seen images, and quantifies the causal effects between images and labels based on this taught knowledge. (ii) We propose an inclusive causal graph that can explain the influence of the domain in the DG task. Based on this graph, our model cuts off the excess causal paths and quantifies the causal effects between images and labels via the front-door criterion. (iii) Extensive experiments on public benchmark datasets demonstrate the effectiveness and superiority of our method.
Related Work
Domain Generalization
Domain generalization (DG) aims to learn, from multiple source domains, a model that can perform well on unseen target domains. Data augmentation-based methods (Volpi et al. 2018; Shankar et al. 2018; Carlucci et al. 2019; Wang et al. 2020; Zhou et al. 2020b,a, 2021) try to improve the generalization robustness of the model by learning from data with novel distributions. Among them, some works (Volpi et al. 2018; Shankar et al. 2018) generate new data based on model gradients and leverage it to train a model for boosting its robustness, while others (Wang et al. 2020; Carlucci et al. 2019) introduce a jigsaw puzzle strategy that improves out-of-distribution generalization via self-supervised learning. Adversarial training (Zhou et al. 2020b,a) is also employed to generate data with various styles yet consistent semantic information. Meta-learning (Balaji, Sankaranarayanan, and Chellappa 2018; Li et al. 2018a; Dou et al. 2019; Li et al. 2019a,b) is another popular topic in DG. Its idea is similar to the problem setting of DG: learning from the known and preparing for inference on the unknown. However, it might not be easy to design effective meta-learning strategies for training a generalizable model. Another conventional direction is invariant representation learning (Zhao et al. 2020; Matsuura and Harada 2020; Li et al. 2018d,c). These methods try to learn feature representations that are discriminative for the classification task but invariant to domain changes. For example, Zhao et al. (2020) propose a conditional entropy regularization to extract effective conditionally invariant feature representations. While favorable results have been achieved by these approaches, they might model the statistical dependence between the input features and the labels, and hence could be biased by spurious correlations (Liu et al. 2021).
Domain Generalization with Causality
In this paper, we assume the data is generated from the root factors of the object O and the domain D, as shown in Figure 1 (a). The class feature C controls both the input feature X and the label Y, while the environment feature E only affects X. We aim to learn an informative representation from X to predict Y. Liu et al. (2021) propose a causal semantic generative model (see Figure 1 (c)). It separates the latent semantic factor S and the variation factor V of the data, where only the former causes changes in the label Y. Similarly, Sun et al. (2021) introduce latent causal invariant models based on the same causal structure. Their semantic factor S and variation factor V are similar to the class feature C and the environment feature E in our causal graph, respectively, while we further show their causal relationship with the domain and the object. Mahajan, Tople, and Sharma (2021) propose a causal graph with the domain D and the object O that is similar to ours, as shown in Figure 1 (d). It assumes that the input feature X is determined by the causal feature XC and the domain-dependent feature XA, and the label Y is determined by XC. Actually, the representation Z (Figure 1 (a)) that we aim to learn is meant to capture the information of the causal feature XC (Fig-