Constrained Maximum Cross-Domain Likelihood
for Domain Generalization
Jianxin Lin, Yongqiang Tang, Junping Wang and Wensheng Zhang
Abstract—As a topic of notable recent interest, domain generalization
aims to learn, from multiple source domains, a generalizable model
that is expected to perform well on unseen test domains. Great
efforts have been made to learn domain-invariant features by
aligning distributions across domains. However, existing works
are often designed based on some relaxed conditions which are
generally hard to satisfy and fail to realize the desired joint
distribution alignment. In this paper, we propose a novel domain
generalization method, which originates from an intuitive idea
that a domain-invariant classifier can be learned by minimizing
the KL-divergence between posterior distributions from different
domains. To enhance the generalizability of the learned classifier,
we formalize the optimization objective as an expectation com-
puted on the ground-truth marginal distribution. Nevertheless,
it also presents two obvious deficiencies: one is the
side effect of entropy increase in KL-divergence, and the other
is the unavailability of ground-truth marginal distributions. For
the former, we introduce a term named maximum in-domain
likelihood to maintain the discrimination of the learned domain-
invariant representation space. For the latter, we approximate the
ground-truth marginal distribution with source domains under a
reasonable convex hull assumption. Finally, a Constrained Max-
imum Cross-domain Likelihood (CMCL) optimization problem
is deduced; by solving it, the joint distributions are naturally
aligned. An alternating optimization strategy is carefully designed
to approximately solve this optimization problem. Extensive ex-
periments on four standard benchmark datasets, i.e., Digits-DG,
PACS, Office-Home and miniDomainNet, highlight the superior
performance of our method.
Index Terms—Domain generalization, domain adaptation, dis-
tribution shift, domain-invariant representation, joint distribu-
tion alignment.
I. INTRODUCTION
DEEP learning methods have achieved remarkable success
in computer vision tasks under the assumption that training
data and test data follow the same distribution. Unfortunately,
this important assumption does not hold in real-world applica-
tions [1]. The distribution shift between training data and test data,
which is widespread in various vision tasks, is unpredictable
and often not even static, thus hindering the application of deep
learning in reliability-sensitive scenarios. For example, in the
field of medical image processing, image data from different
hospitals follow different distributions due to discrepancies in
J. Lin, J. Wang and W. Zhang are with the Research Center of Preci-
sion Sensing and Control, Institute of Automation, Chinese Academy of
Sciences, Beijing, 100190, China, and also with the School of Artificial
Intelligence, University of Chinese Academy of Sciences, Beijing, 100049,
China (e-mail: linjianxin2020@ia.ac.cn; junping.wang@ia.ac.cn; zhangwen-
shengia@hotmail.com).
Y. Tang is with the Research Center of Precision Sensing and Control,
Institute of Automation, Chinese Academy of Sciences, Beijing, 100190,
China (e-mail: yongqiang.tang@ia.ac.cn).
imaging protocol, device vendors and patient populations [2].
Hence, the models trained on data from one hospital often
suffer from performance degradation when tested in another
hospital owing to the distribution shift.
To tackle the distribution shift problem, considerable efforts
have been made in domain adaptation and domain general-
ization. Domain adaptation assumes that the target domain
is accessible and attempts to align the distributions between
the source domain and the target domain. However, in the
setting of domain adaptation, the model inevitably needs to be
retrained when the distribution of the target domain changes,
which can be time-consuming and cumbersome [3]. More
importantly, in many cases, there is no way to access the
target domain in advance. Fortunately, domain generalization
has been proposed to improve the generalization ability of
models in out-of-distribution scenarios given multiple source
domains, where the target domain is inaccessible [4].
As an active research area, many domain generalization
methods have been proposed. Let X denote an input variable,
i.e., an image, Z = F(X) denote the feature extracted from
X by a feature extractor F(·), and Y denote an output variable,
i.e., a label. An effective and general solution to domain
generalization is learning a domain-invariant representation space
where the joint distribution P(Z, Y) across all source domains
keeps consistent [4], [5], [6], [7]. Along this line, some works
[4], [8] try to align the marginal distribution P(Z) among
domains, assuming that the posterior distribution P(Y|Z) is
stable across domains. Problematically, there is no guarantee
that P(Y|Z) will be invariant when aligning P(Z) [9], [10].
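This failure mode can be seen with a simple counterexample of our own construction (not drawn from [9], [10]): two domains may share an identical feature marginal while labeling the feature space in opposite ways.

```latex
% Illustrative counterexample: identical marginals, different posteriors.
\begin{align*}
&Z \sim \mathrm{Uniform}[0,1] \text{ in both domains}
  \;\Rightarrow\; P_1(Z) = P_2(Z), \\
&P_1(Y{=}1 \mid Z{=}z) = \mathbb{1}\!\left[z > \tfrac{1}{2}\right], \qquad
  P_2(Y{=}1 \mid Z{=}z) = \mathbb{1}\!\left[z \le \tfrac{1}{2}\right], \\
&\Rightarrow\; P_1(Y \mid Z{=}z) \neq P_2(Y \mid Z{=}z)
  \text{ for every } z, \text{ despite identical marginals.}
\end{align*}
```

Hence matching P(Z) alone places no constraint whatsoever on the posterior, which is exactly what the classifier depends on.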
Some methods [11] attempt to align the class-conditional
distribution P(Z|Y). According to P(Z, Y) = P(Z|Y)P(Y),
aligning the class-conditional distributions achieves a
domain-invariant joint distribution only if the categorical
distribution P(Y) keeps invariant across domains [7]. But this
requirement is difficult to meet in practical applications.
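The dependence on the label prior can be made explicit by writing the decomposition per domain d: if the class-conditionals match but the priors differ, the joint distributions still disagree.

```latex
\begin{align*}
P_d(Z, Y) &= P_d(Z \mid Y)\,P_d(Y), \qquad d = 1, \dots, D, \\
P_1(Z \mid Y) = P_2(Z \mid Y) \ \text{and}\ P_1(Y) \neq P_2(Y)
&\;\Rightarrow\; P_1(Z, Y) \neq P_2(Z, Y) \ \text{for some } (z, y).
\end{align*}
```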
More recently, the domain-invariant classifier, or the in-
variant predictor, has attracted much interest [12], [13], [14],
[15], [16]. In essence, these works are performing posterior
distribution alignment. Invariant Risk Minimization (IRM)
[13] seeks an invariant causal predictor, which is a simul-
taneously optimal classifier for all environments (domains).
IRM is formalized as a hard-to-solve bi-level optimization
problem. The invariant causal predictor realizes the conditional
expectation E[Y|Z] alignment across domains. This is only a coarse
posterior distribution alignment, since the conditional expectation
does not fully characterize the posterior distribution. Robey et al. [9] propose a novel
definition of invariance called G-invariance, which requires
that the classifier should hold invariant prediction after X
arXiv:2210.04155v1 [cs.CV] 9 Oct 2022